JonathanErhardt — LessWrong

A Game About AI Alignment (& Meta-Ethics): What Are the Must Haves?

Not yet, unfortunately, as our main project (QubiQuest: Castle Craft) has taken more of our resources than I had hoped. The goal is to release it this year in Q3. We do have a Steam page and a trailer now: https://store.steampowered.com/app/2086720/Elementary_Trolleyology/

Consciousness as a conflationary alliance term for intrinsically valued internal experiences

JonathanErhardt2y2017

My hunch is that with your interview setup you're not getting people to elaborate the meaning of their terms but to sketch their theories of consciousness. We should expect some convergence for the former but a lot of disagreement about the latter - which is what you found.

By excluding "near-synonyms" like "awareness" or "experience" and by insisting that they describe the structure of the consciousness process, you've made it fairly hard for them to provide the usual candidates for a conceptual analysis or clarification of "consciousness" (Qualia, Redness of Red, What-it-Is-Likeness, Subjectivity, Raw Character of X, etc.) and encouraged them to provide a theory or a correlate of consciousness instead.

(An example to make my case clearer: The meaning/intension of "heat" is not something like "high average kinetic energy of particles" - that's merely its extension. You can understand the meaning of "heat" without knowing anything about atoms.

But by telling people not to use near-synonyms of "heat" and to focus on the heat process instead, we could probably get something like "high average kinetic energy of particles" as their analysis.)

It's a cool survey, I just don't think it shows what it purports to show. Instead it gives us a nice overview of some candidate correlates of consciousness.

A Game About AI Alignment (& Meta-Ethics): What Are the Must Haves?

JonathanErhardt3y10

We will post more when the game is announced, which should be in 2-3 weeks. For now I'm mostly interested in getting feedback on whether this way of setting the problem up is plausible and doesn't miss crucial elements, less about how to translate it into gameplay and digestible dialogue.

Once the announcement (including the teaser) is out, I'll create a new post for concrete ideas on gameplay + dialogue.

A Game About AI Alignment (& Meta-Ethics): What Are the Must Haves?

JonathanErhardt3y30

Thanks for the link, I will read that!

A Game About AI Alignment (& Meta-Ethics): What Are the Must Haves?

JonathanErhardt3y10

I really like that and it happens to fit well with the narrative that we're developing. I'll see where we can include a scene like this.

A Game About AI Alignment (& Meta-Ethics): What Are the Must Haves?

JonathanErhardt3y*10

Good point, I see what you mean. I think we could have 2 distinct concepts of "ethics" and 2 corresponding orthogonality theses:

Concept "ethics1" requires ethics to be motivational. Some set of rules can only be the true ethics if, necessarily, everyone who knows them is motivated to follow them. (I think moral internalists probably use this concept?)
Concept "ethics2" doesn't require some set of rules to be motivational to be the correct ethics.

The orthogonality thesis for 1 is what I mentioned: Since there are (probably) no rules that necessarily motivate everyone who knows them, the AI would not find the true ethical theory.

The orthogonality thesis for 2 is what you mention: Even if the AI finds it, it would not necessarily be motivated by it.

The Zombie Argument for Empiricists

JonathanErhardt4y10

"Yet the average person would say it isn't possible."

I'd distinguish conceivability from possibility. In the case of possibility there are many types: logical possibility (no logical contradiction), broad logical possibility (no conceptual incoherence), nomological possibility, physical possibility, etc. Most people would probably agree that levitating frogs are logically possible, broadly logically possible, but not physically or nomologically possible as this would contradict the laws of physics.

It's less clear to me that there are many different types of conceivability. But even if there are, the type I care about in the post above is something like "forming a mental model of".

"But lots of other things were conceivable before the discovery. The narrowing is that, in terms of the correct explanation, the possibility that you get sodium and chlorine is no longer tenable."

I see, that's a helpful example.

The Zombie Argument for Empiricists

JonathanErhardt4y*10

I'd say both of these discoveries/explanations didn't change what is conceivable. Even before the water=H2O discovery it was conceptually coherent/conceivable that electrolysing water yields hydrogen. And it was and is conceivable to levitate a frog as there is no contradiction in this idea. It's just very surprising that it can actually be done.

The Zombie Argument for Empiricists

JonathanErhardt4y10

Could you give me an example of a case where an explanation has broadened or narrowed what is conceivable, so I understand better what you have in mind?
