I want to relate Wolfram's big complexity question to three frameworky approaches already in use.

Humans have ideas of rights and property that simplify the question "How do we want people to act?" to "Okay, what are we pretty sure we want people not to do?" and simplify that another step to "Okay, let's divide the world into non-intersecting spheres of control, one per person, say you can do what you want within your sphere, and only do things outside your sphere by mutual agreement with the person in charge of the other sphere."  (And one thing that can be mutually agreed on is redrawing sphere boundaries between the people agreeing.)
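
To make the shape of that concrete, here's a toy sketch in Python (my own illustration; the `World` class and its names are hypothetical, not from Wolfram or from any legal system): spheres of control that never intersect, actions across a boundary allowed only with the owner's agreement, and redrawing a boundary treated as just another mutually agreed action.

```python
class World:
    def __init__(self):
        self.owner_of = {}  # resource -> person; spheres never intersect

    def claim(self, person, resource):
        assert resource not in self.owner_of, "spheres must not intersect"
        self.owner_of[resource] = person

    def act(self, person, resource, consent_from=None):
        owner = self.owner_of[resource]
        if owner == person:
            return "allowed: inside your own sphere"
        if consent_from == owner:
            return "allowed: by agreement with the sphere's owner"
        return "forbidden: outside your sphere without agreement"

    def redraw(self, resource, old_owner, new_owner, both_agree):
        # Redrawing a boundary is itself just another mutually agreed action.
        if self.owner_of[resource] == old_owner and both_agree:
            self.owner_of[resource] = new_owner


world = World()
world.claim("alice", "field")
print(world.act("bob", "field"))                        # forbidden
print(world.act("bob", "field", consent_from="alice"))  # allowed by agreement
world.redraw("field", "alice", "bob", both_agree=True)
print(world.act("bob", "field"))                        # now inside Bob's own sphere
```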

These don't just simplify ethics as a curious side effect; both started as practical chunks of what we want people not to do, then evolved into customs and other forms of hardening.  I'd guess they evolved to the point where they're common because they were simple enough.

The point I'm making relative to Wolfram is: (making these ratios up) 90% of the problem of ethics gets simplified away with 10% of the effort, and it's an obvious first 10% of effort to duplicate.

And although rights and property present simpler goals, they don't by themselves implement them.

Sometimes ethics isn't the question; game theory or economics is (to the extent those aren't all the same thing).  For example, for some reason there are large corporations that cater to millions of poor customers.

With computers there are attempts at security.  Specifically I want to mention the approach called object-capability security, because it's based on reifying rights and property in fine-grained, composable ways, and on building underlying systems that support, and if done right only support, rightful actions (in the ways they reify them).
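
Here's my own toy sketch of that flavor in Python (the `File` and `ReadOnlyFacet` names are mine, not from any object-capability library): your authority is exactly the set of references you hold, and rights can be attenuated and composed by wrapping references before handing them on.

```python
class File:
    """The underlying resource; holding a reference to it is the full capability."""
    def __init__(self, contents=""):
        self._contents = contents

    def read(self):
        return self._contents

    def write(self, text):
        self._contents = text


class ReadOnlyFacet:
    """An attenuated capability: wraps a File but only exposes read().
    Granting this grants a narrower right than granting the File itself."""
    def __init__(self, file):
        self._file = file

    def read(self):
        return self._file.read()


# Alice holds the full capability to her file.
alices_file = File("alice's notes")

# By agreement she hands Bob a weaker, composable right: read but not write.
bobs_view = ReadOnlyFacet(alices_file)

print(bobs_view.read())      # works: the capability Bob holds supports it
# bobs_view.write("...")     # impossible: the reference Bob holds has no write()
```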

This paragraph is amateur alignment stuff: the problem of actually understanding how and why humans are good is vague, but my guess is it's more tractable than defining ethics in detail with all its ramifications. Both are barely touched, and we've been getting off easy.  It's not clear that many moral philosophers will jump into high gear, given that so far there have been no really shocking AI alignment disasters (that we survived to react to).  At this point I believe there's something to goodness, that there's something actually and detectably cool about interacting with (other) humans.  It seems to provide a reason to get off one's butt at all.  The value of it could be something that's visible when you have curiosity plus mumble, i.e. complex, but learnable given the right bootstrap. I don't know how to define whether someone has reconstructed the right bootstrap.

Returning to Wolfram: at this point it seems possible to me that whatever-good-is exists and that bootstrapping it is doable.