[epistemic status: This is a rambling thought experiment with the goal of clarifying my ontological understanding of "agent foundations" type stuff. Scroll to the bottom for the resulting two "interesting focuses of confusion".]
Returning to the "ball rolls down a hill" example of Alex Altair's My research agenda in agent foundations, I think there are three important objects here: the ball, the hill, and gravity. Friction and inertia and other physics are also doing important things in a real system, but can be ignored for this thought experiment.
I think this is a place where the "densely Venn" aspect of OISs is a valuable lens.
The ball on its own has the property of roundness, which expresses preference for certain kinds of mechanical interactions, most relevantly: rolling. The ball+gravity then has a preference for rolling down hills. Only all three together, ball+gravity+hill, have the preference for the ball being in the specific location at the bottom of the hill.
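One way to make this concrete is a toy dynamics sketch (everything here is hypothetical, not part of the original example): treat the hill as a height function, gravity as the rule "move against the slope", and watch that only the combined system settles on the bottom. Without a hill, the same rule goes nowhere; without the rule, the hill picks out nothing.

```python
# Toy sketch of ball+gravity+hill as overdamped downhill motion.
# All names and numbers are illustrative, not from the original post.

def slope(h, x, eps=1e-6):
    """Numerical derivative of the hill's height function h at x."""
    return (h(x + eps) - h(x - eps)) / (2 * eps)

def roll(h, x, steps=10_000, rate=0.01):
    """Gravity-as-rule: repeatedly move the ball against the local slope."""
    for _ in range(steps):
        x -= rate * slope(h, x)
    return x

# A valley whose bottom is at x = 3: only this h, plus the downhill
# rule, plus a ball with a position, singles out "x = 3" as the endpoint.
hill = lambda x: (x - 3.0) ** 2

print(roll(hill, x=0.0))  # settles very near 3.0
```

The "preference" for the bottom lives in no single piece: change the hill function and the settling point moves, remove the update rule and nothing settles at all.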
Yes, I do mean this in a technical sense that these preferences are the same kind of thing as the kind of thing we call preferences in humans. I am using "preference" as a specific keyword within OIS theory and I feel justified doing so. I welcome debate on the topic.
Anyway, the interesting question I think this leads to: Where is the separation between the ball+gravity's preference for rolling down hills and the ball+gravity's capability to roll down hills?
Why is this question interesting? First, because it makes me feel confused, and finding places where things feel confusing and becoming less confused about them feels like progress. But that's worryingly susceptible to bias and subjectivity, so more concretely: one of the problems that feels important and difficult when looking at deep neural networks is "Can I figure out how to look at this thing's preferences and capabilities separately from one another?"
I feel that question also applies to large sociotechnical OISs, a context where it is even more difficult and even more important to understand.
So I think this question of the separation of preferences and capabilities is very interesting. I'd love to know if other people have or are approaching it from within other ontologies. I think inverse reinforcement learning fits this category. Perhaps I should learn more about that.
But I think in the case of the ball+gravity, the preferences and the capabilities are kinda the same object. I think it's only once you get more highly symbolic mechanisms that it starts to seem like there is a separation between preferences and capabilities, because of how different the behaviour of the system can be by changing such a small highly symbolic part of it, without making any changes to larger, less symbolic parts.
I think keys are an interesting example here. The key is shaped so that, when it is put in the lock, the pins rise to the right heights to line up with the shear line, allowing the cylinder to rotate. Is the shape of the key mechanical, or symbolic?
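The pin-tumbler mechanism above can be sketched as a few lines of toy code (all names and numbers are mine, purely illustrative): the key's cut depths act like a small "symbolic" part, and swapping one cut flips the whole system's behaviour without touching the larger mechanism.

```python
# Toy pin-tumbler sketch. Each pin stack must be lifted exactly to the
# shear line for the cylinder to turn. Numbers are arbitrary units.

SHEAR_LINE = 5

def cylinder_turns(key_cuts, pin_lengths):
    """The cylinder rotates only if every cut+pin pair meets the shear line."""
    return all(cut + pin == SHEAR_LINE
               for cut, pin in zip(key_cuts, pin_lengths))

pins = [2, 3, 1, 4]            # the "large" mechanical part, unchanged below
right_key = [3, 2, 4, 1]       # complements the pins exactly
wrong_key = [3, 2, 4, 2]       # one cut off by a single step

print(cylinder_turns(right_key, pins))  # True
print(cylinder_turns(wrong_key, pins))  # False
```

The interesting feature for the question at hand: a one-unit change in a tiny part of the system (one cut depth) completely changes what the system does, which is the flavour of "symbolic" the post is gesturing at.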
So these are two questions I'm interested in becoming less confused about:
What are good ways to think about the separation between preferences and capabilities?
What are good ways to think about the separation between symbolic and mechanical?