PhD student in theoretical computer science (distributed computing) in France. Currently transitioning to AI Safety and fundamental ML work.
Nice post. Being convinced myself of the importance of mathematics both for understanding the world in general and for the specific problems of AI safety, I found it interesting to see what arguments you marshaled for and against this position.
About the unreasonable effectiveness of mathematics, I'd like to throw in the "follow-up" statement: the unreasonable ineffectiveness of mathematics beyond physics (for example in biology). The counterargument, at least for biology, is that Wigner was talking a lot about differential equations, which seem somewhat ineffective in biology; but theoretical computer science, which one can see as the mathematical study of computation, and thus somewhat a branch of mathematics, might be better fitted to biology.
A general comment about your perspective is that you seem to equate mathematics with formal specification and proofs. That's not necessarily an issue, but most modern mathematicians tend not to be strict formalists, so I thought it important to point out.
For the rest of my comments:
I find discussions about AI takeoff to be very confusing.
So do I. So thanks a lot for this summary!
Why should all equivalence classes of linked worlds have the same average utility? That ensures the uniqueness of the utility function up to translation, but I'm not sure that's always the best way to do it. What is the intuition behind this specific choice?
Thanks, I'll keep going then.
I don't see the link with my objection, since you quote a part of your post where you write about value impact (which depends on the values of the specific agents), whereas I am talking about the need for context even for objective impact (which you present as independent of the values and objectives of specific agents).
I have one potential criticism of the examples:
Because I was not sure what the concrete implications of the asteroid impact were, the reveal that it is objectively valued negatively by everybody, because they risk death, had little impact on me (pun intended). Had you written that the asteroid strikes near the agent, or that it causes massive catastrophes, then I would probably have thought that it mattered the same for local pebblehoarders and for humans. Also, the asteroid might destroy pebbles (or, depending on your definition of pebble, make new ones).
Also, I feel that some of your examples of objective impact are indeed relevant to agents in general (not dying/being destroyed), while others depend on sharing a common context (cash, which would be utterly useless in Pebblia if the local economy were based on exchanging pebbles for pebbles).
Do you just always consider this context as implicit?
Thanks, I'm looking into the toy model. :)
I really like the refinement of the formalization, with the explanations of what to keep and what was missing.
That said, I feel like the final formalization could be defined directly as a special type of preorder, one composed only of disjoint chains and cycles. Because as I understand the rest of the post, that is what you use when computing the utility function. This formalization would also be more direct, with one less layer of abstraction.
Is there any reason to prefer the "injective function" definition to the "special preorder" one?
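To make the correspondence I have in mind concrete, here is a minimal sketch. It assumes, and this is my reading rather than something stated in the post, that the "injective function" is a partial injective map on a finite set, represented as a Python dict. The function name `decompose` and the example labels are hypothetical. Seen as a graph, such a map gives every element at most one successor and at most one predecessor, so it splits into disjoint chains and cycles.

```python
def decompose(f):
    """Split a partial injective map (given as a dict) into chains and cycles.

    Assumption: f is finite and its keys/values are sortable (e.g. strings).
    """
    if len(set(f.values())) != len(f):
        raise ValueError("f is not injective")

    nodes = set(f) | set(f.values())
    has_predecessor = set(f.values())
    chains, cycles, seen = [], [], set()

    # A chain starts at an element that nothing maps to.
    for start in sorted(nodes - has_predecessor):
        path = [start]
        while path[-1] in f:
            path.append(f[path[-1]])
        seen.update(path)
        chains.append(path)

    # By injectivity, every element not on a chain lies on a cycle.
    for start in sorted(nodes - seen):
        if start in seen:
            continue
        cycle = [start]
        nxt = f[start]
        while nxt != start:
            cycle.append(nxt)
            nxt = f[nxt]
        seen.update(cycle)
        cycles.append(cycle)

    return chains, cycles


# Hypothetical example: one chain a -> b -> c and one 2-cycle x <-> y.
example = {"a": "b", "b": "c", "x": "y", "y": "x"}
chains, cycles = decompose(example)
```

If this reading is right, the two definitions carry the same information, and the question is only which one is more convenient to work with.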
Another modality of relating, introduced to me by a friend a couple of weeks ago, is "what kind of experience do you take from this relation". My friend has a quite idiosyncratic classification, but you could classify the people you see by combinations of intellectual stimulation, sense of security, being cared for... In my mind this is quite orthogonal to the other directions: whatever this relation holds for you, it might matter tremendously or very little.
The main use I have for this modality is to clarify what I am missing in my life. For example, when I feel lonely, I feel a discrepancy with my social situation: I have many friends, some really close who care about me and about whom I care. But when considering what experience I feel I am missing in my relationships, I can say that it's attraction and passion for the other and sexual tension and action.
Yes, I agree that you are focusing more on how to see the mistake in a meta-way, instead of from an outside view as Nate does.
Though I don't think your example of the distinction is exactly the right one: Nate's idea of banning "should", or cashing out "should", would IMHO be able to unearth the underlying "I should be taking things seriously" and apply the consequentialist analysis of "you will not be measured by how you felt or who you punished. You will be measured by what actually happened, as will we all" (paraphrasing). What I feel is different is that the Way provides a means of systematically finding this underlying should and explaining it from the inside.
Nonetheless, I find both useful, and I am better for having the Curse of the Counterfactual in my mental toolbox.