Agent foundations, AI macrostrategy, civilizational sanity, human enhancement.
I endorse and operate by Crocker's rules.
I have not signed any agreements whose existence I cannot mention.
There are mathematical theorems of the form: "Under assumption X, these 6 conditions are equivalent. We can prove this claim by a chain of implications: 1→2→3→4→5→6→1." These theorems are often very non-trivial. (If they were trivial, they presumably wouldn't be theorems but corollaries.)
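For concreteness, here's a toy Lean rendering of that pattern (my own illustration, with three placeholder conditions P, Q, R instead of six): a single cycle of implications is enough to recover all the pairwise equivalences.

```lean
/-- Toy version of the "chain of implications" pattern: a cycle P → Q → R → P
    yields every pairwise equivalence among the three conditions. -/
theorem cycle_equiv (P Q R : Prop)
    (h₁ : P → Q) (h₂ : Q → R) (h₃ : R → P) :
    (P ↔ Q) ∧ (Q ↔ R) ∧ (R ↔ P) :=
  ⟨⟨h₁, fun q => h₃ (h₂ q)⟩,   -- Q → P by going around the cycle
   ⟨h₂, fun r => h₁ (h₃ r)⟩,   -- R → Q likewise
   ⟨h₃, fun p => h₂ (h₁ p)⟩⟩   -- P → R likewise
```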
I appreciate the concept of "tracking which world the referent lives in". Plausibly tracking the referent-world in your head is useful for preventing blindness to certain optimization channels, not letting the social world constrain you too much,[1] etc.
That being said, I don't think this is true:
> I claim that this example generalizes: insofar as Joe’s “fake thinking” vs “real thinking” points to a single coherent distinction, it points to thoughts which represent things in other worlds vs thoughts which represent things in our physical world.
One example from Joe's post that got ingrained in my memory was this passage about Derek Parfit.
> And I got a similar vibe, too, from various professors. I would talk to them, and I would get some strange sense like “oh, they are holding some real project. They believe in it, they believe it’s possible to make real progress, they think it’s possible to do this together.” I only met Parfit a few times, but he famously had a ton of this, even as he was dying. “Non-religious ethics is at an early stage,” he writes on the last page of Reasons and Persons. Look, there, that sense of project. And to talk, as Parfit did, about having “wasted much of his life” if his brand of moral realism were false – that too implies a sense of trying to do something.
I take this seriousness and intensity — actually trying to meet reality face-to-face — as core to the distinction Joe is pointing at (according to my interpretation of it, at least). Parfit actually believed that there is something out there, Reality, that could "hit him in the face" if he got it wrong, and the part of Reality that would hit him in the face would not quite be physical. You could say "it is physical because it is implemented on physical brains", but then we lose the distinction between real thinking and fake/fictional thinking, because all thinking is implemented on physical brains (or whatever mind-substrate). Alternatively, you could ground it in the physical world by saying that it should have some implications for the moral convergence of certain classes of agents, but I don't think this was essential to Parfit's project: it would be coherent for him to aim for objective moral truth without assuming that any kind of moral convergence occurs.[2]
Or take math. The monster group most likely isn't instantiated anywhere in reality, except on a cognitive/computational substrate downstream of minds particularly interested in abstract algebra. It is, in the sense you're using this word here, fictional. But does that mean that all thinking about the monster group is fake in the sense Joe is using this word? I don't think so. Andrew Wiles proved Fermat's Last Theorem, which is straightforwardly physical-world-interpretable in terms of real countable thingies. I think his project, which resulted in a proof of the theorem, is a good example of real thinking. But his thinking would not have been less real if, instead of FLT, he had chosen something about the monster group as his target.
I think the median conversation for me is zero or positive-but-very-small epsilon fun, whereas the 90th percentile is maybe as fun as discovering a new song/band/album that I like a lot or listening to one of my favorite songs after several weeks of not listening to it. The most fun conversations I've ever had are probably the most fun experiences I've ever had.
I don't find conversations-in-general draining, although I can get exhausted by social activities where I'm supposed to play some role that is out of character for me, like LARPing (though that might be a learnable-skill issue) or extended-family reunions.
I've come to believe that sin interferes with consciousness, because that claim holds for every sin I've been able to think of (e.g., murder).
How come?
I know that you didn't mean it as a serious comment, but I'm nevertheless curious about what you meant by "the universe is a teleology".
I would appreciate it if you put probabilities on at least some of these propositions.
> At this point something even stranger happens.
What this "something even stranger" is seems rather critical.
I think that if you want to compute logical correlations between programs, you need to look at their, well, logic. E.g., if you have some way of extracting a natural-abstraction-based representation of their logic, you could build something like a causal graph from it and then design a similarity metric for comparing these representations.
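To make the shape of that concrete, here's a hypothetical sketch (the helper `abstraction_graph` and the choice of graph edit distance as the similarity metric are my own placeholders, not an existing method):

```python
# Hypothetical pipeline sketch: abstraction extraction -> causal-ish graph -> similarity metric.
# `abstraction_graph` stands in for the genuinely hard, unsolved step; graph edit distance is
# just one arbitrary choice of similarity metric over the resulting graphs.
import networkx as nx


def abstraction_graph(program_source: str) -> nx.DiGraph:
    """Extract a causal-graph-like representation of the program's logic (placeholder)."""
    raise NotImplementedError("this is the step I don't know how to do")


def logical_correlation(program_a: str, program_b: str) -> float:
    """Compare two programs via a similarity metric over their abstraction graphs."""
    g_a, g_b = abstraction_graph(program_a), abstraction_graph(program_b)
    distance = nx.graph_edit_distance(g_a, g_b)  # any graph-similarity metric would do here
    return 1.0 / (1.0 + distance)  # map distance to a (0, 1] "correlation-ish" score
```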
I have a suspicion, though, that this is not the right approach for handling ECL because ECL (I think?) involves the agent(s) looking at (some abstraction over) their "source code(s)" and then making a decision based on that. I expect that this ~reflection needs to be modeled explicitly.
My forefrontest thought as I was finishing this essay was "Applying this concept to AI risk is left as an exercise for the reader."
Then I thought that AI risk, if anything, is characterized by kinda the opposite dynamic: lots of groups with different risk models, not-that-rarely explicitly criticizing the others' strategy/approach as net-negative or implicitly complicit with the baddies, and finding it hard to cooperate despite what look like locally convergent subgoals. (To be clear: I'm not claiming that everybody's take/perspective on this is valid, or that everybody in this field should cooperate with everybody else, or whatever.)
Then I thought that, actually, even within what seems like somewhat coherent factions, we would probably see some tails coming apart once their goal (AI moratorium, PoC aligned AGI, existential security, exiting the acute risk period) is achieved.
And then I thought, well...
And then there were conversations where people I viewed as ~allied turned out to bite bullets that I considered (and still consider) equivalent to moral atrocities.
I may want to think more about this, but ATM it seems to me like AI risk as a field (or loose cluster of groups) is failing both at cooperating to achieve locally cooperation-worthy convergent subgoals and at seeing past the moral homophones.
(When I say "failing", I'm inclined to ask myself what standard I should be applying, but reality doesn't grade on a curve and the stakes are huge.)
---
Anyway, thanks for the post and the concept!
[Caveat lector: I don't know that much about this, but that's how I see it at the moment.]
The amount of information required to find a proof of a proposition, given all the information implicit in (the statement of) the proposition, depends on the system doing the proof search. The system's search order can be seen as a prior over proofs (and over potentially useful intermediate proof steps, auxiliary definitions, etc.). Additional info can update the prior so that you find the proof faster, or using fewer resources in general (or mis-update it and find the proof later, in a more costly way, or never at all).
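A toy illustration of the search-order-as-prior point (entirely my own construction; the candidate names and numbers are arbitrary):

```python
# Treat the search order as a prior over candidate proofs and see how updating that prior
# on extra information changes how many candidates get examined before the proof is found.

def search_cost(target: str, candidates: list[str], prior: dict[str, float]) -> int:
    """How many candidates are examined, searching in order of descending prior probability."""
    order = sorted(candidates, key=lambda c: -prior.get(c, 0.0))
    return order.index(target) + 1


candidates = ["lemma_A", "lemma_B", "the_proof", "lemma_C"]
baseline = {"lemma_A": 0.4, "lemma_B": 0.3, "the_proof": 0.2, "lemma_C": 0.1}
helpful = {**baseline, "the_proof": 0.9}   # info that points toward the actual proof
misleading = {**baseline, "lemma_C": 0.9}  # info that points the wrong way

print(search_cost("the_proof", candidates, baseline))    # 3: the original search order
print(search_cost("the_proof", candidates, helpful))     # 1: found immediately
print(search_cost("the_proof", candidates, misleading))  # 4: the mis-update case
```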
So I think you're right, but there are caveats.
Although it's also possible that there's some well-behaved class of proof-finding systems such that, for any two of them, those caveats become irrelevant in the limit (of proposition/proof complexity or something?). (Logical inductors?)