We (Connor Leahy, Gabriel Alfour, Chris Scammell, Andrea Miotti, Adam Shimi) have just published The Compendium, which brings together in a single place the most important arguments that drive our models of the AGI race, and what we need to do to avoid catastrophe.
We felt that something like this was missing from the AI conversation: most of these points have been shared before, but never gathered into a single “comprehensive worldview” doc. We’ve tried our best to fill this gap, and welcome feedback and debate about the arguments. The Compendium is a living document, and we’ll keep updating it as we learn more and change our minds.
We would appreciate your feedback, whether or not you agree with us.
I think that the arguments for why godlike AI will make us extinct are not described well in the Compendium. I could not find them in AI Catastrophe, only a hint at the end that they will be in the next section:
"The obvious next question is: why would godlike-AI not be under our control, not follow our goals, not care about humanity? Why would we get that wrong in making them?"
In the next section, AI Safety, we can find the definition of AI alignment and arguments for why it is really hard. This is all good, but it does not answer the question of w...
Some meditation advice has a vibe like... “To become more present, all you need to do is practice! It's just a skill, like learning to ride a bike 😊”
This never worked for me. When I tried to force presence through practice, I made little progress.
Being present isn't a skill to build — it's the natural state! What blocks presence is unconscious predictions of bad outcomes. Remove these blocks to make presence automatic.
Logical counterfactuals are when you say something like "Suppose $\phi$, what would that imply?"
They play an important role in logical decision theory.
Suppose you take a false proposition $\phi$ and then take a logical counterfactual in which $\phi$ is true. I am imagining this counterfactual as a function $C$ that sends counterfactually true statements to 1 and counterfactually false statements to 0.
Suppose $\phi$ is "$\neg$ Fermat's last theorem". In the counterfactual where Fermat's last theorem is false, I would still expect 2+2=4. Perhaps not with measure 1, but close. So $C(2+2=4) \approx 1$.
On the other hand, I would expect trivial rephrasings of Fermat's last theorem to be false, or at least mostly false.
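Summarizing the two cases in symbols (a sketch in my notation, reading $C$ as graded, with values in $[0,1]$ rather than strictly $\{0,1\}$, since "perhaps not with measure 1, but close"):

$$C(\neg \mathrm{FLT}) = 1 \ \text{(by construction)}, \qquad C(2+2=4) \approx 1, \qquad C(\text{trivial rephrasing of FLT}) \approx 0.$$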
But does this counterfactual produce a specific counterexample? Does it think that $a^n + b^n = c^n$ for some particular $a, b, c$ and $n > 2$? Or does it do something where the counterfactual insists a counterexample exists, but...
If they have source code, then they are not perfectly rational and cannot in general implement LDT. They can at best implement a boundedly rational subset of LDT, which will have flaws.
Assume the contrary: Then each agent can verify that the other implements LDT, since perfect knowledge of the other's source code includes the knowledge that it implements LDT. In particular, each can verify that the other's code implements a consistent system that includes arithmetic, and can run the other on their own source to consequently verify that they themselves impl...
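A minimal sketch of where the contradiction seems to land, assuming the intended obstruction is Gödel's second incompleteness theorem ($T$ is my notation for the consistent, arithmetic-containing theory an agent's source code implements):

$$T \text{ consistent and } T \supseteq \mathrm{PA} \ \Rightarrow\ T \nvdash \mathrm{Con}(T). \quad \text{(Gödel II)}$$

If each agent's verification of the other, run on its own source, amounts to $T \vdash \mathrm{Con}(T)$, then $T$ cannot be consistent, contradicting the assumption.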
Rationalists are people who have an irrational preference for rationality.
This may sound silly, but when you think about it, it couldn't be any other way. I am not saying that all reasons in favor of rationality are irrational -- in fact, there are many rational reasons to be rational! It's just that "rational reasons to be rational" is a circular argument that is not going to impress anyone who doesn't already care about rationality for some other reason.
So when there is a debate like "but wouldn't the right kind of self-deception be more instrumentally useful than perfectly calibrated rationality? do you care more about rationality or about winning?", well... you can make good arguments for both sides...
On one hand, yes, if your goal is to maximize your...
I have an irrational preference
If your utility function weights knowing things more heavily than most people's do, that is not an irrationality.
Pacing—walking repeatedly over the same ground—often feels ineffably good while I’m doing it, but then I forget about it for ages, so I thought I’d write about it here.
I don’t mean just going for an inefficient walk—it is somehow different to just step slowly in a circle around the same room for a long time, or up and down a passageway.
I don’t know why it would be good, but some ideas:
Pacing is a common stimming behavior. It's associated with autism / sensory processing disorder, but neurotypical people do it too.
As the year comes to an end, the days have grown shorter and the nights have grown longer. Tonight we remember the ancient days of humanity's past, and sing of the future we might hope for. There is uncertainty in both directions. We cannot know for sure what has been, and we cannot say what the future will be. There is this, however:
Tomorrow will be brighter than today.
The Montréal Solstice Celebration will be on December 21st. Doors open at 6:30; Solstice starts at 7:00. Please arrive on time to avoid disrupting the more somber and serious parts of the program. RSVPs are appreciated so we can plan appropriately.
If you have never been to a Secular Solstice in this style before, you should expect alternating speeches and songs,...
@BionicD0LPH1N: there will be ~8 candles :)
I am currently looking for a system which will help me execute some of my massive backlog of ideas. By “ideas” I mean my hundreds and hundreds of story outlines for films (plus a handful of finished screenplays), but also things like alternative income streams, day jobs, and skills or abilities I’d like to learn or pick up (coding, traditional animation, dancing the Tango, conversational Italian), as well as a host of other projects.
Before I get to determining how to better pick which ideas I should pursue, I was wondering whether there's any more I could do to optimize my current idea-recording method. Some of this overlaps with the GTD concept of the "Someday" bucket. But what I don't like about that is that I'd very...
It started with me taking notes while playing RPGs, but turned into a daily journal.
Interesting! Is that because you find that you're most creative while playing RPGs? How much detail is in those notes? How often do you find you pause the game to write one? (It reminds me of the Mitch Hedberg joke about thinking of a joke at night: he either needs to get up, or convince himself that the joke isn't that funny.)
How often do you text search for ideas? What seems to trigger revisiting an idea?
Many people who find value in the Sequences do something which looks to me like adopting a virtue called "align your map to the territory." I was recently thinking about experimental results, and it got me thinking about how we don't really know what the territory is, other than as the thing we look at to see whether our maps are right. Everything we know is map. What we know consists of a variety of models that describe aspects of reality, and we have to treat them like reality to get anything done. It wasn't relevant to my post at the time, but it occurred to me that it doesn't really matter what reality is, because my values live at a higher level of abstraction with my sense...
Nod.
Fwiw I mostly just thought it was funny in a way that was sort of neutral on "is this a reasonable frame or not?". It was the first thing I thought of as soon as I read your post title.
(I think it's both true that, in an important sense, everything we care about is in the Map, and also true, in an important sense, that it's not. In the ways it was true, it felt like a legitimately poignant rewrite that helped me appreciate your post; insofar as it was false, it seemed hilarious (non-meanspiritedly, just in a "it's funny that so many lines from the original remain reasonable sentences when you reframe it as about epistemology" way).)
Sometimes two people are talking past each other, and I try to help them understand each other (with varying degrees of success).
It’s as if they are looking at the same object, but from different angles. Mostly they see the same thing – most of the words have shared meanings. But some key words and assumptions mean different things to each of them.
Often, I find that one person (call them A) has a perspective that’s easier for me to understand. It comes naturally. But B’s perspective is initially harder. So if I want to translate from B to A, I first need to understand B.
I remember a time when I sat listening to two people having a conversation, both getting increasingly agitated and repeating the same points without making...
It seems to me this is an example of you and Kaj talking past each other. To you, B's perspective is "eminently reasonable" and needs no further explanation. To Kaj, B's perspective was a bit unusual, and to fully inhabit that perspective, Kaj wanted a bit more context to understand why B was holding that principle higher than other things (enjoying the social collaboration, the satisfaction of optimally solving a problem, etc.).