In case you missed it: we now have an FAQ for this project, last updated Jan. 7.
(Daniel Dennett's book Darwin's Dangerous Idea does a good job, I think, of imparting intuitions about the 'Platonic inevitability' of it.)
Possibly when Richard says "evolutionary theory" he means stuff like 'all life on Earth has descended with modification from a common pool of ancestors', not just 'selection is a thing'? It's also an empirical claim that any of the differences between real-world organisms in the same breeding population are heritable.
How do you get some substance into every human's body within the same one-second period? Aren't a bunch of people e.g. in the middle of some national park, away from convenient air vents? Is the substance somehow everywhere in the atmosphere all at once?
I think the intended visualization is simply that you create a very small self-replicating machine, and have it replicate enough times in the atmosphere that every human-sized organism on the planet will on average contain many copies of it.
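For a rough sense of the scale involved, here's a back-of-the-envelope sketch. Every specific number in it (atmosphere volume, what counts as 'many copies', the doubling time) is an illustrative assumption of mine, not something from the original scenario:

```python
# Illustrative estimate: how many doublings would a self-replicating machine
# need before a well-mixed atmosphere holds "many copies" per human-sized
# organism? All constants are rough assumptions for illustration only.
import math

ATMOSPHERE_VOLUME_M3 = 4e18  # ~5.1e18 kg of air / ~1.2 kg/m^3 (sea-level density)
HUMAN_BODY_VOLUME_M3 = 0.07  # ~70 liters
COPIES_PER_HUMAN = 100       # assumed threshold for "many copies"

# Concentration needed so an average human-body-sized parcel of air
# contains ~COPIES_PER_HUMAN copies, assuming uniform mixing.
concentration = COPIES_PER_HUMAN / HUMAN_BODY_VOLUME_M3  # copies per m^3
total_copies = concentration * ATMOSPHERE_VOLUME_M3
doublings = math.log2(total_copies)

print(f"copies needed: {total_copies:.1e}")            # ~5.7e21
print(f"doublings from one machine: {doublings:.0f}")  # ~72
```

At an assumed one-hour doubling time, ~72 doublings is about three days; the point is just that exponential replication makes 'copies everywhere' cheap once replication works at all.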
One of my co-workers at MIRI comments:
(further conjunctive detail for
When I look at factory-farmed animals, I feel awful for them. So coming into this, I have some expectation that my eventual understanding of consciousness, animal cognition, and morality (C/A/M) will add up to normalcy (i.e. not net positive for many animals).
Note that there might be other crucial factors in assessing whether 'more factory farming' or 'less factory farming' is good on net — e.g., the effect on wild animals, including indirect effects like 'factory farming changes the global climate, which changes various ecosystems around the world, which increases/decreases the population of various species (or changes what their lives are like)'.
It then matters a lot how likely various wild animal species are to be moral patients, whether their lives tend to be 'worse than death' vs. 'better than death', etc.... (read more)
I'd guess the most controversial part of this post will be the claim 'it's not incredibly obvious that factory-farmed animals (if conscious) have lives that are worse than nonexistence'?
But I don't see why. It's hard to be confident of any view on this, when we understand so little about consciousness, animal cognition, or morality. Combining three different mysteries doesn't tend to create an environment for extreme confidence — rather, you end up even more uncertain in the combination than in each individual component.
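(To make that compounding explicit with made-up numbers: if you were 80% confident in your take on each of consciousness, animal cognition, and morality, and treated the three as roughly independent, your confidence in the conjunction would be only 0.8 × 0.8 × 0.8 ≈ 0.51.)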
And there are obvious (speciesist) r... (read more)
Pretty much all the writing I've read by Holocaust survivors says that this was not true, that the experience was unambiguously worse than being dead, and that the only thing that kept them going was the hope of being freed. (E.g., according to Viktor Frankl in "Man's Search for Meaning", all the prisoners in his camp agreed that, not only was it worse than being dead, it was so bad that any good experiences after being freed could not make up for how bad it was. Why they didn't kill themselves is an interesting question that he explores a bit in the book.) Are there any Holocaust survivors who claim otherwise?
I would guess that humans' nightmarish experience in concentration camps was usually better than nonexistence; and even if you suspect this is false, it seems easy to imagine how it could be true, because there's a lot more to human experience than 'pain, and beyond that pain, darkness'.
I can't really imagine this – at least for people in extermination camps who weren't killed. I'd assume that, all else equal, the vast majority of prisoners would choose to skip that part of their life. But maybe I'm missing something or have unusual intuitions.
I haven't done anything like a careful analysis, but at a guess, this shift has some promise for unifying the classical split between epistemic and instrumental rationality. Rationality becomes the art of seeking interaction with reality such that your anticipations keep syncing up more and more exactly over time.
"Unifying epistemic and instrumental reality" doesn't seem desirable to me — winning and world-mapping are different things. We have to choose between them sometimes, which is messy, but such is the nature of caring about more than one thing in l... (read more)
I don't use microCOVID much. Two things I'd like from the site:
The latter goal seems more useful in general, and my sense is that microCOVID isn't currently set up to do that kind of thing -- the site currently says "Not yet updated for the Omicron variant", over a month in.
For the latter go... (read more)
Reply by Holden Karnofsky: https://www.lesswrong.com/s/n945eovrA3oDueqtq/p/nNqXfnjiezYukiMJi
Buddhism is a huge part of Joshin's life (which seems fine to note), but if there's an implied argument 'Buddhism is causally responsible for this style of discourse', 'all Buddhists tend to be like this', etc., you'll have to spell that out more.
Sounds right, yeah!
Firstly, I (partially?) agree that the current DL paradigm isn't strongly alignable (in a robust, high certainty paradigm), we may or may not agree to what extent it is approximately/weakly alignable.
I don't know what "strongly alignable", "robust, high certainty paradigm", or "approximately/weakly alignable" mean here. As I said in another comment:
There are two problems here:
Problem #1: Align limited task AGI to do some minimal act that ensures no one else can destroy the world with AGI.
Problem #2: Solve the full problem of using AGI to help us achieve an
Logical induction, Löbian cooperation, reflection in HOL, and functional decision theory are all results where researchers have expressed surprise to MIRI that such results were achievable even in principle.
I think a common culprit is people misunderstanding Gödel's theorems as blocking more things than they actually do. There's also field-specific folklore — e.g., a lot of traditional academic decision theorists seem to have somehow acquired the belief that you can't assign probabilities to your own actions, on pain of paradox.
I... think that makes more sense? Though Eliezer was saying the field's progress overall was insufficient, not saying 'decision theory good, ML bad'. He singled out e.g. Paul Christiano and Chris Olah as two of the field's best researchers.
In any case, thanks for explaining!
I'd argue instead that MIRI bet heavily against connectionism/DL, and lost on that bet just as heavily.
I think this is straightforwardly true in two different ways:
Problem #1 is the one I was talking about in the OP, and I think of it as the problem we need to solve on a deadline. Problem #2 is also indispensable (and a lot more philosophically fraught), but it's something humanity can solve at its leisure once we've solved #1 and therefore aren't at immediate risk of destroying ourselves.
The rhetorical approach of the comment is also weird to me. 'So you've never heard of CIRL?' surely isn't a hypothesis you'd give more weight to than 'You think CIRL wasn't a large advance', 'You think CIRL is MIRI-ish', 'You disagree with me about the size and importance of the alignment problem such that you think it should be a major civilizational effort', 'You think CIRL is cool but think we aren't yet hitting diminishing returns on CIRL-sized insights and are therefore liable to come up with a lot more of them in the future', etc. So I assume the question is rhetorical; but then it's not clear to me what you believe about CIRL or what point you want to make with it.
(Ditto value learning, IRL, etc.)
So you haven't heard of IRL, CIRL, value learning, that whole DL safety track, etc.? Or are you outright dismissing them? I'd argue instead that MIRI bet heavily against connectionism/DL, and lost on that bet just as heavily.
This comment and the entire conversation that spawned from it is weirdly ungrounded in the text — I never even mentioned DL. The thing I was expressing was 'relative to the capacity of the human race, and relative to the importance and (likely) difficulty of the alignment problem, very few research-hours have gone into the alignment prob... (read more)
Relative to what I mean by 'reasoning about messy physical environments at all', MuZero and Tesla Autopilot don't count. I could see an argument for GPT-3 counting, but I don't think it's in fact doing the thing.
Making a map of your map is another one of those techniques that seem to provide more grounding but don't actually.
Sounds to me like one of the things Eliezer is pointing at in Hero Licensing:
Look, thinking things like that is just not how the inside of my head is organized. There’s just the book I have in my head and the question of whether I can translate that image into reality. My mental world is about the book, not about me.
You do want to train your brain, and you want to understand your strengths and weaknesses. But dwelling on your biases at the ex... (read more)
(I'm not sure whether your summary captures Eliezer's view, but strong-upvoted for what strikes me as a reasonable attempt.)
My first-order response to this is in https://www.lesswrong.com/posts/Js34Ez9nrDeJCTYQL/politics-is-way-too-meta
My Eliezer-model thinks that "there will be a complete 4 year interval in which world output doubles, before the first 1 year interval in which world output doubles" is far less than 30% likely, because it's so conjunctive:
Is this 5 years of engineering effort and then humans leaving it alone with infinite compute?
Maybe something like '5 years of engineering effort to start automating work that qualitatively (but incredibly slowly and inefficiently) is helping with AI research, and then a few decades of throwing more compute at that for the AI to reach superintelligence'?
With infinite compute you could just recapitulate evolution, so I doubt Paul thinks there's a crux like that? But there could be a crux that's about whether GPT-3.5 plus a few decades of hardware progress achieves superintelligence, or about whether that's approximately the fastest way to get to superintelligence, or something.
Do you think that human generality of thought requires a unique algorithm and/or brain structure that's not present in chimps? Rather than our brains just being scaled up chimp brains that then cross a threshold of generality (analogous to how GPT-3 had much more general capabilities than GPT-2)?
I think human brains aren't just bigger chimp brains, yeah.
(Though it's not obvious to me that this is a crux. If human brains were just scaled up chimp-brains, it wouldn't necessarily be the case that chimps are scaled-up 'thing-that-works-like-GPT' brains, or sca... (read more)
I think I don't understand Carl's "separate, additional miracle" argument. From my perspective, the basic AGI argument is:
Thanks for the in-depth response! I think I have a better idea now where you're coming from. A couple follow-up questions:
But I don't in fact believe on this basis that we already have baby AGIs. And if the argument isn't 'we already have baby AGIs' but rather 'the idea of "AGI" is wrong, we're going to (e.g.) gradually get one science after another rather than getting all the sciences at once', then that seems like directionally the wrong update to make from Atari, AlphaZero, GPT-3, etc.
Do you think that human generality of thought requires a unique algori... (read more)
"I suspect I would start to attach that same meaning to any code phrase" and "I think that even talking about either using a code phrase or to spell it out inevitably pushes toward that being a norm" are both concerns of mine, but I think I'm more optimistic than you that they just won't be big issues by default, and that we can deliberately avoid them if they start creeping in. I'm also perfectly happy in principle to euphemism-treadmill stuff and keep rolling out new terms, as long as the swap is happening (say) once every 15 years and not once every 2 years.
Why not say something like "hey, I'm bowing out of this conversation now, but it's not intended to be any sort of reflection on you or the topic, I'm not making a statement, I'm just doing what's good for me and that's all"?
That seems fine too, if I feel like putting the effort into writing a long thing like that, customizing it for the particular circumstances, etc. But I've noticed many times that it's a surprisingly large effort to hit exactly the right balance of social signals in a case like this, given what an important and commonplace move it is. (A... (read more)
I think in most cases with public, online, asynchronous communication, it probably makes the most sense to just exit without a message about it.
In a minority of cases, though (e.g., where I've engaged in a series of back-and-forths and then abruptly stopped responding, or where someone asks me a direct Q or what-have-you), I find that I want an easy boilerplate way to notify others that I'm unlikely to respond more. I think "(Leaving orbit. 🙂)" or similar solves that specific problem for me.
Yeah, I would favor "tapping out" if it felt more neutral to me. 'Tapping out', 'bowing out', etc. sound a little resentful/aggressive to my ear, like you're exiting an annoying scuffle that's beneath your time. Even the combat-ish associations are a thing I'd prefer to avoid, if possible.
When I try to mentally simulate negative reader-reactions to the dialogue, I usually get a complicated feeling that's some combination of:
I think part of what I was reacting to is a kind of half-formed argument that goes something like:
I had mixed feelings about the dialogue personally. I enjoy the writing style and think Eliezer is a great writer with a lot of good opinions and arguments, which made it a pleasure to read.
But at the same time, it felt like he was taking down a strawman. Maybe you’d label it part of “conflict aversion”, but I tend to get a negative reaction to take-downs of straw-people who agree with me.
To give an unfair and exaggerated comparison, it would be a bit like reading a take-down of a straw-rationalist in which the straw-rationalist occasionally insists such things as ... (read more)
When I try to think of gift ideas for dolphins, am I failing to notice some way in which I'm "selfishly" projecting what I think dolphins should want onto them, or am I violating some coherence axiom?
I think it's rather that 'it's easy to think of ways to help a dolphin (and a smart AGI would presumably find this easy too), but it's hard to make a general intelligence that robustly wants to just help dolphins, and it's hard to safely coerce an AGI into helping dolphins in any major way if that's not what it really wants'.
I think the argument is two-part, a... (read more)
No, 'rational' here is being used in opposition to 'irrational', 'religious', 'superstitious', etc., not in opposition to 'empirical'.
In politics, rationalism, since the Enlightenment, historically emphasized a "politics of reason" centered upon rational choice, deontology, utilitarianism, secularism, and irreligion – the latter aspect's antitheism was later softened by the adoption of pluralistic reasoning methods practicable regardless of religious or irreligious ideology. In this regard, the philosopher John Cottingham
Note: I've written up short summaries of each entry in this sequence so far on https://intelligence.org/late-2021-miri-conversations/, and included links to audio recordings of most of the posts.
I've gotten one private message expressing more or less the same thing about this post, so I don't think this is a super unusual reaction.
I don't know Eliezer's view on this — presumably he either disagrees that the example he gave is "mundane AI safety stuff", or he disagrees that "mundane AI safety stuff" is widespread? I'll note that you're a MIRI research associate, so I wouldn't have auto-assumed your stuff is representative of the stuff Eliezer is criticizing.
Safely Interruptible Agents is an example Eliezer's given in the past of work that isn't "real" (back in 2017):
[...] It seems to me that I've watched organizations like OpenPhil try to sponsor academics to work on AI alignment, and
Eliezer is referring to the Dath Ilan stories.
Not believing theories which don't make new testable predictions just because they retrodict lots of things in a way that the theories' proponents claim is more natural, but that you don't understand, because that seems generally suspicious
My Eliezer-model doesn't categorically object to this. See, e.g., Fake Causality:
[Phlogiston] feels like an explanation. It’s represented using the same cognitive data format. But the human mind does not automatically detect when a cause has an unconstraining arrow to its effect. Worse, thanks to hindsight bias, it may fe
(This post was partly written as a follow-up to Eliezer's conversations with Paul and Ajeya, so I've inserted it into the conversations sequence.)
It does fit well there, but I think it was more inspired by the person I met who thought I was being way too arrogant by not updating in the direction of OpenPhil's timeline estimates to the extent I was uncertain.
(I'll emphasize again, by the way, that this is a relative comparison of my model of Paul vs. Eliezer. If Paul and Eliezer's views on some topic are pretty close in absolute terms, the above might misleadingly suggest more disagreement than there in fact is.)
I would frame the question more as 'Is this question important for the entire chain of actions humanity needs to select in order to steer to good outcomes?', rather than 'Is there a specific thing Paul or Eliezer personally should do differently tomorrow if they update to the other's view?' (though the latter is an interesting question too).
Some implications of having a more Eliezer-ish view include:
My Eliezer-model is a lot less surprised by lulls than my Paul-model (because we're missing key insights for AGI, progress on insights is jumpy and hard to predict, the future is generally very unpredictable, etc.). I don't know exactly how large of a lull or winter would start to surprise Eliezer (or how much that surprise would change if the lull is occurring two years from now, vs. ten years from now, for example).
In Yudkowsky and Christiano Discuss "Takeoff Speeds", Eliezer says:
I have a rough intuitive feeling that it [AI progress] was going faster in
Found two Eliezer-posts from 2016 (on Facebook) that I feel helped me better grok his perspective.
Sep. 14, 2016:
It is amazing that our neural networks work at all; terrifying that we can dump in so much GPU power that our training methods work at all; and the fact that AlphaGo can even exist is still blowing my mind. It's like watching a trillion spiders with the intelligence of earthworms, working for 100,000 years, using tissue paper to construct nuclear weapons.
And earlier, Jan. 27, 2016:
People occasionally ask me about signs that the remaining timeline
Minor note: This post comes earlier in the sequence than Christiano, Cotra, and Yudkowsky on AI progress. I posted the Christiano/Cotra/Yudkowsky piece sooner, at Eliezer's request, to help inform the ongoing discussion of "Takeoff Speeds".
To which my Eliezer-model's response is "Indeed, we should expect that the first AGI systems will be pathetic in relative terms, comparing them to later AGI systems. But the impact of the first AGI systems in absolute terms is dependent on computer-science facts, just as the impact of the first nuclear bombs was dependent on facts of nuclear physics. Nuclear bombs have improved enormously since Trinity and Little Boy, but there is no law of nature requiring all prototypes to have approximately the same real-world impact, independent of what the thing is a prototype of."
Thanks for doing this, Kat! :)
I’ve listened to them as-is and find them pretty easy to follow, but if you’re interested in making it even easier for people to follow, these fine gentlemen have put up a ~$230 RFP/bounty for anybody who turns it into audio where each person has a different voice.
That link isn't working for me; where's the bounty?
Edit: Bounty link is working now: https://twitter.com/lxrjl/status/1464119232749318155
Transcript error fixed -- the line that previously read
I expect it to go away before the end of days
but with there having been a big architectural innovation, not Stack More Layers
if you name 5 possible architectural innovations I can call them small or large
but with there having b
(Ah, EY already replied.)
It feels like this bet would look a lot better if it were about something that you predict at well over 50% (with people in Paul's camp still maintaining less than 50%).
My model of Eliezer may be wrong, but I'd guess that this isn't a domain where he has many over-50% predictions of novel events at all? See also 'I don't necessarily expect self-driving cars before the apocalypse'.
My Eliezer-model has a more flat prior over what might happen, which therefore includes stuff like 'maybe we'll make insane progress on theorem-proving (or whatever) out of the bl... (read more)
One may ask: why aren't elephants making rockets and computers yet?
But one may ask the same question about any uncontacted human tribe.
Seems more surprising for elephants, by default: elephants have apparently had similarly large brains for about 20 million years, which is far more time than uncontacted human tribes have had to build rockets. (~100x as long as anatomically modern humans have existed at all, for example.)
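(For the arithmetic behind that '~100x': anatomically modern humans are usually dated to roughly 200,000–300,000 years ago, and 20 million / 200,000 ≈ 100.)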