So I would argue that all of the main contenders are very training data efficient compared to artificial neural nets. I'm not going to go into detail on that argument, unless people let me know that that seems cruxy to them and they'd like more detail.
I'm not sure I get this enough for it to even be a crux, but what's the intuition behind this?
My guess for your argument is that you see it as analogous to the way a CNN beats a fully-connected net at image recognition: it massively cuts down the space of possible models in a way that's compatible with the known structure of the problem.
But that raises the question, why are these biology-inspired networks more likely to be better representations of general intelligence than something like transformers? Genuinely curious what you'll say here.
(Wisdom of evolution only carries so much weight for me, because the human brain is under constraints like collocation of neurons that prevent evolution from building things that artificial architectures can do.)
Has any serious AI Safety research org thought about situating themselves so that they could continue to function after a nuclear war?
Wait, hear me out.
A global thermonuclear war would set AI timelines back by at least a decade, for all of the obvious reasons. So an AI Safety org that survived would have additional precious years to work on the alignment problem, compared to orgs in the worlds where we avoid that war.
So it seems to me that at least one org with short timelines ought to move to New Zealand or at least move farther away from cities.
(Yes, I know MIRI was pondering leaving the Bay Area for underspecified reasons. I'd love to know what their thinking was regarding this effect, but I don't expect they'd reveal it.)
The distinction between your post and Eliezer's is more or less that he doesn't trust anyone to identify or think sanely about [plans that they admit have negative expected value in terms of log odds but believe possess a compensatory advantage in probability of success conditional on some assumption].
Such plans are very likely to hurt the remaining opportunities in the worlds where the assumption doesn't hold, which makes it especially bad if different actors are committing to different plans. And he thinks that even if a plan's assumptions hold, the odds of its success are far lower than the planner envisioned.
Eliezer's preferred strategy at this point is to continue doing the kind of AI Safety work that doesn't blow up if assumptions aren't met, and if enough of that work is complete and there's an unexpected affordance for applying that kind of work to realistic AIs, then there's a theoretical possibility of capitalizing on it. (But, well, you see how pessimistic he's become if he thinks that's both the best shot we have and also probability ~0.)
And he wanted to put a roadblock in front of this specific well-intentioned framing, not least because it is way too easy for some readers to round into support for Leeroy Jenkins strategies.
In principle, I was imagining talking about two AIs.
In practice, there are quite a few preferences I feel confident a random person would have, even if the details differ between people and even though there's no canonical way to reconcile our preferences into a utility function. I believe the argument carries through in practice with a decent amount of noise; I certainly treat it as some evidence for X when a thinker I respect believes X.
Ah, that makes more sense.
Identifying someone else's beliefs requires you to separate a person's value function from their beliefs, which is impossible.
I think it's unfair to raise this objection here while treating beliefs about probability as fundamental throughout the remainder of the post.
If you instead want to talk about the probability-utility mix that can be extracted from observing another agent's actions, even while treating them as a black box: two Bayesian utility-maximizers with relatively simple utility functions in a rich environment will indeed start inferring Bayesian structure in each other's actions (via things like the absence of Dutch booking w.r.t. instrumental resources). They will therefore start treating each other's actions as a source of evidence about the world, even without being confident about each other's exact belief/value split.
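Here's a toy sketch of the weakest version of that claim (my own illustration, not from the original comment): agent B observes agent A's action, assumes only that A acts roughly like an expected-utility maximizer with private information, and does a Bayes update on the hidden world state. All the names and numbers below are made up for illustration.

```python
# Toy model: B infers evidence about a hidden state ("rain" vs. "sun")
# from A's observed action, without knowing A's exact belief/value split.
# B's model of A: how likely A is to take each action in each true state,
# derived from "A maximizes expected utility given a noisy private signal".
# These probabilities are assumed for the sketch.
p_action_given_state = {
    "rain": {"umbrella": 0.9, "no_umbrella": 0.1},
    "sun":  {"umbrella": 0.2, "no_umbrella": 0.8},
}

def update(prior, observed_action):
    """Bayes update on the hidden state after seeing A's action."""
    unnorm = {s: prior[s] * p_action_given_state[s][observed_action]
              for s in prior}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

prior = {"rain": 0.5, "sun": 0.5}
posterior = update(prior, "umbrella")
print(posterior)  # seeing A carry an umbrella shifts B's belief toward rain
```

The point of the sketch is just that B never needs A's actual beliefs: a coarse model of A as "approximately rational" is enough to make A's actions evidentially useful.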
If you want to argue their beliefs won't converge, you'll have to give a good example.
This fails to engage with Eli's comment above, which focuses on Elon Musk and is a direct counterargument to the very thing you're saying.
Probable typo: the question numbering switches from Q4 to Q5 before the bolded Q5 question actually appears.
and they, I'm afraid, will be PrudentBot, not FairBot.
This shouldn't matter for anyone besides me, but there's something personally heartbreaking about seeing the one bit of research for which I feel comfortable claiming a fraction of a point of dignity, being mentioned validly to argue why decision theory won't save us.
(Modal bargaining agents didn't turn out to be helpful, but given the state of knowledge at that time, it was worth doing.)
I see Biden as having cogent things he intends to communicate but sometimes failing to speak them coherently, while Trump is a pure stream of consciousness sometimes, stringing together loosely related concepts like a GPT.
(This isn't the same as cognitive capacity, mind you. Trump is certainly more intelligent than many people who speak more legibly.)
I haven't seen a "word salad" from Biden where I can't go "okay, here's the content he intended to communicate", but there are plenty from Trump where I can't reconstruct anything more than sentiment and gestures at disconnected facts.