If anyone wants to have a voice chat with me about a topic that I'm interested in (see my recent post/comment history to get a sense), please contact me via PM.
My main "claims to fame":
a huge amount of strategic background; as a consequence of being good strategic background, they shifted many people to working on this
Maybe we should distinguish between being good at thinking about / explaining strategic background, versus being actually good at strategy per se, e.g. picking high-level directions or judging overall approaches? I think he's good at the former, but people mistakenly deferred to him too much on the latter.
It would make sense that one could be good at one of these and less good at the other, as they require somewhat different skills. In particular I think the former does not require one to be able to think of all of the crucial considerations, or have overall good judgment after taking them all into consideration.
No? They're all really difficult questions. Even being an expert in one of these would be at least a career. I mean, maybe YOU can, but I can't, and I definitely can't do so when I'm just a kid starting to think about how to help with X-derisking.
So Eliezer could become an expert in all of them starting from scratch, but you couldn't, even though you could build upon his writings and other people's? What was/is your theory of why he is so much above you in this regard? ("Being a kid" seems a red herring, since Eliezer was pretty young when he did much of his strategic thinking.)
Yudkowsky, being the best strategic thinker on the topic of existential risk from AGI
This seems strange to say, given that he:
These seemed like obvious mistakes even at the time (I wrote posts/comments arguing against them), so I feel like the over-deference to Eliezer is a completely different phenomenon from "But you can’t become a simultaneous expert on most of the questions that you care about", or has very different causes. In other words, if you were going to spend your career on AI x-safety, of course you could have become an expert on these questions first.
I've now read your linked posts, but can't derive from them how you would answer my questions. Do you want to take a direct shot at answering them? And also the following question/counter-argument?
Think about the consequences, what will actually happen down the line and how well your Values will actually be satisfied long-term, not just about what feels yummy in the moment.
Suppose I'm a sadist who derives a lot of pleasure/reward from torturing animals, but also my parents and everyone else in society taught me that torturing animals is wrong. According to your posts, this implies that my Values = "torturing animals has high value", and Goodness = "don't torture animals", and I shouldn't follow Goodness unless it actually lets me satisfy my Values better long-term, in other words allows me to torture more animals in the long run. Am I understanding your ideas correctly?
(Edit: It looks like @Johannes C. Mayer made a similar point under one of your previous posts.)
Assuming I am understanding you correctly, this would be a controversial position to say the least, and counter to many people's intuitions or metaethical beliefs. I think metaethics is a hard problem, and I probably can't easily convince you that you're wrong. But maybe I can at least convince you that you shouldn't be as confident in these ideas as you appear to be, nor present them to "lower-level readers" without indicating how controversial / counterintuitive-to-many the implications of your ideas are.
An update on this 2010 position of mine, which seems to have become conventional wisdom on LW:
In my posts, I've argued that indexical uncertainty like this shouldn't be represented using probabilities. Instead, I suggest that you consider yourself to be all of the many copies of you, i.e., both the ones in the ancestor simulations and the one in 2010, making decisions for all of them. Depending on your preferences, you might consider the consequences of the decisions of the copy in 2010 to be the most important and far-reaching, and therefore act mostly as if that was the only copy. [Emphasis added]
In the subsequent 15 years, I've upweighted influencing the multiverse through my copies in simulations, relative to base universes, to where they're about equally important in my mind, for 4 reasons:
Concretely, this mostly cashes out to me thinking and writing with potential simulators in mind as an additional audience, hoping my ideas might benefit or interest some of them even if they end up largely ignored in this reality.
normally when I think about this problem I resolve it as "what narrow capabilities can we build that are helpful 'to the workflow' of people solving illegible problems, that aren't particularly bad from a capabilities standpoint".
Do you have any writings about this, e.g., examples of what this line of thought led to?
I agree this is a major risk. (Another one is that it's just infeasible to significantly increase AI philosophical competence in the relevant time frame. Another one is that it's much easier to make it appear like the AI is more philosophically competent, giving us false security.) So I continue to think that pausing/stopping AI should be plan A (which legibilizing the problem of AI philosophical competence can contribute to), with actually improving AI philosophical competence as (part of) plan B. Having said that, here are 2 reasons this risk might not bear out:
To conclude, I'm quite worried about the risks/downsides of trying to increase AI philosophical competence, but it seems to be a problem that has to be solved eventually. "The only way out is through", but we can certainly choose to do it at a more opportune time, when humans are much smarter on average and have made a lot more progress in metaphilosophy (understanding the nature of philosophy and philosophical reasoning).
even on alignment
I see a disagreement vote on this, but I think it does make sense. Alignment work at the AI labs will almost by definition be work on legible problems, but we should make exceptions for people who can give reasons for why their work is not legible (or otherwise still positive EV), or who are trying to make illegible problems more legible for others at the labs.
Think more seriously about building organizations that will make AI power more spread out.
I start to disagree from here, as this approach would make almost all of the items on my list worse, and I'm not sure which ones it would make better. You started this thread by saying "Even if we solved metaethics and metaphilosophy tomorrow, and gave them the solution on a plate, they wouldn't take it", which I'm definitely very worried about, but how does making AI power more spread out help with this? Is the average human (or humanity collectively) more likely to be concerned about metaethics and metaphilosophy than a typical AI lab leader, or easier to make concerned? I think the opposite is more likely to be true?
EA Forum allows agree/disagree voting on posts (why doesn't LW have this, BTW?) and the post there currently has 6 agrees and 0 disagrees. There may actually be a surprisingly low amount of disagreement, as opposed to people not bothering to write up their pushback.
I'm uncertain between conflict theory and mistake theory, and think it partly depends on metaethics, and therefore it's impossible to be sure which is correct in the foreseeable future - e.g., if everyone ultimately should converge to the same values, then all of our current conflicts are really mistakes. Note that I do often acknowledge conflict theory, like in this list I have "Value differences/conflicts between humans". It's also quite possible that it's really a mix of both, that some of the conflicts are mistakes and others aren't.
In practice I tend to focus more on mistake-theoretic ideas/actions. Some thoughts on this:
(I think this is probably the first time I've explicitly written down the reasoning in 4.)
I think we need a different plan.
Do you have any ideas in mind that you want to talk about?
By saying that he was the best strategic thinker, it seems like you're trying to justify deferring to him on strategy (why not do that if he is actually the best), while also trying to figure out how to defer "gracefully", whereas I'm questioning whether it made sense to defer to him at all, when you could have taken into account his (and other people's) writings about strategic background, and then looked for other important considerations and formed your own judgments.
Another thing that interests me is that several of his high-level strategic judgments seemed wrong or questionable to me at the time (as listed in my OP, and I can look up my old posts/comments if that would help), and if it didn't seem that way to others, I want to understand why. Was Eliezer actually right, given what we knew at the time? Did it require a rare strategic mind to notice his mistakes? Or was it a halo effect, or the effect of Eliezer writing too confidently, or something else, that caused others to have a cognitive blind spot about this?