LW is a known hotbed of compatibilism, so here's my question:
That's not been my impression. I would have summarized it more as "LW (a) agrees that LFW doesn't exist and (b) understands that debating compatibilism doesn't make sense because it's just a matter of definition."
Personally, I certainly don't consider myself a compatibilist (though this is really just a matter of preference since there are no factual disagreements). My brief answer to "does free will exist" is "no". The longer answer is the within-physics stick figure drawing.
No one will hear my counter-arguments to Sabien's propaganda who does not ask me for them privately.
Uh, why? Why not make a top-level post?
Was about to reread this, but
UPDATE IN 2023: I wrote this a long time ago and you should NOT assume that I still agree with all or even most of what I wrote here. I’m keeping it posted as-is for historical interest.
... are your updated thoughts written up anywhere?
This argument is based on drawing an analogy between
in the sense that both have to get their values into a system. But the two situations are substantially disanalogous because the AI starts with a system that has its values already implemented. It can simply improve the parts that are independent of its values. Doing this would be easier with a modular architecture, but it should be doable even without one. It's much easier to find parts of the system that don't affect values than it is to nail down exactly where the values are encoded.
Fair. Ok, I edited the original post, see there for the quote.
One reason I felt comfortable just stating the point is that Eliezer himself framed it as a wrong prediction. (And he actually refers to you as having been more correct, though I don't have the timestamp.)
(Eliezer did think neural nets wouldn't work; he explicitly said it on the Lex Fridman podcast.)
Edit, at gwern's request: at 11:30 in the podcast, Eliezer says,
back in the day I went around saying like, I do not think that just stacking more layers of transformers is going to get you all the way to AGI, and I think that GPT-4 is past where I thought this paradigm is going to take us, and I, you know, you want to notice when that happens, you want to say like "oops, I guess I was incorrect about what happens if you keep on stacking more transformer layers"
and then Fridman asks him whether he'd say that his intuition was wrong, and Eliezer says yes.
Why AI would want to align us or end us is something I still haven't figured out after reading about alignment so much.
Has your reading ever included anything related to Instrumental Convergence?
Reporting on my mental state here: I'm emotionally opposed to the name change; I like "effective altruism". I don't want to do quantifiable good; I want to do effective good. Having x-risk and animal welfare in the same category makes perfect sense to me.
I don't get the downvotes, this post is just agreeing with the OP.