All of Jim Buhler's Comments + Replies

But overall I think working on alignment is more urgent. Being able to understand what's going on at all inside a neural net, and advocating that companies be required to understand what's going on before developing new/bigger/better models, seems like a convergent goal relevant to both human extinction and astronomical suffering.

Fwiw, Lukas's comment links to a post arguing against that, and I 100% agree with it. I think "alignment will solve s-risks as well anyway" is one of the most untrue and harmful widespread memes in the EA/LW community.

1 · Dawn Drescher · 25d
I suppose my shooting range metaphor falls short here. Maybe alignment is like teaching a kid to be an ace race car driver, and s-risks are accidents on normal roads. There it also depends on the details whether the ace race car driver will drive safely on normal roads.
Nod. (Fyi, I vaguely remembered that comment but couldn't find it a second time while I was writing my own answer.) I do think "AI targeted at optimizing a good goal" is more likely to near-miss if precautions aren't taken, and I do think that's quite important. I carefully did not say "alignment automatically solves s-risks"; I said it was a convergent goal that seemed more important to me overall. I do think that's a reasonable thing to disagree on.
3 · Charlie Steiner · 24d
If you get to pick how the universe is arranged in the future, would you rather it be lifeless and full of shit, or lifeless and full of brilliant art? I'm gonna guess that you, like me, would prefer art. This is an aesthetic preference about how you'd rather the atoms in the universe be arranged. You don't need to justify it by any deeper principle, it doesn't matter that you're not around to care in either case, it's sufficient for you to prefer universes full of art to universes full of shit as a raw preference, and this can motivate you to steer the future to favor one over the other. I find universes full of cosmopolitan civilizations good, and universes full of suffering bad, in just this raw way. You might also call it "non person-affecting preferences over the use of atoms in the universe."

Interesting! Did thinking about those variants make you update your credences in SIA/SSA (or some other view)?

(Btw, maybe it's worth adding the motivation for thinking about these problems to the intro of the post.) :)

2 · Eric Chen · 3mo
Same as Sylvester, though my credence in consciousness-collapse interpretations of quantum mechanics has moved from 0.00001 to 0.000001.
2 · Sylvester Kollin · 3mo
No, not really! This was mostly just for fun.

Thanks a lot for these comments, Oscar! :)

I think something can't be both neat and so vague as to use a word like 'significant'.

I forgot to copy-paste a footnote clarifying that "as made explicit in the Appendix, what 'significant' exactly means depends on the payoffs of the game"! Fixed. I agree this is still vague, although I guess it has to be, since the payoffs are unspecified?

In the EDT section of Perfect-copy PD, you replace some p's with q's and vice versa, but not all. Is there a principled reason for this? Maybe it is just a mistake and it s

... (read more)