Daniel Kokotajlo

Daniel Kokotajlo's Comments

Three Kinds of Competitiveness

I had been thinking it would sometimes be nice to talk about the competitiveness of AI designs more generally, not just alignment schemes. E.g., neuromorphic AI is probably more cost-competitive and performance-competitive than uploads (though it might be less date-competitive).

Implications of the Doomsday Argument for x-risk reduction

Agreed. If you want to talk more about these ideas sometime, I'd be happy to video chat!

Implications of the Doomsday Argument for x-risk reduction

Re: inverse proportionality: Good point, I'll have to think about that more. Maybe it does neatly cancel out, or even worse: since my utility function isn't linear in happy lives lived, maybe it more than cancels out.
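
To make the cancellation worry concrete, here's a rough sketch (the $1/N$ anthropic penalty and the $\log N$ utility are illustrative assumptions, not anything argued above). If doomsday-style reasoning penalizes a future containing $N$ happy lives by a factor proportional to $1/N$, then each future's contribution to expected utility is:

$$
\frac{1}{N}\cdot U(N) \;=\;
\begin{cases}
1, & U(N)=N \quad \text{(linear utility: neatly cancels)}\\[4pt]
\dfrac{\log N}{N} \xrightarrow{\;N\to\infty\;} 0, & U(N)=\log N \quad \text{(concave utility: more than cancels)}
\end{cases}
$$

So under linear utility the penalty and the stakes exactly offset, while under concave utility large futures get discounted even harder than the anthropic penalty alone would suggest.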

I for one have seriously investigated all those weird philosophical ideas you mentioned. ;) And I think our community has been pretty good about taking these ideas seriously, especially compared to, well, literally every other community, including academic philosophy. Our Overton window definitely includes all these ideas, I'd say.

But I agree with your general point that there is a tension we should explore. Even if we are OK with seriously discussing these ideas, we often don't actually live by them. Our Overton window includes them, but our median opinion doesn't. Why not?

I think there is a good answer, and it has to do with humility/caution. Philosophy is weird. If you follow every argument where it leads, you very quickly find that your beliefs don't add up to normality, or anything close. Faith that beliefs will (approximately) add up to normality seems to be important for staying sane and productive, and moreover it seems to have been vindicated often in the past: crazy-sounding arguments turn out to have flaws in them, or maybe they work but there is an additional argument we hadn't considered that combines with them to add up to normality.

Implications of the Doomsday Argument for x-risk reduction

Yeah, it depends on how you define extinction. I agree that most simulations don't last very long. (You don't even need the doomsday argument to get that conclusion, I think.)

Implications of the Doomsday Argument for x-risk reduction

I'd like to see someone explore the apparent contradiction in more detail. Even if I were convinced that we will almost certainly fail, I might still prioritize x-risk reduction, since the stakes are so high.
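
As a toy illustration of the stakes point (the numbers here are made up for the sake of the example): with expected value $p \cdot V$, even a tiny success probability can dominate when the stakes are astronomical.

$$
\mathbb{E}[\text{value}] \;=\; p \cdot V \;=\; 10^{-6} \times 10^{16}\ \text{lives} \;=\; 10^{10}\ \text{lives in expectation.}
$$

So "we will almost certainly fail" and "x-risk reduction is the top priority" aren't obviously in contradiction; the real question is whether such tiny probabilities should move us at all.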

Anyhow, my guess is that most people think the doomsday argument probably doesn't work. I am not sure myself. If it does work though, its conclusion is not that we will all go extinct soon, but rather that ancestor simulations are one of the main uses of cosmic resources.

What achievements have people claimed will be warning signs for AGI?

AI Impacts has a list of reasons people give for why current methods won't lead to human-level AI, with sources. It's not exactly what you are looking for, but it's close, because most of these could be inverted and used as warning signs for AGI. E.g., "Current methods can't build good, explanatory causal models" becomes "When we have AI that can build good, explanatory causal models, that's a warning sign."

Call for volunteers: assessing Kurzweil, 2019

I'd be happy to volunteer a bit. I don't have much time, but this sounds fun, so maybe I could do a few.

Blog Post Day II Retrospective

OK, so you + Gyrodiot are making me think maybe I should do another one soon. But to be honest I need to focus less on blogging and more on working for a bit, so I personally won't be ready for at least a few weeks, I think.

Whenever it happens, I should schedule it far in advance, I think. That way people will have more of a chance to find out about it.

Three Kinds of Competitiveness

Oh right, how could I forget! This makes me very happy. :D

Three Kinds of Competitiveness

Good point about inner alignment problems being a blocker to date-competitiveness for IDA... but aren't they also a blocker to date-competitiveness for pretty much every other alignment scheme? What alignment schemes don't suffer from this problem?

I'm thinking "do anything useful that a human with a lot of time can do" is going to be substantially less capable than full-blown superintelligent AGI. However, that's OK, because we can use IDA as a stepping-stone toward it: IDA gets us an aligned system substantially more capable than a human, and we use that system to solve the alignment problem and build something even better.

It's interesting how Paul advocates merging cost- and performance-competitiveness, and you advocate merging performance- and date-competitiveness. I think it's fine to just talk about "competitiveness" full stop, and only bother to specify what we mean more precisely when needed. Sometimes we'll mean one of the three, sometimes two of the three, sometimes all three.
