Sorted by New

Wiki Contributions


Not aligned on values, beliefs and moral intuitions. Plenty of humans would not kill all people alive if given the choice but there are some who would. I think the existence of doomsday cults that have tried to precipitate an armageddon give support to this claim.

The effort from going from Chimp to Human was marginally lower but still took a huge amount of effort. It was maybe 5 million years since the last common ancestor between Chimps and Humans and taking a generation to be like 20 years that's at least 250,000 generations of a couple of thousand individuals in a complex environment with lots of processes going on. I haven't done the math but that seems like a massive amount of computation. To go from human to Von Neumann still takes a huge search process. If we think of every individual human as consisting of evolution trying to get more intelligence there are almost 8 billion instances being 'tried' right now in a very complex environment. Granted that if humans were to run this process it may take a lot less time. If say breeding and selection of the most intelligent individuals in every generation was done it may take a lot less time to get human level intelligence if starting from a chimp.

What is the theory of change of the AI Safety field and why do you think it has a high probability to work?

Human beings are not aligned and will possibly never be aligned without changing what humans are. If it's possible to build an AI as capable as a human in all ways that matter, why would it be possible to align such an AI?

Evolution is massively parallelized and occurs in a very complex, interactive, and dynamic environment. Evolution is also patient, can tolerate high costs such as mass extinction events and also really doesn't care about the outcome of the process. It's just something that happens and results in the filtering of the most fit genes. The amount of computation that it would take to replicate such complex, interactive, and dynamic environments would be huge. Why should we be confident that it's possible to find an architecture for general intelligence a lot more efficiently than evolution? Wouldn't it also always be more practically expedient creating intelligence that does the exact things we want, even if we could simulate the evolutionary process why would we do it?

Who are the AI Capabilities researchers trying to build AGI and think they will succeed within the next 30 years?

Great point! Though for what it's worth I didn't mean to be dismissive of the prediction, my main point is that the future has not yet been determined. As you indicate people can react to predictions of the future and end up on a different course.

I'm still forming my views and I don't think I'm well calibrated to state any probability with authority yet. My uncertainty still feels so high that I think my error bars would be too wide for my actual probability estimates to be useful. Some things I'm thinking about:

  • Forecasters are not that great at making forecasts greater than 5 years out according to Superforecasting IIRC and I don't think AGI is going to happen within the next 5 years.
  • AGI has not been created yet and its possible that AI development gets derailed due to other factors e.g.:
    • Political and economic conditions change such that investment in AI slows down.
    • Global conflict exacerbates which slows down AI (maybe this speeds it up but I think there would be other pressing needs when a lot of resources has to be diverted to war)
    • Other global catastrophic risks could happen before AGI is developed i.e. should I be more scared of AGI than say nuclear war or GCBRs at this point (not that great but could still happen)
    • On the path to AGI there could be a catastrophic failure that kills a few people but can be contained but gets people really afraid of AI.
  • Maybe some of the work on AI safety ends up helping produce mostly aligned AI. I'm not sure if everyone dies if an AI is 90% aligned.
  • Maybe the AGI systems that are built don't have instrumental convergence maybe if we get AGI through CAIS which seems to me like the most likely way we'll get there.
  • Maybe like physics once the low hanging fruit has been plucked then it takes a while to make breakthroughs which extends the timelines
  • For me to be personally afraid I'd have to think this was the primary way I would die which seems unlikely given all the other ways I could die between now and if/when AGI is developed.
  • AI researchers, who are the people that most likely believe that AGI is possible more than anyone else, don't have consensus when it comes to this issue. I know experts can be wrong about their own fields but I'd expect them to be more split on the issue(I don't know what the current status is now just know what it was in the Grace et. al survey). I know very little about AGI, should I be more concerned than AI researchers are? 

I still think it's important to work on AI Safety since even a small chance that AGI could go wrong would still have a high expected value in terms of the negative outcome. I think most of my thinking comes from the fact that I think it is more probable that there will be a slow take off instead of a fast take off. I may also just be bad at being scared or feeling doomed.

What are some relatively-likely examples of future possible observations that would make you think AGI is every likely to kill everyone?

People start building AI that is agentic and open ended in its actions.

Would you expect to make observations like that well in advance of AGI (if doom is in fact likely), such that we can expect to have plenty of time to prepare if we ever have to make that future update?

Yes, because I think the most likely scenario is a slow take off. This is because it costs money to scale compute and we actually need to validate and the more complex a system the harder it is to build correctly, probably takes a few iterations to get things to work well enough that it can be tested against a benchmark before moving on to trying to get a system to have more capability. I think this process will have to happen many times before getting to AI that is dangerous and on the way I'd expect to start seeing some interesting agentic behavior with short-horizon planning.

Or do you think we're pretty screwed, evidentially speaking, and can probably never update much toward 'this is likely to kill us' until it's too late to do anything about it?

I think the uncertainty will be pretty high until we start seeing sophisticated agentic behavior. Though I don't think we should wait that long to try come up with solutions since I think a small chance that this could happen still warrants concern.

Load More