Thomas Larsen


An anonymous academic wrote a review of Joe Carlsmith's 'Is power seeking AI an existential risk?', in which the reviewer assigns a <1/100,000 probability to AI existential risk. The arguments given aren't very good imo, but it may be worth reading.

Just made a fairly large edit to the post after lots of feedback from commenters.  My most recent changes include the following: 

  • Note limitations in introduction (lacks academics, depth not balanced in proportion to number of people, not endorsed by researchers)
  • Update CLR as per Jesse's comment
  • Add FAR 
  • Update brain-like AGI to include this
  • Rewrite shard theory section 
    • Brain <-> shards 
  • effort: 50 -> 75 hours :)
  • Add this paper to DeepMind
  • Add some academics (David Krueger, Sam Bowman, Jacob Steinhardt, Dylan Hadfield-Menell, FHI)
  • Add other category 
  • Summary table updates:
    • Update links in table to make sure they work. 
    • Add scale of organization 
    • Add people

Thank you to everyone who commented, it has been very helpful. 

Good point, I've updated the post to reflect this. 

I'm excited for your project :) 

Good point. We've added the Center for AI Safety's full name to the summary table, which should help.

Thanks for the update! We've edited the section on CLR to reflect this comment, let us know if it still looks inaccurate.

not all sets all sets of reals which are bounded below have an infimum

Do you mean 'all sets of reals which are bounded below DO have an infimum'?

In the model-based RL setup, we are planning to give it actions that can directly modify the game state in any way it likes. This is sort of like an arbitrarily powerful superpower, because it can change anything it wants about the world, except of course that this is a Cartesian environment and so it can't, e.g., recursively self-improve.

With model-free RL, this strategy doesn't obviously carry over, so I agree that we are limited to easily codeable superpowers.
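
To make the "directly modify the game state" idea concrete, here is a minimal sketch, assuming a toy grid world. Everything in it (`ToyGridState`, `SetTile`, `apply_action`) is a hypothetical name made up for illustration, not the actual project code. The point is only that the action can rewrite any part of the environment state, while the agent itself sits outside that state.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

Cell = Tuple[int, int]


@dataclass
class ToyGridState:
    """Hypothetical game state: a mapping from grid cells to tile names."""
    tiles: Dict[Cell, str] = field(default_factory=dict)


@dataclass(frozen=True)
class SetTile:
    """Hypothetical 'superpower' action: overwrite any cell with any tile."""
    cell: Cell
    tile: str


def apply_action(state: ToyGridState, action: SetTile) -> ToyGridState:
    """Return a new state with one cell overwritten.

    The action can only touch `state`. The agent's own code and weights
    are not part of the state, which is the Cartesian boundary: the agent
    can reshape the world but cannot edit itself.
    """
    new_tiles = dict(state.tiles)          # copy so the old state is untouched
    new_tiles[action.cell] = action.tile   # arbitrary edit to the world
    return ToyGridState(tiles=new_tiles)


# Usage: turn a wall into a coin in a single action.
state = ToyGridState(tiles={(0, 0): "agent", (1, 0): "wall"})
state = apply_action(state, SetTile(cell=(1, 0), tile="coin"))
```

A model-free agent has no such state-editing interface by default, which is why this doesn't obviously carry over there.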

Strong upvoted, and I quite like this anecdote. I will work on adding my guess of the scale of these orgs to the table.

Hi Adam, thank you so much for writing this informative comment. We've added your summary of FAR to the main post (and linked this comment). 

Agree with both aogara's and Eli's comments.

One caveat would be that papers probably don’t have full explanations of the x-risk motivation or applications of the work, but that’s reading between the lines that AI safety people should be able to do themselves.

For me this reading between the lines is hard: I spent ~2 hours reading academic papers/websites yesterday, and while I could quite quickly summarize the work itself, it was quite hard for me to figure out the motivations.
