Historically, it has been difficult to persuade people of the likelihood of AI risk because the examples tend to sound “far-fetched” to audiences not bought in on the premise. One particular problem with many traditional framings for AI takeover is that most people struggle to imagine how e.g. “a robot...
It seems like we keep getting LLMs that are better and better at getting the point of fairly abstract concepts (e.g. understanding jokes). As compute increases and their performance improves, it seems increasingly likely that human “values” are within the class of not-that-heavily-finetuned LLMs. For example, if I prompted a...
[EDIT 4/10/2022: This post was rash and ill-conceived, and did not have clearly defined goals nor meet the vaguely-defined ones. I apologize to everyone on here; you should probably update accordingly about my opinions in the future. In retrospect, I was trying to express an emotion of exasperation related to...
The Natural Abstractions Hypothesis states that a wide class of models will, simply by observing the world, in the limit build abstractions that are similar to those built by humans. For example, the concept “tree” is useful to a superintelligence because it is predictive of a frequently occurring phenomenon (there...