bioshok
I regularly tweet in Japanese about AI alignment and the long-term risks of AGI.
Footnote 68 of the Forethought paper employs a Cobb-Douglas R&D production function (σ = 1) in its quantitative analysis of a technology explosion, with cognitive labor exponent γ = 0.7 derived from NSF R&D expenditure data. Under this assumption, an explosive increase in cognitive capability can produce centuries of technological progress even with limited physical R&D capital.
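For concreteness, here is a minimal sketch of the functional forms at issue (the notation is mine, not the footnote's), using a standard semi-endogenous idea-production setup with cognitive labor $L$ and research capital $K$:

$$\dot{A} = A^{\phi} L^{\gamma} K^{1-\gamma}, \qquad \gamma = 0.7 \quad \text{(Cobb-Douglas, the } \sigma = 1 \text{ case)}$$

$$\dot{A} = A^{\phi} \left[ \gamma L^{\frac{\sigma-1}{\sigma}} + (1-\gamma) K^{\frac{\sigma-1}{\sigma}} \right]^{\frac{\sigma}{\sigma-1}} \quad \text{(general CES with elasticity } \sigma\text{)}$$

As $\sigma \to 1$ the CES form reduces to Cobb-Douglas; for $\sigma < 1$, labor and capital are gross complements, so the scarcer input limits idea output.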
However, Growiec, McAdam, and Mućk (2023, Kansas City Fed) directly estimated the elasticity of substitution between R&D labor and R&D capital in the idea production...
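To make the stakes of that elasticity concrete, here is a quick numeric sketch (the value σ = 0.3 is purely illustrative, not the paper's estimate): under Cobb-Douglas, exploding cognitive labor raises idea output without bound even with fixed capital, whereas under CES with σ < 1, fixed research capital caps output.

```python
import numpy as np

def ideas_output(L, K, gamma=0.7, sigma=1.0):
    """CES combination of cognitive labor L and research capital K.

    sigma = 1 reduces to the Cobb-Douglas case from Footnote 68;
    sigma < 1 makes L and K gross complements, so fixed K becomes a bottleneck.
    """
    if np.isclose(sigma, 1.0):
        return L ** gamma * K ** (1 - gamma)  # Cobb-Douglas limit
    rho = (sigma - 1) / sigma
    return (gamma * L ** rho + (1 - gamma) * K ** rho) ** (1 / rho)

K = 1.0  # research capital held fixed
for L in [1.0, 1e3, 1e6]:  # cognitive labor exploding
    cd = ideas_output(L, K, sigma=1.0)   # grows without bound (~L**0.7)
    ces = ideas_output(L, K, sigma=0.3)  # saturates near 0.3**(-3/7) ~ 1.67
    print(f"L={L:9.0e}  Cobb-Douglas={cd:10.1f}  CES(sigma=0.3)={ces:6.3f}")
```

With σ = 0.3 the CES output plateaus around 1.67 no matter how large L gets, which is why the centuries-of-progress conclusion is sensitive to the σ = 1 assumption.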
In the context of Deceptive Alignment, would the ultimate goal of an AI system appear, from a human perspective, random and uncorrelated with the training objective? Or would humans be able to see that the goal is at least somewhat correlated with the training objective?
For instance, the article below states that "the model just has some random proxies that were picked up early on, and that's the thing that it cares about." To what extent would the model actually learn random proxies?
https://www.lesswrong.com/posts/A9NxPTwbw6r6...
I would like to know about the history of the term "AI alignment". I found an article written by Paul Christiano in 2018. Did the use of the term start around this time? Also, what is the difference between AI alignment and value alignment?
https://www.alignmentforum.org/posts/ZeE7EKHTFMBs8eMxn/clarifying-ai-alignment
This is a wonderful essay, and really interesting. I have one question. I accept the possibility of an intelligence explosion, but I'd like to understand in more detail the scenario you describe, as in AI 2027, in which several centuries of technological progress occur within just 1–2 years. I'm not skeptical of a superintelligence-driven technology explosion; I simply want to better understand your reasoning.
What I want to understand is how much of an “industrial explosion” (that is, an explosion in research capital) is required for ...