I think quantified intuitions is a reasonable, although incomplete, version of what you describe. It focuses specifically on scope insensitivity rather than a traditional rationalist curriculum.
Markets pricing in AGI is also conditional on markets believing that something like the current legal/property-rights system will continue to hold after AGI. If it is possible that AI is a bubble, and it's not obvious that you win anything even if you get the AGI trade right, then traders won't "price in" AGI even if it is extremely economically valuable and coming soon.
My argument is also not that markets won't price in AI in its current form, or its increasing capabilities; it is specifically about the phase shift at the point where we actually have strong AGI systems.
I disagree with your post, but I will add an additional example: falling birthrates. I don't remember which of his essays it was in (probably in Fanged Noumena), but Nick Land posits that the technocapital system of capitalism, which he views as being AGI, has figured out that it won't need humans much longer and thus has no incentive to keep birth rates up. I obviously do not literally believe this, but I think it helps illustrate what you're trying to describe.
I know this is 7 months late! But I read this shortform yesterday and it somewhat resonated with me. And then today I read Noah Smith's most recent blog post, which perfectly described what I think you're getting at, so I'm linking it here.
Why trust your prior over the prior of the market/hedge funds? By this I mean: why expect that this isn't already priced in? AI (and AGI) is a big enough news story now that I would expect hedge funds to be thinking about things like this. At recruiting events, I've asked quants how they're thinking about this exact question, and I usually got pretty decent, AGI-pilled responses.
It is certainly possible that the market hasn't priced this in, but my prior is that in the vast, vast majority of cases, some quant has already sucked out any potential gains one could get.
I'm also a college student who has been wrestling with this question for my entire undergrad. In a short timelines world, I don't think there are very good solutions. In longer timelines worlds, human labor remains economically valuable for longer.
I have found comfort in the following ideas:
1) The vast majority of people (including the majority of wealthy, white-collar, college-educated people) are in the same boat as you. The distribution of how AGI unfolds is likely to be so absurd that it's hard to predict what holds value afterward. Does money still matter after AGI/ASI? What kinds of capital matter after AGI/ASI? These questions are far from obvious to me. If you...
Claude's rebuttal is exactly my claim. If major AI research breakthroughs could be done in 5 hours, then imo robustness wouldn't matter as much: you could run a bunch of models in parallel and see what happens (this is part of why models are so good at olympiads). But an implicit part of my argument/crux is that AI research is necessarily deep, meaning you need to string some number of successfully completed tasks together to get an interesting final result. And if the model messes up one part, your chain breaks. Not only does this give you weird results, but it breaks your chain of causality[1], which is essential for...
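To make the parallel-versus-deep distinction concrete, here is a minimal toy sketch (the per-step success rate and chain lengths are assumptions of mine, not numbers from the thread): best-of-n parallel sampling rescues a single shallow step, while a deep serial chain collapses at the same per-step rate.

```python
p = 0.8  # assumed per-step success rate (illustrative)

# Shallow task, parallel sampling: only one of n attempts needs to succeed.
for n in (1, 5, 20):
    print(f"best-of-{n} on a single step: {1 - (1 - p) ** n:.4f}")

# Deep research chain: all k steps must succeed; one failure breaks the chain.
for k in (1, 10, 50):
    print(f"{k}-step chain: {p ** k:.2e}")
```

Parallelism drives a single step's success toward 1, but it does nothing to stop a 50-step chain from failing almost surely.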
METR should test for a 99.9% task-completion rate (in addition to the current 80% and 50%). A key factor holding back LLMs' economic impact is that they're just not robust enough. This is analogous to the problem of self-driving: every individual component of self-driving is ~solved, but stringing the components together results in a non-robust final product. I believe that fully automating research/engineering will require nines of reliability that we just don't have, and testing for nines of reliability could be done by giving the model many very short time-horizon tasks and seeing how it performs.
This can be further motivated by considering what happens if we string together...
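As a rough illustration of that compounding (the rates and chain lengths below are assumptions of mine, not METR data), here is the arithmetic for a chain of independent tasks at a fixed per-task completion rate:

```python
# Illustrative only: chance that every task in an m-task chain succeeds,
# assuming independent tasks at a fixed per-task completion rate.
for rate in (0.5, 0.8, 0.999):
    for m in (10, 100):
        print(f"rate={rate}: {m}-task chain succeeds with p={rate ** m:.3g}")
```

At 80% per task, a 100-task chain essentially never completes (~2e-10); at 99.9%, it still completes about 90% of the time. That gap is why the extra nines matter.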
I think this issue of "9s" of reliability should update people towards longer timelines. Tesla FSD has been able to do each individual thing we would call self-driving for the last ~4 years, but it isn't 99.99...% reliable. I think LLMs replacing human work will, by default, follow the same pattern.
TL;DR: We apply mechinterp techniques to VPT, OpenAI's Minecraft agent. We also find a new case of goal misgeneralization - VPT kills a villager when we force one to stand under some tree leaves.
Abstract: Understanding the mechanisms behind decisions taken by large foundation models in sequential decision making tasks is critical to ensuring that such systems operate transparently and safely. In this work, we perform exploratory analysis on the Video PreTraining (VPT) Minecraft playing agent, one of the largest open-source vision-based agents. We aim to illuminate its reasoning mechanisms by applying various interpretability techniques. First, we analyze the attention mechanism while the agent solves its training task - crafting a diamond pickaxe. The agent...
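For readers curious what "analyzing the attention mechanism" can look like in practice, here is a minimal, generic PyTorch sketch; the layer, input shapes, and probe are placeholders of my own, not VPT's actual architecture or the paper's code.

```python
import torch
import torch.nn as nn

# Toy stand-in for a single attention layer inside a vision-based policy.
attn = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(1, 16, 64)  # (batch, tokens, dim), e.g. frame patch embeddings

# need_weights=True also returns the attention map (averaged over heads by default).
_, weights = attn(x, x, x, need_weights=True)
print(weights.shape)  # torch.Size([1, 16, 16]): each query token's attention over key tokens

# One simple probe: which token receives the most attention, on average.
print(weights.mean(dim=1).argmax(dim=-1))
```

Inspecting which image patches the policy attends to at each step is one common starting point for this kind of agent interpretability.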
My prediction for the next few years (or until AGI) is that computer science talent will start to become winner-take-all. The majority of the job of software engineering will be automated (if it isn't already). There are still robustness pockets that human software engineers can help fill for now. But I expect top AI researchers to continue making exorbitant amounts of money, even if little software engineering is involved. So computer science talent will start to loosely resemble professional sports in competitiveness, with a step-function change in compensation between those who "make it" and those who don't. And the hiring bar will be extremely high. You can already see glimpses of this in the weight people put on Olympiad contestants when recruiting.