All of Hyperion's Comments + Replies

This is only an intuition, based on speaking with researchers working on LLMs, but I think OAI believes that a model can simultaneously be good enough at next-token prediction to assist with research and still be very far from being a powerful enough optimizer to realise that it is being optimized for a goal, or that deception is an optimal strategy, since the latter two capabilities require much more optimization power. On that view, the default state of cutting-edge LLMs for the next few years is to have GPT-3 levels of deception (essentially none) and graduate-student levels of research-assistant ability.

I don't think it's odd at all: even a terrible chess bot can outplay almost all humans, because most humans haven't studied chess. MATH is a dataset of problems from high school competitions, which are well known to require a very limited set of math knowledge and to be solvable by applying simple algorithms.

I know chain of thought prompting well. It's not a way to lift a fundamental constraint; it's just a more efficient way of targeting the weights that represent what you want in the model.

It really isn't hard. No new paradigms are required. The proo...

I think you may underestimate the difficulty of the MATH dataset. It's not IMO-level, obviously, but from the original paper: Clearly this is not a rigorous evaluation of human ability, but the dataset is far from trivial. Even if it's not winning IMO golds yet, this level of capability is not something I would have expected to see managed by an AI that provably cannot multiply in one step (if you had asked me in 2015). {Edit: to further support that this level of performance on MATH was not obvious, this comes from the original paper: Further, I'd again point to the Hypermind prediction market for a very glaring case of people thinking 50% on MATH was going to take more time than it actually did. I have a hard time accepting that this level of performance was actually expected without the benefit of hindsight.} It was not targeted at time complexity, but it unavoidably involves it and provides some evidence for its contribution. I disagree that I've offered no evidence: the arguments from complexity are solid, there is empirical research confirming the effect, and CoT points in a compelling direction. I can understand if you find this part of the argument a bit less compelling. I'm deliberately avoiding details until I'm more confident that it's safe to talk about. (To be clear, I don't actually think I've got the Secret Keys to Dooming Humanity or something; I'm just trying to be sufficiently paranoid.) I would recommend making concrete predictions on the 1-10 year timescale about performance on these datasets (and on more difficult datasets).

I mean, to me all this indicates is that our conception of "difficult reasoning problems" is wrong and incorrectly linked to our conception of "intelligence". Like, it shouldn't be surprising that the LM can solve problems in text which are notoriously based around applying a short step by step algorithm, when it has many examples in the training set.

To me, this says that "just slightly improving our AI architectures to be less dumb" is incredibly hard, because the models that we would have previously expected to be able to solve trivial arithmetic problems if they could do other "harder" problems are unable to do that.

I'm not clear on why it wouldn't be surprising. The MATH dataset is not easy stuff for most humans. Yes, it's clear that the algorithm used in the cases where the language model succeeds must fit in constant time and so must be (in a computational sense) simple, but it's still outperforming a good chunk of humans. I can't ignore how odd that is. Perhaps human reasoning is uniquely limited in tasks similar to the MATH dataset, AI consuming it isn't that interesting, and there are no implications for other types of human reasoning, but that's a high complexity pill to swallow. I'd need to see some evidence to favor a hypothesis like that.
1. It was easily predictable beforehand that a transformer wouldn't do well at arithmetic (and all non-constant time algorithms), since transformers provably can't express it in one shot. Every bit of capability they have above what you'd expect from 'provably incapable of arithmetic' is worth at least a little bit of a brow-raise.
2. Moving to non-constant time architectures provably lifts a fundamental constraint, and is empirically shown to increase capability. (Chain of thought prompting does not entirely remove the limiter on the per-iteration expressible algorithms, but makes it more likely that each step is expressible. It's a half-step toward a more general architecture, and it works.)
3. It really isn't hard. No new paradigms are required. The proofs of concept are already implemented and work. It's more a question of when one of the big companies decides it's worth poking with scale.

Mostly Discord servers, in my experience: EleutherAI is a big, well-known one, but there are others with high concentrations of top ML researchers.

I happened to be reading this post today, as Science has just published a story on a fabrication scandal regarding an influential paper on amyloid-β:

I was wondering if this scandal changes the picture you described at all?

Not a ton. I'd also recommend this article, including the discussion in the comments by researchers in the field. A crucial distinction I'd emphasize, which is almost always lost in popular discussions, is that between the toxic amyloid oligomer hypothesis, that aggregates of amyloid beta are the main direct cause of neurodegeneration, and the ATN hypothesis I described in this thread, that amyloid pathology causes tau pathology and tau pathology causes neurodegeneration. The former is mainly what this research concerns and has, in my opinion, been largely discredited since approximately 2012; the latter has a mountain of evidence in its favor, as I've described, and that hasn't really changed now that one line of evidence for an importantly different hypothesis has turned out to be fabricated.