GPT-3: a disappointing paper

Reading this, I get the impression you have mismanaged expectations of what GPT-3 would do (i.e., that the name should only be reserved for essentially pseudo-AGI)... but scaling GPT to the point of diminishing returns is going to take several more years. As everyone is stressing, they don’t even fit the training data at the moment.

GPT-2 was a hype fest, while this gets silently released on arXiv. I’m starting to think there’s something real here. Before, I’d have laughed at anyone who suggested GPT-2 could reason. I still think that’s true with GPT-3, but I wouldn’t laugh anymore. It seems possible massive scaling could legitimately produce a different kind of AI than anything we’ve seen yet.

While I’m not sure how easy plugging this into a deep reinforcement learning (DRL) algorithm will be, it seems to be the obvious next step. On the other hand, I suspect DRL isn’t really mature enough to work as an integrating paradigm.

I asked a related question and got some answers about finding things on the internet. They didn’t completely satisfy me, but my question was significantly vaguer, so they might help you!


I think that a hidden assumption here is that improving in a weak skill always has a positive spillover effect on other skills. There might be a hidden truth within this: sometimes unlearning things will be the best way to make progress.


Perhaps this can be connected with another recent post. It was pointed out in Subspace Optima that when we optimize, we do so under constraints, external or internal. It seems like you had an internal constraint stopping you from optimizing over the whole space. Instead, you focused on what you thought was the most correlated trait. This almost reads like an insight following the realization that you’ve been optimizing a skill along an artificial subspace.

Do this at the end of the day as a way to review progress?

Can I get clarification on what sort of emotions and/or reactions were problematic? I’m wondering if this was rumination or in-the-moment reactions.

Just a meta-comment: without a description of the feed, I found myself very unlikely to look at the URL.

