Is this the beginning of the end for LLMS [as the royal road to AGI, whatever that is]?

Out of curiosity, I skimmed the Ted Gioia linked article and encountered this absolutely wild sentence:

AI is getting more sycophantic and willing to agree with false statements over time.

which is just such a complete misunderstanding of the results from Discovering Language Model Behaviors with Model-Written Evaluations. Instantly disqualified the author from being someone I'd pay attention to for AI-related analysis.

[-]frontier642y30

I don't think the body of this post is related to the title. Whether a framework outlines a path to AGI has little to do with consumer takeup of an earlier product based on the same framework.

[-]Paul Tiplady2y32

While of course this is easy to rationalize post hoc, I don’t think falling user count of ChatGPT is a particularly useful signal. There is a possible world where it is useful; something like “all of the value from LLMs will come from people entering text into ChatGPT”. In that world, users giving up shows that there isn’t much value.

In this world, I believe most of the value is (currently) gated behind non-trivial amounts of software scaffolding, which will take man-years of development time to build. Things like UI paradigms for coding assistants, experimental frameworks and research for medical or legal AI, and integrations with existing systems.

There are supposedly north of 100 AI startups in the current Y Combinator batch; the fraction of those that turn into unicorns would be my proposal for a robust metric to pay attention to. Even if it’s par for startups that’s still a big deal, since there was just a major glut in count of startups founded. But if the AI hype is real, more of these than normal will be huge.

Another similar proxy would be VC investment dollars; if that falls off a cliff you could tell a story that even the dumb money isn’t convinced anymore.

[-]Bill Benzon2y10

While of course this is easy to rationalize post hoc, I don’t think falling user count of ChatGPT is a particularly useful signal.

I agree with that. Perhaps those who've dropped off were casual users and have become bored. But there are other complaints. The continued existence of confabulation seems more troublesome. OTOH, I can imagine that coding assistance will prove viable. As I said, the situation is quite volatile.

[-]Nate Showell2y21

Some other possible explanations for why ChatGPT usage has decreased:

The quality of the product has declined over time
People are using its competitors instead

[-]gwern2y*201

There's a lot one could say about this claim:

Recall here the numbers here are substantially fake. The best they can tell you is roughly "this is very big" or "this is very small". If you want to go much beyond that, you are reading sheep entrails. They are not coming from OpenAI but web traffic measurements. Such measurements are notoriously both noisy and highly biased and the biases change over time, and so unsurprisingly, at the time, OAers were saying it had overestimated the actual users by like 100%.
The numbers have lots of ways to be misleading. For example, because Chinese DL was, and still is, so inferior, there was a whole cottage industry of Chinese companies pirating accounts and black markets in credentials. This also applied to all the Third World or embargoed or difficult countries OA has denied access to, either for paying accounts or just period. Then you have people abusing it or sexing with it and getting banned and figuring out how to create new accounts, or moving on.

Just a huge amount of whac-a-mole going on. This obviously causes issues for interpreting any user metrics: a huge decrease in user count might actually reflect a huge increase in users, and vice-versa, depending on how the security arms race is going.
Summer vacation. If you look at the graph, you'll notice a remarkable correlation with the Western academic calendar... Someone confidently proclaiming that ChatGPT use is crashing before seeing the September–November 2023 numbers is giving hostage to fortune.

EDIT: as of 1 October, SimilarWeb and other sources are reporting increasing traffic.
Many, many alternatives spinning up, many of which are using OA as the backend, particularly after the large price drops. These would count in a naive web-traffic approach as OA 'losing users', rather than gaining them.

More broadly, if they are going to true competitors like Claude-2 and do not count as OA users by any definition, that's still damaging to Goia's thesis that 'generative AI is useless' - sure, maybe it's not great for OA but it shows that the users are getting value out of generative AI, and just that they found a better way to get that value.

Personally, I would say that a simple test of OA users/activity would be to look at how they act. OA is presumably getting regular large shipments of GPUs installed into datacenters as fast as MS money can buy them; if OA usage is flat for several months - never mind crashing! - then they should be 'enjoying' a glut of GPUs by now and acting accordingly. Does OA look like it has an embarrassment of GPUs? Or does it look like it is struggling to add capacity as necessary to keep up with constant user growth, and holding back major improvements because it can't afford them, and focusing on optimizing models (even to the detriment of quality) to get more out of its existing GPUs?

[-]Bill Benzon2y10

Thanks for this. Very useful.

[-]Paul Tiplady2y21

Confabulation is a dealbreaker for some use-cases (e.g. customer support), and potentially tolerable for others (e.g. generating code when tests / ground-truth is available). I think it's essentially down to whether you care about best-case performance (discarding bad responses) or worst-case performance.

But agreed, a lot of value is dependent on solving that problem.

[-]Bill Benzon2y10

As sort of an aside, in some way I think the confabulation is the default mode of human language. We make stuff up all the time. But we have to coordinate with others too, so that places constraints on what we say. Those constraints can be so binding that we've come to think of this socially constrained discourse as 'ground truth' and free of the confabulation impulse. But that's not quite so.

[-]Quintin Pope2y30

Seems contradictory to argue both that generative AI is useless and that it could replace millions of jobs.

[-]Bill Benzon2y10

You're probably right. I note, however, that this is territory that's not been well-charted. So it's not obvious to me just what to make of the inconsistency. It doesn't (strongly) contradict Gioia's main point, which is that LLMs seem to be in trouble in the commercial sphere.

[-]MattJ2y10

I think the auther ment that there was a perception that it could replace millions of jobs, and so an incentive for business to press forward with their implementation plans, but that this would eventually back fire if the hallucination problem is insoluble.

[-]Quintin Pope2y20

Perhaps, but that's not the literal meaning of the text.

Here’s what we now know about AI:
[...]
AI potentially creates a situation where millions of people can be fired and replaced with bots [...]

[-]MattJ2y2-1

Yes, but that ”generative AI can potentially replace millions of jobs” is not contradictory to the statement that it eventually ”may turn out to be a dud”.

I initially reacted in the same way as you to the exact same passage but came to the conclusion that it was not illogical. Maybe I’m wrong but I don’t think so.

[-]Bill Benzon2y20

You're right, and I don't know what Gioia would say if pressed. But it might be something like: "Millions of people will be replaced by bots and then the businesses will fall apart because the bots don't behave as advertised. So now millions are out of jobs and the businesses that used to employ them are in trouble."

[+][comment deleted]2y31

LESSWRONG
LW

LESSWRONG
LW

3

Is this the beginning of the end for LLMS [as the royal road to AGI, whatever that is]?

3

3