I mentioned before that Google search is less than satisfactory even on uncomplicated queries: https://www.lesswrong.com/posts/naAs59xiGfr7fPjej/google-search-as-a-washed-up-service-dog-i-halp

The linked article compares accuracy and usefulness of Google search with ChatGPT on some quite reasonable queries, and the latter seems to win hands down, even gimped by the lack of internet access (h/t https://twitter.com/emollick/status/1605420852635488258). I am not sure what to make of it, since Google has its own AI lab that is apparently at least as advanced and with better staffing and funding... 


I think the article undersells the problem of ChatGPT's hallucinations. One example from the article where ChatGPT is said to win is a recipe for risotto. However, I wouldn't follow a risotto recipe from ChatGPT, just because I can't be confident it hasn't hallucinated some portion of the recipe, but I would happily follow one from Google, even if the format is a bit more annoying. The same issue applies to calculating the load-bearing capacity of a beam, only more seriously!

Having said that, it does seem like there are definitely specific areas where ChatGPT will be more useful. Coding is a good example as verifying the code is usually straightforward and/or would need to be done anyway. In many cases ChatGPT for an overview followed by Google for more detail/verification is probably a good way to go - I think this would be a good idea for the load bearing wall example.

However, I wouldn't follow a risotto recipe from ChatGPT, just because I can't be confident it hasn't hallucinated some portion of the recipe, but I would happily follow one from Google, even if the format is a bit more annoying.

I agree. I can also confirm that ChatGPT is indeed making stuff up even in the recipe linked in the article: the traditional risotto recipe used in Italy doesn't include garlic.

  1. It would be helpful to compare Google to ChatGPT on the actual distribution of queries to Google. I'd guess only a small % of queries to Google are about algorithms (e.g. the doubly-linked lists question). IIRC many queries are not really searches but just looking for specific URLs (e.g. people search "facebook" or "wordle"). Also very common is natural-language translation (ChatGPT can't translate whole websites right now). Finally, many searches are for local information that ChatGPT can't offer at all right now ("weather", "food near me"). Programmers and other power users probably have different usage distributions than the average user.

  2. That said, I think less internet-savvy people could find web-enabled ChatGPT better for many queries because having a back-and-forth conversation with ChatGPT about some issue (e.g. tech support, medical, how to cook something) is easier than doing a search, opening a few tabs with the results and skimming them, modifying the search if the results weren't great (e.g. adding "reddit"), collating information from multiple pages, etc.

"How do I make risotto?" is only a little bit about algorithms. And yeah, being able to have a real discussion with the bot helps a lot.

have you tried metaphor.systems? personally I find that neither chatgpt nor google can hold a candle to metaphor for the queries metaphor works well on. let me know if I've mentioned it too many times and should shut up already, but I've been bringing it up at every opportunity because no other ai system has as much of a positive impact on my daily workflow. I've found so many other awesome tools with it.

huh, never heard of it.

I haven't yet tried metaphor as you recommended. I should. I have started using chatGPT as a pre-search tool. I use it to figure out the right technical terms to put in my Google search. Or the right function to read the manual page of.

Reading the article, and coloring it with my own experiences of Google failing me, it looks like Google is best when you want to understand a specific, named (possibly niche) object, but ChatGPT potentially has an advantage when you want to understand two things with a specific relationship, or when you can't name the specific object you want and can only describe it in relation to other objects.

An example from my own life. A few years ago I wanted to know if it was possible to put a LaTeX table in a footnote. I still don't know if it is possible, because Google buried me in websites telling me how to attach a footnote to a table - a completely different formatting problem that shares the same two keywords. I suspect ChatGPT would be good for this.

I can't tell you whether this is right or not but this is what ChatGPT thinks:

"Is it possible to put a latex table in a footnote?"

"Yes, it is possible to include a LaTeX table in a footnote. You can do this by using the footnote or footnotetext commands provided by the footmisc package.

Here is an example of how to use these commands to include a table in a footnote:

\begin{tabular}{c c c}
A & B & C \\
1 & 2 & 3 \\
4 & 5 & 6
\caption{Table caption}

\begin{tabular}{c c c}
D & E & F \\
7 & 8 & 9 \\
10 & 11 & 12
\caption{Table caption in footnote}

Note that when including a table in a footnote, you should use the table environment as you would normally, but place it inside the footnote or footnotetext command. You should also make sure to use the [h] option for the table environment to prevent the table from floating.

I hope this helps! Let me know if you have any other questions."

Oh, thanks for trying it. It's a good effort in a way: it is definitely trying to put the table in the footnote, so it has the right target. Unfortunately, from my test it doesn't seem to actually work - no error message, but the table doesn't render. Still, it is so much less frustrating to see it fail at the right goal than to get a lot of information about a distinct goal that happens to share the same keywords.
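
For anyone hitting the same problem, here is a minimal sketch of one approach that generally does render: drop the floating table environment (floats generally cannot be used inside a footnote) and put a bare tabular directly inside \footnote. The article class and the table contents are assumptions for illustration, and this is not the code ChatGPT produced above.

```latex
\documentclass{article}
\begin{document}

% A bare tabular is just a box, so it can sit inside \footnote.
% No table float and no \caption are used here.
Some text with a table in its footnote.\footnote{%
  A small table inside the footnote:\par
  \begin{tabular}{c c c}
    D  & E  & F  \\
    7  & 8  & 9  \\
    10 & 11 & 12
  \end{tabular}}

\end{document}
```

The trade-off is that the table gets no automatic caption or number, since those come from the float environment that has been left out.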

Google's production search is expensive to change, but I'm sure you're right that it is missing some obvious improvements in 'understanding' a la ChatGPT.

One valid excuse for low quality results is that Google's method is actively gamed (for obvious $ reasons) by people who probably have insider info.

IMO a fair comparison would require ChatGPT to do a better job presenting a list of URLs.

One major factor is that Google serves more than a hundred thousand searches per second. A reasonably detailed response by ChatGPT (comparable to reading a top search result) is on the order of a thousand inferences, and so scaling to a Google replacement would require on the order of a hundred million inferences per second. Given the size of the model, I'm not sure that even Google would have enough compute to support that rate.
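
As a rough check on that estimate, using only the round figures quoted above (both are order-of-magnitude guesses, not measurements), and assuming amsmath for \text:

```latex
\[
  \underbrace{10^{5}\ \text{searches/s}}_{\text{Google query rate}}
  \times
  \underbrace{10^{3}\ \text{inferences/response}}_{\text{one ChatGPT answer}}
  = 10^{8}\ \text{inferences/s}
\]
```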

There is also the hallucination problem, as well as the fact that ChatGPT can't actually link to anything on the Internet; it can only tell you something about it.

Well, these are all either non-problems or solved/solvable ones.

  • you don't need to serve pages, only answer queries at a reasonable rate, which the bot seems to be doing pretty well already, with minimal scaling implemented so far.
  • you can apply the usual caching for identical queries.
  • you can probably have a light-weight version that bunches similar queries (the metric of similarity is, of course, subject to investigation).
  • Google hits also give you iffy answers, a lot of them. They are not hallucinations, but not really better in terms of accuracy, usefulness or fake authoritativeness.
  • ChatGPT is intentionally gimped in terms of up-to-date internet access, easily fixed.

One thing that would complicate the usual scaling approaches is that each instance apparently has a session, and the results of subsequent queries are session-dependent. Then again, the likes of Google Assistant already deal with the same issue.

Sam Altman talked about how the average ChatGPT chat costs a few cents in compute. That sounds like orders of magnitude more than what a Google query costs in compute.

One Quora answer suggests Google earns 0.6 cents per search on average. The answer is old and Google might now make more revenue per average search, but there's a good chance that a ChatGPT chat currently costs more in compute than a Google search makes in revenue.
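
To make that comparison concrete, here is the ratio under an assumed range of 2 to 5 cents per chat (one possible reading of "a few cents", not a quoted figure) against the 0.6 cents per search from the Quora answer. Note that it compares ChatGPT's compute cost with Google's revenue, not with Google's compute cost. It assumes amsmath for \text.

```latex
\[
  \frac{c_{\text{chat}}}{r_{\text{search}}}
  \approx \frac{2\ \text{to}\ 5\ \text{cents}}{0.6\ \text{cents}}
  \approx 3.3\ \text{to}\ 8.3
\]
```

Here c_chat and r_search are labels introduced only for this sketch. Under these assumptions the gap is less than one order of magnitude.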

Hmm, Google has been in business for over a quarter century, and these guys have their first viable product, and the cost difference is only one order of magnitude, and without any scaling optimization? I'd say that Google better get its act together if they don't want to lose their search business. 

...Which is no longer "search", this is an outdated term. The paradigm has shifted. Almost-natural language virtual assistants were the first signs, and now LLMs are taking over.

these guys have their first viable product

Their API was already a viable product before this. GitHub Copilot is built on OpenAI Codex and is a viable product.

and the cost difference is only one order of magnitude

The difference between ChatGPT's compute cost and Google's advertising revenue per search is within one order of magnitude. Google currently does not pay as much for compute as it makes in revenue per search, but likely an order of magnitude less.