
What's the most impressive thing you can think of that you believe GPT-4 has around a 5% chance of being capable of doing? (i.e. your credence that GPT-4 will have the capability to do this thing is around 5%)

For this question, "impressive" should be interpreted to mean something different from "surprising". What I have in mind is "impressive" in the sense of "economically useful", "comparable to or better than human experts" or "jaw-droppingly creative", etc. For example, GPT-4 being able to reverse large text would be surprising but not impressive.

The reason I'm specifying a belief probability of 5% is that if your probability is higher than that, you can try to make the task/thing more impressive to reduce the probability to 5%. If it's less than 5%, well... things can get a bit crazy so maybe try to make the task less impressive.

But if you find this constraint too restrictive, feel free to specify your own combination of the most impressive thing and your probability that GPT-4 will be able to do it, as long as the probability is in the vicinity of 5% (something like 1 to 10% would be fine). You can also specify a probability range (e.g. 5-10%) if it's difficult to estimate.


Scare Eliezer into cutting his expected time to disaster by at least 75%.

But wouldn't that be easy? He seems to take every little advancement as a big deal.

How many times do you think he has changed his expected time to disaster to 25% of what it was?

If they chose to design it with effective long-term memory and a focus on novels (especially prompting via summary), maybe it could write some? They wouldn't be human level, but people might be interested enough in novels written on a whim to match some exact scenario that it could be valuable. It would also be good evidence of advancement, since losing track of things is a huge current weakness.

GPT-4 (Edited because I actually realize I put way more than 5% weight on the original phrasing): SOTA on language translation for every language (not just English/French and whatever else GPT-3 has), without fine-tuning.

Not GPT-4 specifically, assuming they keep the focus on next-token prediction of all human text, but "around the time of GPT-4": Superhuman theorem proving. I expect one of the millennium problems to be solved by an AI sometime in the next 5 years.

AI solving a millennium problem within a decade would be truly shocking, IMO. That’s the kind of thing I wouldn’t expect to see before AGI is the world superpower. My best guess, coming from a mathematics background, is that dominating humanity is an easier problem for an AI.

That’s what people used to say about chess and go. Yes, mathematics requires intuition, but so does chess; the game tree’s too big to be explored fully.

Mathematics requires greater intuition and has a much broader and deeper “game” tree, but once we figure out the analogue to self-play, I think it will quickly surpass human mathematicians.

Sure. I’m not saying it won’t happen, just that an AI will already be transformative before it does happen.

I agree that before that point, an AI will be transformative, but not to the point of “AGI is the world superpower”.

Getting grandmaster rating on Codeforces.

Update after 4 months: I have changed my opinion. I am now 95% sure no model will be able to achieve this in 2023, and it seems quite unlikely in 2024 too.

I think there's a 50% or higher chance that GPT-4 will be sufficiently accurate that it can be used to teach new skills to autodidacts with basic prompt engineering skills. (I tried this on GPT-3 today, and its attempts at teaching mixed correct and incorrect insights.)

So, to move down to 5%: GPT-4 is able to ask questions about your current understanding of most undergrad or lower level topics, correct specific misapprehensions you have, and select resources/exercises based on your current level of ability. In other words, GPT-4 can act as a middling-quality professional tutor for most topics.

Honestly, even this seems not impressive enough for 5%. I might think of it as more like 10-20%. Perhaps 5% would be "In addition to this, GPT-4 can provide nontrivial brainstorming advice on graduate-level research questions". Not enough to solve them on its own, but enough to point you in fruitful directions and improve your workflow.

2%: Solve the same problem as the product Wolfram Alpha, with the same style of inputs and outputs.

Now that I think about it, Wolfram Alpha might be sitting on a fairly valuable hunk of diverse math problem data. They get around 10 million visits a month, or about a billion diverse math problems. That's larger than some chunks of the Pile.
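As a rough check on that estimate, here is a minimal back-of-envelope sketch. It takes the two figures quoted above (10 million visits a month, about a billion problems) and asks how long it would take to accumulate that many problems. The `queries_per_visit` parameter is a hypothetical assumption for illustration, not a Wolfram Alpha statistic.

```python
# Back-of-envelope check. The visit rate and target come from the comment above;
# queries_per_visit is an assumption, not measured data.
VISITS_PER_MONTH = 10_000_000     # "around 10 million visits a month"
TARGET_PROBLEMS = 1_000_000_000   # "about a billion diverse math problems"


def months_to_reach(target, visits_per_month, queries_per_visit=1.0):
    """Months needed to accumulate `target` problems at a constant visit rate."""
    return target / (visits_per_month * queries_per_visit)


months = months_to_reach(TARGET_PROBLEMS, VISITS_PER_MONTH)
print(f"{months:.0f} months (~{months / 12:.1f} years) at one query per visit")
```

At one problem per visit, a billion problems corresponds to roughly eight years of traffic, so the figure reads best as a cumulative total rather than a monthly one. More queries per visit would shorten that proportionally.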

It would be useful if it could solve alignment...

Why are people disagreeing with this statement?

In isolation, it's technically correct.

In the context of being a direct reply to the post, it's suggesting that "solve alignment" is something that GPT-4 could plausibly do. I certainly disagree with that and voted disagreement accordingly.

It actually wouldn't surprise me if it could be done by a human alignment theorist working with an existing GPT, where the GPT serves mostly as a source of ideas.
