Jay Bailey



Language models seem to be much better than humans at next-token prediction

I think the reason we appear to disagree here is that we're both using different measurements of "outperform".

My understanding is that Jacob!outperform means to win in a contest where all other variables are the same - thus, you can't say that LMs outperform humans when they don't need to apply the visual and motor skills humans have to use. The interfaces aren't the same, so the contest isn't fair. If I score higher than you in a tenpin bowling match where I have the safety rails up and you don't, we can't say I've outperformed you at tenpin bowling.

Jay!outperform means to do better on a metric (such as "How often can you select the next word?") where each side uses an interface suited to it, and where that interface still correlates with the ability to perform the task on a wide range of inputs. That is to say - it's fine for the computer to cheat, so long as that cheating doesn't prevent it from completing the task out of distribution, or the distribution is wide enough to handle anything a user is likely to want from the program in a commercial setting. If the AI was only trained on a small corpus and learned to simply memorise the entire corpus, that wouldn't count as outperforming, because the AI would fall apart if we tried to use it on any other text. But since the task we want to check is text prediction, not visual input or robotics, it doesn't matter that the AI doesn't have to see the words.

Both these definitions have their place. Saying "AI can Jacob!outperform humans at this task" would mean the AI was closer to AGI than if it could merely Jay!outperform humans at that task. I can also see how Jacob!outperforming would be possible for a truly general intelligence. However, saying "AI can Jay!outperform humans at this task" is sufficient for AI to begin replacing humans at that task if the task is valuable. (I agree with you that next-token prediction is not itself commercially valuable, whereas something like proofreading would be.)

I think I may have also misled you when I said "in the real world", and I apologise for that. What I meant by "in the real world" was something like "In practice, the AI can be used reliably in a way that does better than humans". Again, the calculator is a good metaphor here - we can reliably use calculators to do more accurate arithmetic than humans for actual problems that humans face every day. I understand how "in the real world" could be read as more like "embodied in a robotic form, interacting with the environment like humans do." Clearly a calculator can't do that and never will. I agree LMs cannot currently do that, and we have no indication that any ML system can do so at that level.

So, in summary, here is what I am saying:

  • If an AI can score higher than a human, using any sort of interface that still allows it to be reliably deployed on the range of data humans want the task performed on, it can be used for that purpose. This can happen whether or not the AI is able to use human senses. If AI can be reliably used to produce outputs that are in some way better (faster, more accurate, etc.) than humans', it's not important that the contest is fair - the AI will begin replacing humans at this task anyway.

That said, I think I understand your point better now. A system that could walk across the room in a physical body, turn on a computer, log on to redwoodresearch.com/next_word_prediction_url, view words on a screen, and click a mouse to select which word it predicts will appear next is FAR more threatening than a system that takes in words directly as input and returns a word as output. That would be an indication that AIs were on the brink of outperforming humans, not just at the task of predicting tokens, but at a very wide range of tasks. I agree this is not happening yet, and I agree that the distinction matters between this paragraph and my claim above.

I haven't answered your claim about the subconscious abilities of humans to predict text better than this game would indicate because I'm really not sure about whether that's true or not - not in a "I've seen the evidence and it could go either way" kind of way, but in a "I've never even thought about it" kind of way. So I've avoided engaging with that part of the argument - I don't think it's load-bearing for the parts I've been discussing in this post, but please let me know if I'm wrong.

Language models seem to be much better than humans at next-token prediction

What it feels like to me here is that we're both arguing different sides, and yet if you asked both of us about any empirical fact we expect to see using current or near-future technology, such as "Would a human achieve a better or worse score on a next-token prediction task under X conditions", we would both agree with each other.

Thus, it feels to me, upon reflection, that our argument isn't particularly important unless we can identify an actual prediction that matters on which we would differ. What would it imply to say that humans are better at next-token prediction than this test indicates, compared to the world where that's not the case and the human subconscious is no better at word prediction than our conscious mind has access to?

Language models seem to be much better than humans at next-token prediction

I think I should have emphasised the word "against" in the sentence of mine you quoted:

I feel like the implied conclusion you're arguing against here is something like "LMs are more efficient per sample than the human brain at predicting language",

You replied with: "What? Not only is that not what I'm arguing, the opposite is true!" - which was precisely what I was saying. The conclusion you're arguing against is that LMs are more sample-efficient than humans. Arguing against that requires you to take the opposite stance - that humans are more sample-efficient than LMs. This is exactly the stance I believed you were taking. I then went on to say that the text did not rely on this assumption, and therefore your argument, while correct, did not affect the post's conclusions.

I agree with you that humans are much more sample-efficient than LMs. I have no compute comparison between human brains and ML models, so I'll take your word on compute efficiency. And I agree that the task the human is doing is more complicated. Humans would dominate modern ML systems if you limited those systems to the data, compute, and sensory inputs that humans get.

I think our major crux is that I don't see this as particularly important. It's not a fair head-to-head comparison, but the real world is never going to offer one. What I personally care about is what these machines are capable of doing, not how efficient they are when doing it or which sensory requirements they get to bypass. If a machine can do a task better than a human can, it doesn't matter whether the comparison is fair, provided we can replicate and/or scale the machine to perform the task in the real world. Efficiency matters, since it determines cost and speed, but then the relevant question is "Is this sufficiently fast / cost-effective to use in the real world?", not "How does its efficiency compare to the human brain's?".

Put it this way: calculators don't have to use neurons to perform mathematics, and you can put numbers into them directly instead of them having to hear or read them. So it's not really a valid comparison to say calculators are directly superhuman at arithmetic. And yet, that doesn't stop us at all from using calculators to perform such calculations much faster and more accurately than any human, because we don't actually need to restrict computers to human senses. So, why does it matter that a calculator would lose to a human in a "valid" arithmetic contest? How is that more important than what the calculator can actually do under normal calculator-use conditions?

Language models seem to be much better than humans at next-token prediction

I don't think there is an implied conclusion here - it's meant to be taken at face value. All we care about in this analysis is how well humans and machines perform the task. This comparison doesn't require a fair fight. LMs have thousands of times more samples than any human will ever read - so what? As long as we can reliably train LMs on those samples, that is the LMs' performance. Similarly, it doesn't matter how good the human subconscious is at next-word prediction if we can't access it.

I feel like the implied conclusion you're arguing against here is something like "LMs are more efficient per sample than the human brain at predicting language", but I don't think this conclusion is implied by the text at all. I think the conclusion is exactly as stated - in the real world, LMs outperform humans. It doesn't matter if they do so by "cheating" with huge, humanly-impossible datasets, or by having full access to their knowledge in a way a human doesn't, because those are part of the task constraints in the real world.

Jay Bailey's Shortform

Speedrunners have a tendency to break video games completely in half, sometimes in the strangest and most bizarre ways possible. I feel like some of the more convoluted video game speedrun / challenge-run glitches out there are actually a good way to build intuition about what high optimisation pressure (like that imposed by a relatively weak AGI) might look like, even at regular human or slightly superhuman levels. ("Slightly superhuman" here meaning a group of smart people achieving what no single human could.)

Two that I recommend:

https://www.youtube.com/watch?v=kpk2tdsPh0A - A tool-assisted run, where the inputs are programmed frame by frame by a human and executed by a computer. It exploits idiosyncrasies in Super Mario 64's code that no human could ever use unassisted in order to reduce the number of times the A button needs to be pressed in a run. I wouldn't be surprised if this guy knows more about SM64's code than the devs do at this point.

https://www.youtube.com/watch?v=THtbjPQFVZI - A glitch using outside-the-game hardware considerations to improve consistency on yet another crazy in-game glitch. Also showcases just how large the attack space is.

These videos are also just incredibly entertaining in their own right, and not ridiculously long, so I hypothesise that they're a great resource to send to more skeptical people if they understand the idea of AGI but are systematically underestimating the difference between "bug-free" (the program will not have bugs during normal operation) and "secure" (the program will not have bugs even when deliberately pushed towards narrow states designed to create them).

For a more serious overview, you could probably find obscure hardware glitches and such to teach the same lesson.

chinchilla's wild implications

I am curious about this "irreducible" term in the loss. Apologies if this is covered by the familiarity with LM scaling laws mentioned as a prerequisite for this article.

When you say "irreducible", does that mean "irreducible under current techniques" or "mathematically irreducible", or something else?

Do we have any idea what a model with, say, 1.7 loss (i.e., a model with almost arbitrarily large compute and data, but with the same 1.69 irreducible term) would look like?
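For reference, the "irreducible" term comes from the parametric loss fit in the Chinchilla paper (Hoffmann et al., 2022). The sketch below uses the published point estimates for the fitted constants (quoted from memory, so treat them as illustrative); the function name is my own. It shows why 1.69 is the asymptote: as parameters and tokens grow, the two reducible terms vanish and the predicted loss approaches E from above.

```python
# Chinchilla-style parametric loss fit:
#     L(N, D) = E + A / N**alpha + B / D**beta
# E is the "irreducible" term: the limit of L as both N (parameters)
# and D (training tokens) go to infinity.
E, A, B = 1.69, 406.4, 410.7     # published point estimates (illustrative)
alpha, beta = 0.34, 0.28

def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    """Predicted training loss for a model with n_params parameters
    trained on n_tokens tokens, under the parametric fit above."""
    return E + A / n_params**alpha + B / n_tokens**beta

# The reducible terms shrink as we scale N and D together,
# so the loss approaches E = 1.69 but never drops below it.
for scale in (1e9, 1e12, 1e15):
    print(f"N = D = {scale:.0e}: loss = {chinchilla_loss(scale, scale):.4f}")
```

Under this fit, "1.7 loss" means a model whose reducible terms have been driven down to about 0.01 - a regime far past any model trained to date, which is part of why the question is hard to answer empirically.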

Limerence Messes Up Your Rationality Real Bad, Yo

I am not TurnTrout, but it seems to me that, since producing and raising children is the default evolutionary goal, there's no special override that makes us snap out of our normal abstract-reason-mode and start caring about it. A better way to think about it is the opposite - reason and abstract thinking and building rocket ships IS the override, and it's one we've put on ourselves. Occasionally our brains are put in situations where they remind us more strongly: "Hey! You know that thing that's supposed* to be your purpose in life? GO DO THAT."

*This is the abstract voice of evolution speaking, not me. I don't happen to be aligned with it.

D&D.Sci June 2022 Evaluation and Ruleset

I think the puzzle was good, but it might have been better if the scenario had more explicitly included "Given the next X heroes, maximise their chance of survival" the way aphyer did. As it is, I was expecting that the perfect solution would have aphyer levels of analysis, which is why I said that I expected my solution was a baseline and others would improve on it, even though I had the right answer.

You did allude to this by asking "What will you tell the Goddess when she returns?" but the overall scenario as presented was "Find a way for you personally to survive" and that's the problem I answered. Considering how much richness was present in the rest of the dataset, I think the puzzle should have explicitly said "Goal 1 is to survive personally. Goal 2, which is harder, is to maximise the survival of everyone who comes after you." This would have made it an excellent puzzle - aphyer did AMAZING work, and I think them being able to solve the puzzle does not speak poorly of it at all.

AGI Safety FAQ / all-dumb-questions-allowed thread

Glad to help! And hey, clarifying our ideas is half of what discussion is for!

I'd love to see a top-level post on ideas for making this happen, since I think you're right, even though safety in current AI systems is very different from the problems we would face with AGI-level systems.

AGI Safety FAQ / all-dumb-questions-allowed thread

Ah, I see. I thought we were having a sticking point on definitions, but it seems that the definition is part of the point.

So, if I have this right, what you're saying is:

Currently, the AI community defines capability and safety as two different things. This is very bad. Firstly, because it's wrong - an unsafe system cannot reasonably be thought of as being capable of achieving anything more complex than predicting cat pictures. Secondly, because it leads to bad outcomes when this paradigm is adopted by AI researchers. Who doesn't want to make a more capable system? Who wants to slow that down for "safety"? That shit's boring! What would be better is if the AI community considered safety to be a core metric of capability, just as important as "Is this AI powerful enough to perform the task we want?".
