What are your priors, or what is your base rate calculation, for how often promising new ML architectures end up scaling and generalizing well to new tasks?
I think something you aren't mentioning is that at least part of the reason the definition of AGI has gotten so decoupled from its original intended meaning is that the current AI systems we have are unexpectedly spiky.
We have known for a while that it is possible to create narrow ASI or AGI; chess engines did it in 1997. We thought that a system which could do the broad suite of tasks GPT is currently capable of would necessarily be able to do the other things on a computer that humans can do. This didn't really happen. GPT is already superhuman in some ways, and maybe superhuman at ~50% of economically viable tasks that are done via computer, but it still makes mistakes at other very basic things.
It's weird that GPT can name and analyze differential equations better than most people with a math degree, but be unable to correctly cite a reference. We didn't expect that.
Another difficult thing about defining AGI is that we actually expect better than "median human level" performance, but not necessarily in an unfair way. Most people around the globe don't know the rules of chess, but we would expect AGI to be able to play at roughly the 1000 Elo level. Let's define AGI as being able to perform every computer task at the level of the median human who has been given one month of training. We haven't hit that milestone yet. But we may well blow past superhuman performance on a few other capabilities before we get to that milestone.
I'm not sure the idea with the dashed lines, of being unable to transition "directly," is coherent. A more plausible structure seems to me to be a transitive relation for the solid arrows: if A->B and B->C, then there exists an A->C.
Again, what does it mean to be unable to transition "directly"? You've explicitly said we're ignoring path dependencies and time, so if an agent can go from A to B, and then from B to C, I claim there should be a solid arrow from A to C.
Of course, in real life, sometimes you have to sacrifice in the short term to reach a more preferred long-term state. But by the framework we set up, this needs to be "brought into the diagram" (to use your phrasing).
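To make the claim concrete, here is a toy illustration (my own, not anything from your diagram) of the transitive closure I think the solid arrows should satisfy; the state names are placeholders:

```python
# Toy illustration: if a solid arrow means "can transition to," and we ignore
# path dependence and time, the relation should be transitively closed.
def transitive_closure(arrows):
    """Warshall-style closure over a set of (source, target) arrows."""
    closed = set(arrows)
    nodes = {x for pair in arrows for x in pair}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if (i, k) in closed and (k, j) in closed:
                    closed.add((i, j))
    return closed

solid = {("A", "B"), ("B", "C")}
print(transitive_closure(solid))
# Contains ('A', 'B'), ('B', 'C'), and ('A', 'C') -- the direct A->C arrow appears, as claimed.
```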
"Reasoning models are better than basically all human mathematicians"
What do you mean by this? Although I concede that 95%+ of all humans are not very good at math, for the people I would call human mathematicians, I would say that reasoning models are better than basically 0% of them. (And I am aware of the FrontierMath benchmarks.)
In general, I think the idea is that you first get a superhuman coder, then you get a superhuman AI researcher, then you get an any-task superhuman researcher, and then you use this superhuman researcher to solve all of the problems we have been discussing in lightning-fast time.
A simple idea here would be to just pay machinists and other blue-collar workers to wear something like a GoPro (and possibly other sensors along the body that track movement) all throughout the day. You then use the same architectures that currently predict video to predict what the blue-collar worker does during their day.
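As a very rough sketch of what I mean (the architecture, shapes, and random stand-in data below are entirely my own assumptions, not a worked-out proposal), the objective would just be ordinary next-frame prediction on the recorded footage:

```python
# Minimal next-frame-prediction sketch, assuming PyTorch. Random tensors stand
# in for GoPro frames; a real setup would use an actual video model and dataset.
import torch
import torch.nn as nn

class TinyFramePredictor(nn.Module):
    def __init__(self, channels=3):
        super().__init__()
        # Predict frame t+1 from frame t with a small conv net (a stand-in for
        # whatever architecture currently works well for video prediction).
        self.net = nn.Sequential(
            nn.Conv2d(channels, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, channels, 3, padding=1),
        )

    def forward(self, frame):
        return self.net(frame)

model = TinyFramePredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(3):                          # stand-in training loop
    clip = torch.rand(8, 4, 3, 64, 64)         # batch of 4-frame "GoPro" clips
    inputs, targets = clip[:, :-1], clip[:, 1:]
    preds = model(inputs.reshape(-1, 3, 64, 64))
    loss = nn.functional.mse_loss(preds, targets.reshape(-1, 3, 64, 64))
    opt.zero_grad(); loss.backward(); opt.step()
    print(step, loss.item())
```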
Thank you very much, this is so helpful! I want to know if I am understanding things correctly again, so please correct me if I am wrong on any of the following:
By "used for inference," this just means basically letting people use the model? Like when I go to the chatgpt website, I am using the datacenter campus computers that were previously used for training? (Again, please forgive my noobie questions.)
For 2025, Abilene is building a 100,000-chip campus. This is plausibly around the same number of chips that were used to train the ~3e26 FLOPs GPT-4.5 at the Goodyear campus. However, the Goodyear campus was using H100 chips, while Abilene will be using Blackwell NVL72 chips. These improved chips mean that for the same number of chips we can now train a 1e27 FLOPs model instead of just a 3e26 FLOPs one. The chips can be built by summer 2025, and a new model trained by around end of year 2025.
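To check my own arithmetic here (the per-chip throughput, utilization, and training-duration numbers below are rough assumptions on my part, not figures from you): training compute is roughly chips × FLOP/s per chip × utilization × seconds of training.

```python
# Back-of-the-envelope training compute, with assumed (not sourced) inputs.
def training_flops(chips, flops_per_chip, utilization, days):
    return chips * flops_per_chip * utilization * days * 86_400  # 86,400 s/day

# ~100k H100-class chips at ~1e15 FLOP/s each, ~40% utilization, ~100 days:
print(f"{training_flops(100_000, 1e15, 0.4, 100):.1e}")    # ~3.5e26 FLOPs

# Same chip count, but Blackwell-class chips at roughly 2.5x the throughput:
print(f"{training_flops(100_000, 2.5e15, 0.4, 100):.1e}")  # ~8.6e26, i.e. ~1e27
```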
1.5 years after the Blackwell chips, the new Rubin chip will arrive. The time is now ~2027.5.
Now a few things need to happen:
More realistically, assuming (1) and (2) hold, it makes more sense to wait until the Rubin Ultra comes out before spending the $150bn.
Or some type of mixed buildout would occur: some of that $150bn in 2027.5 would go to Rubin non-Ultra chips to train a 2e28 FLOPs model, and the remainder would be used to build an even bigger model in 2028 that uses Rubin Ultra.
Do I have the high-level takeaways here correct? Forgive my use of the phrase "Training size," but I know very little about different chips, so I am trying to distill it down to simple numbers.
2024:
a) OpenAI revenue: $3.7 billion.
b) Training size: 3e26 to 1e27 FLOPs.
c) Training cost: $4-5 billion.
2025 Projections:
a) OpenAI revenue: $12 billion.
b) Training size: 5e27 FLOPs.
c) Training cost: $25-30 billion.
2026 Projections:
a) OpenAI revenue: ~$36 billion to $60 billion.
At this point I am confused: why are you saying that Rubin arriving after Blackwell would make the revenue more like $60 billion? Again, I know very little about chips. Wouldn't the arrival of a different chip also change OpenAI's costs?
b) Training size: 5e28 FLOPs.
c) Training cost: $150 billion.
Assuming investors are willing to take the same ratio of revenue to training cost as before, this would predict $70 billion to $150 billion. In other words, getting to the $150 billion mark requires that Rubin arrives after Blackwell, that OpenAI makes $60 billion in revenue, and that investors apply a 2.5x multiplier: $60 billion × 2.5 = $150 billion.
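For my own sanity, here is the multiplier arithmetic I am doing; pairing the low cost estimate with the low revenue estimate (and high with high) is my guess at how the $70-150 billion range comes about:

```python
# Investor multiplier = 2025 training cost / 2025 revenue, applied to 2026 revenue.
rev_2025 = 12                                       # $bn
for cost_2025, rev_2026 in ((25, 36), (30, 60)):    # (low, low) and (high, high) cases
    multiplier = cost_2025 / rev_2025
    print(f"{multiplier:.2f} x ${rev_2026}bn revenue = ${rev_2026 * multiplier:.0f}bn")
# ~2.08 x $36bn ≈ $75bn (roughly the $70bn end); 2.50 x $60bn = $150bn
```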
Is there anything else that I missed?
For what it is worth, the later parts of the book discuss the things you might be more interested in, like meditative/path models. The scientific research is quite interesting; in particular, I find the brain scans of monks to be incredible.
Why do language models hallucinate? https://openai.com/index/why-language-models-hallucinate/
There is a new paper from OpenAI. It is mostly basic stuff. I have a question though, and thought this would be a good place to ask: have any training runs experimented with having models output subjective probabilities for their answers? And have labs started to apply reasoning traces to the pretraining tasks as well?
One could make the models "wager," and then reward them in line with the wager. The way this is typically done is based on the Kelly criterion and uses logarithmic scoring. The logarithmic scoring rule awards you points based on the logarithm of the probability you assigned to the correct answer.
For example, for two possibilities A and B, if answer A turns out to be correct, your score is S_A = log(p); if answer B turns out to be correct, your score is S_B = log(1 − p). Usually a constant is added to make the scores positive, e.g. S = C + log(p) for some constant C. The key feature of this rule is that you maximize your expected score by reporting your true belief: if you truly believe there is an 80% chance that A is correct, your expected score is maximized by setting p = 0.8.
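A tiny numerical check of that property (nothing here beyond the formula above):

```python
import math

def log_score(p_reported, a_is_correct):
    """Log score for a binary question: log of the probability assigned to
    whichever answer turned out to be correct."""
    return math.log(p_reported if a_is_correct else 1 - p_reported)

def expected_score(p_true, p_reported):
    """Expected score when A is in fact correct with probability p_true."""
    return p_true * log_score(p_reported, True) + (1 - p_true) * log_score(p_reported, False)

# If your true belief is 0.8, reporting 0.8 beats shading it up or down.
for p_reported in (0.6, 0.7, 0.8, 0.9, 0.99):
    print(p_reported, round(expected_score(0.8, p_reported), 4))
```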
I recall seeing, somewhere on the internet a while back, a decision theory course where, for the exam itself, students were required to output their confidence in each answer and were awarded points accordingly.
What about doing the following: get rid of the distinction between post-training and pre-training, make predicting text an RL task, and allow reasoning. At the end of the reasoning chain, output a subjective probability and reward in accordance with the logarithmic scoring rule.
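A minimal sketch of the reward I have in mind (entirely my own toy formulation; the wager structure and the probability floor are assumptions):

```python
import math

def reward(wagered_probs, actual_next_token, floor=1e-6):
    """Reward for one prediction: the log probability the model's final wager
    assigned to the token that actually came next in the text, after an
    arbitrary reasoning chain. Floored so the reward stays finite."""
    p = max(wagered_probs.get(actual_next_token, 0.0), floor)
    return math.log(p)

# Toy episode: after reasoning, the model wagers a distribution over the next token.
wager = {"cat": 0.7, "dog": 0.2, "fish": 0.1}
print(reward(wager, "cat"))   # ~ -0.36: calibrated confidence pays off
print(reward(wager, "fish"))  # ~ -2.30: confident misses are punished
```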