Software engineer from Ireland who's interested in EA and AI safety research.
I used the figure from a document named What's In My AI?, which estimates that the GPT-2 training dataset contains 15B tokens.
A quick way to estimate the total number of training tokens is to multiply the size of the training dataset in bytes by the number of tokens per byte, which is typically about 0.25 according to the Pile paper. So 40 GB × 0.25 tokens per byte ≈ 10 billion tokens.
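The estimate above can be sketched in a few lines of Python (the 0.25 tokens-per-byte ratio is the rough figure from the Pile paper, and 40 GB is the approximate size of GPT-2's WebText dataset):

```python
# Rough token-count estimate: dataset size in bytes times tokens per byte.
TOKENS_PER_BYTE = 0.25  # rough ratio reported in the Pile paper

def estimate_tokens(dataset_bytes: float) -> float:
    """Estimate the number of training tokens from the dataset size in bytes."""
    return dataset_bytes * TOKENS_PER_BYTE

# GPT-2's WebText dataset is roughly 40 GB (~40e9 bytes).
print(estimate_tokens(40e9))  # 10e9, i.e. ~10 billion tokens
```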
For context, I have a very similar background to you - I'm a software engineer with a computer science degree interested in working on AI alignment.
LTFF granted about $10 million last year. Even if all of that money were spent on independent AI alignment researchers, at a cost of $100k per researcher per year there would only be enough money to fund about 100 researchers worldwide, so I don't see LTFF as a scalable solution.
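As a back-of-the-envelope check (both figures are the assumptions from my comment, not LTFF's actual budget breakdown):

```python
# How many independent researchers could ~$10M/year fund?
annual_grants = 10e6          # ~$10M granted per year (assumed)
cost_per_researcher = 100e3   # ~$100k per researcher per year (assumed)

max_researchers = annual_grants / cost_per_researcher
print(int(max_researchers))  # 100
```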
Unlike software engineering, AI alignment research tends to be neglected and underfunded because it's not an activity that can easily be made profitable. That's one reason why there are far more software engineers than AI alignment researchers.
Work that is unprofitable but beneficial such as basic science research has traditionally been done by university researchers who, to the best of my knowledge, are mainly funded by government grants.
In the past, I have also considered becoming independently wealthy in order to work on AI alignment, but that strategy seems too slow if AGI will be created relatively soon.
So my plan is to apply for jobs at organizations like Redwood Research or apply for funding from LTFF. If those plans fail, I will consider getting a PhD and seeking funding from the government instead, which seems more scalable.
One more reason why iterative design could fail is if we build AI systems with low corrigibility. If we build a misaligned AI with low corrigibility that isn't doing what we want, we might have difficulty shutting it down or changing its goal. I think that's one of the reasons why Yudkowsky believes we have to get alignment right on the first try.
Maybe Sam knows a lot I don't know but here are some reasons why I'm skeptical about the end of scaling large language models:
Because scaling laws are power laws (loss falls off as a power of compute, so the curves look like straight lines on log-log axes), there are diminishing returns to resources like more compute. But I doubt we've reached the point where the marginal cost of training larger models exceeds the marginal benefit. Think of a company like Google: building the biggest and best model is immensely valuable in a global, winner-takes-all market like search.
My estimate is about 400 billion parameters (100 billion - 1 trillion), based on EpochAI's estimate of GPT-4's training compute and on scaling laws, which can be used to calculate the optimal number of parameters and training tokens for a language model given a certain compute budget.
Although 1 trillion sounds impressive and bigger models tend to achieve a lower loss given a fixed amount of data, an increased number of parameters is not necessarily more desirable because a bigger model uses more compute and therefore can't be trained on as much data.
If the model is made too big, the decrease in training tokens actually exceeds the benefit of the larger model leading to worse performance.
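Here is a sketch of how I get to roughly 400 billion parameters. It assumes the standard approximation that training compute C ≈ 6ND (N = parameters, D = tokens), the Chinchilla rule of thumb of roughly 20 tokens per parameter, and a GPT-4 training compute of ~2e25 FLOP in the spirit of EpochAI's estimate (all three numbers are assumptions, not OpenAI's actual method):

```python
import math

# Chinchilla-style compute-optimal model sizing (a rough sketch).
# Assumes C ≈ 6 * N * D and the compute-optimal ratio D ≈ 20 * N,
# so C ≈ 120 * N^2 and N ≈ sqrt(C / 120).
def optimal_params(compute_flop: float, tokens_per_param: float = 20.0) -> float:
    """Compute-optimal parameter count for a given training compute budget."""
    return math.sqrt(compute_flop / (6 * tokens_per_param))

# Assumed GPT-4 training compute, ~2e25 FLOP (in line with EpochAI's estimate).
n = optimal_params(2e25)
print(f"{n:.2e} parameters")  # ~4e11, i.e. roughly 400 billion
```

Plugging in a compute budget an order of magnitude higher or lower moves the answer by only about a factor of 3, since N scales with the square root of C.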
Extract from the Training Compute-Optimal Large Language Models paper:
"our analysis clearly suggests that given the training compute budget for many current LLMs, smaller models should have been trained on more tokens to achieve the most performant model."
Another quote from the paper:
"Unless one has a compute budget of FLOPs (over 250× the compute used to train Gopher), a 1 trillion parameter model is unlikely to be the optimal model to train."
So unless the EpochAI estimate is too low by about an order of magnitude [1] or OpenAI has discovered new and better scaling laws, the number of parameters in GPT-4 is probably lower than 1 trillion.
My Twitter thread estimating the number of parameters in GPT-4.
I don't think it is but it could be.
I generally agree with the points made in this post.
Slowing down AI progress seems rational conditional on there being a significant probability that AGI will cause extinction.
Generally, technologies are accepted only when their expected benefit significantly outweighs their expected harms. Consider flying as an example. Let’s say the benefit of each flight is +10 and the harm of getting killed is -1000. If x is the probability of surviving, then the net utility equation is U = 10x − 1000(1 − x).
Solving for x, the utility is 0 when x = 1000/1010 ≈ 0.99. In other words, the flight would only be worth it if there were at least a 99% chance of survival, which makes intuitive sense.
If we use the same utility function for AI and assume, as Eliezer seems to believe, that creating AGI has a 50% chance of causing human extinction, then the outcome would be strongly net negative for humanity. One should agree with this sentiment unless one's P(extinction) is less than 1%.
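The breakeven point in the flight analogy can be checked numerically (the +10 and -1000 payoffs are the illustrative numbers from above, not calibrated values):

```python
# Breakeven survival probability for the flight analogy.
benefit = 10    # utility of a successful flight (illustrative)
harm = -1000    # utility of getting killed (illustrative)

def net_utility(x: float) -> float:
    """Expected utility given survival probability x."""
    return benefit * x + harm * (1 - x)

# Solve 10x - 1000(1 - x) = 0 for x.
breakeven = -harm / (benefit - harm)
print(round(breakeven, 4))  # 0.9901
```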
Eliezer is saying that we can in principle make AI safe but argues that it could take decades to advance AI safety to the point where we can be sufficiently confident that creating an AGI would have net positive utility.
If slowing down AI progress is the best course of action, then achieving a good outcome for AGI seems more like an AI governance problem than a technical AI safety research problem.
"Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die."
I think Evan Hubinger has said before that if this were the case, GPT-4 would be less aligned than GPT-3, but the opposite is true in reality (GPT-4 is more aligned, according to OpenAI). Still, I think we ideally want a scalable AI alignment solution long before the level of capabilities is reached where it’s needed. A similar idea is how Claude Shannon conceived of a minimax chess algorithm decades before we had the compute to implement it.
Eliezer has been sounding the alarm for some time and it’s easy to get alarm fatigue and become complacent. But the fact that a leading member of the AI safety research community has a message as extreme as this is alarming.
In this case, the percent error is 8.1% and the absolute error is 8%. If one student gets 91% on a test and another gets 99% they both get an A so the difference doesn't seem large to me.
The article linked seems to be missing. Can you explain your point in more detail?
I like LessWrong and I visit the site quite often. I would be willing to pay a monthly subscription for the site especially if the subscription included extra features.
Maybe LessWrong could raise money in a similar way to Twitter: by offering a paid premium version.
A Fermi estimate of how much revenue LessWrong could generate: