tickybob — LessWrong

How many OOMs of compute span the human range?

Thanks for the feedback, exactly the kind of thing I was hoping to get from posting here.

I have thought about a multiplicative model for intelligence, but wouldn't the fact that we see pretty-close-to-Gaussian results on intelligence tests tend to disconfirm it? A product of many independent factors would tend toward a right-skewed, roughly log-normal distribution rather than a Gaussian one. Any residual non-Gaussianity seems like it can be explained by population stratification, etc., rather than by a fundamentally non-Gaussian underlying structure. And, as you say, the polygenic evidence seems to point to a linear additive model being essentially correct.
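To make the additive-vs-multiplicative point concrete, here is a toy simulation. It is only a sketch: the number of factors, the per-factor effect size, and the 50% factor frequency are made-up illustrative values, not estimates from any dataset. It compares the skewness of a trait built by summing many small independent effects against one built by multiplying them.

```python
import math
import random
import statistics


def skewness(xs):
    """Sample skewness (third standardized moment); about 0 for a Gaussian."""
    mu = statistics.fmean(xs)
    sd = statistics.pstdev(xs)
    return sum(((x - mu) / sd) ** 3 for x in xs) / len(xs)


def simulate(n_people=20000, n_factors=50, effect=0.2, seed=0):
    """Toy model: every parameter here is a hypothetical illustrative value."""
    rng = random.Random(seed)
    additive, multiplicative = [], []
    for _ in range(n_people):
        # Each factor is independently present with probability 0.5.
        present = sum(rng.random() < 0.5 for _ in range(n_factors))
        additive.append(present * effect)                # effects add up
        multiplicative.append((1 + effect) ** present)   # effects compound
    return additive, multiplicative


additive, multiplicative = simulate()
skew_add = skewness(additive)
skew_mul = skewness(multiplicative)
skew_log_mul = skewness([math.log(x) for x in multiplicative])

print(f"additive skew:            {skew_add:+.2f}")
print(f"multiplicative skew:      {skew_mul:+.2f}")
print(f"log(multiplicative) skew: {skew_log_mul:+.2f}")
```

Under these assumptions the additive trait comes out close to symmetric, the multiplicative trait is clearly right-skewed, and taking its log restores symmetry. One caveat: this only bears on raw scores, since a test that is normed to a Gaussian would look Gaussian under either model.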

How many OOMs of compute span the human range?

tickybob3mo10

Thanks, good suggestion. I've updated the title.

chinchilla's wild implications

tickybob3y30

There is an old (2013) paper from Google here that mentions training an n-gram model on 1.3T tokens: "Our second-level distributed language model uses word 4-grams. The English model is trained on a 1.3 × 10^12 token training set." An even earlier 2006 blog post here also references a 1T-word corpus.

That 1.3T-token corpus is more than 2x as big as MassiveWeb, more than a decade old, and not necessarily the whole web even back then. So I would be quite surprised if the MassiveWeb 506B token number represents a limit of what's available on the web. My guess would be that there's at least an order of magnitude more tokens available in a full web scrape, though a lot depends on how much the "quality filter" throws out.
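For reference, the arithmetic behind that comparison, using only the two figures quoted above (the variable names are just labels for this sketch):

```python
massiveweb_tokens = 506e9     # MassiveWeb token count cited above
google_2013_tokens = 1.3e12   # n-gram training set from the 2013 paper

# How much larger the 2013 corpus is than MassiveWeb.
ratio = google_2013_tokens / massiveweb_tokens

# What "an order of magnitude more than MassiveWeb" would mean in tokens.
order_of_magnitude_guess = 10 * massiveweb_tokens

print(f"2013 corpus / MassiveWeb: {ratio:.2f}x")
print(f"10x MassiveWeb: {order_of_magnitude_guess:.2e} tokens")
```

So the 2013 corpus is roughly 2.6x MassiveWeb, and the order-of-magnitude guess corresponds to about 5T tokens.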

And if this does represent a limit of what's on the web, then as other posters have said, email is much larger than the web. That said, I question whether anyone would be reckless enough to train an LLM on everyone's private emails without consent; it seems like a potential privacy disaster.
