[ Question ]

How important are model sizes to your timeline predictions?

by SoerenMind, 5th Sep 2019


I'm thinking of writing a paper for GovAI on whether model sizes matter. Any or all of the following would be helpful:

1) If you found out that the largest published neural net model sizes (by # of parameters) will grow by 10000x in the next 6 years, would that make your timelines shorter or longer and by how much?

2) Same question with 100x.

3) Same question, but with 100x less (or 100x more) growth than whatever you expected.

4) Same question, but with 10x less growth than in the past 6 years.

Assume all else is equal where possible, e.g. the same compute, memory, algorithms, and data are available. Assume that any slower growth is due to an inability to grow models, not necessarily a lack of will. Ignore conditional-execution models where only a small part of the model is active at a time.
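For calibration, the total growth multiples in the questions above can be converted into implied annualized growth factors. A quick arithmetic sketch (the numbers here are just the hypotheticals from the questions, not predictions):

```python
def annual_factor(total_growth: float, years: float) -> float:
    """Annualized growth factor implied by a total growth multiple over a period."""
    return total_growth ** (1 / years)

# 10000x over 6 years implies roughly 4.64x per year
print(annual_factor(10_000, 6))
# 100x over 6 years implies roughly 2.15x per year
print(annual_factor(100, 6))
```

So the 10000x scenario amounts to model sizes more than quadrupling every year for six years straight.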

I'm okay with quick, unexplained qualitative answers. Well-reasoned quantitative ones are encouraged of course.


- Some ML projects, like GPT-2 (figure 1), have recorded high (but diminishing) returns to bigger models, typically following power laws.

- Some people (e.g. Geoff Hinton) have informally argued that model size is analogous to synapse count in the brain, which correlates with brain size, which in turn has high returns.
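The power-law relationship mentioned above can be illustrated with a minimal curve fit. The data below is made up purely for illustration (it is not GPT-2's actual numbers); the assumed form is loss ≈ a · N^(−b) for parameter count N:

```python
import numpy as np

# Hypothetical parameter counts and validation losses (illustrative only),
# constructed to roughly follow loss = a * N**(-b).
params = np.array([1e6, 1e7, 1e8, 1e9])
loss = np.array([5.0, 3.5, 2.45, 1.72])

# A power law is a straight line in log-log space:
#   log(loss) = log(a) - b * log(N)
# so we can recover the exponent with a linear fit.
slope, intercept = np.polyfit(np.log(params), np.log(loss), 1)
b = -slope
a = np.exp(intercept)
print(f"exponent b = {b:.3f}")  # → exponent b = 0.155
```

A small exponent like this is what "high but diminishing returns" looks like: each 10x increase in parameters shrinks the loss by a constant factor (here about 0.7x), so absolute gains keep falling even as scaling keeps helping.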

Bonus question: Without looking it up, by what factor do you think the largest published model sizes have increased between 2012 and 2019? By what factor do you guess they'll increase in the next 6 years?
