Bjartur Tómas

Occasionally think about topics discussed here. Will post if I have any thoughts worth sharing. Write CRUD for a living.


Are we in an AI overhang?

Just posting in case you did not get my PM. It has my email in it.

Logan Strohl on exercise norms

This is probably not a meta enough comment, but I have been using kettlebells since the pandemic and I think they are the highest ROI form of exercise I have ever tried. I do 5 minutes of kettlebell swings with a 60 pound bell 3 times a day: before work, on my lunch break, and after work. My strength has significantly increased and it feels like a good cardio workout too.

My big problem with exercise is not the discomfort but the monotony. Swings are much more exhausting than most exercises and are also a hybrid of lifting and cardio, making them very efficient.

Are we in an AI overhang?

Your estimates of hardware advancement seem higher than most people's. I've enjoyed your comments on such things and think there should be a high-level, full length post on them, especially with widely respected posts claiming much longer times until human-level hardware.Would be willing to subsidize such a thing if you are interested. Would pay 500 USD to yourself or a charity of your choice for a post on the potential of ASICS, Moore's law, how quickly we can overcome the memory bandwidth bottlenecks and such things. Would also subsidize a post estimating an answer this question, too:

Are we in an AI overhang?

One thing we have to account for is advances architecture even in a world where Moore's law is dead, to what extent memory bandwidth is a constraint on model size, etc. You could rephrase this as how much of an "architecture overhang" exists. One frame to view this through is in era the of Moore's law we sort of banked a lot of parallel architectural advances as we lacked a good use case for such things. We now have such a use case. So the question is how much performance is sitting in the bank, waiting to be pulled out in the next 5 years.

I don't know how seriously to take the AI ASIC people, but they are claiming very large increases in capability, on the order of 100-1000x in the next 10 years, if this is a true this is a multiplier on top of increased investment. See this response from a panel including big-wigs at NVIDIA, Google, and Cerebras about projected capabilities: On top of this, one has to account, too, for algorithmic advancement:

Another thing to note is though by parameter count, the largest modern models are 10000x smaller than the human brain, if one buys that parameter >= synapse idea (which most don't but is not entirely off the table), the temporal resolution is far higher. So once we get human-sized models, they may be trained almost comically faster than human minds are. So on top an architecture overhang we may have this "temporal resolution overhang", too, where once models are as powerful as the human brain they will almost certainly be trained much faster. And on top of this there is an "inference overhang" where because inference is much, much cheaper than training, once you are done training an economically useful model, you will almost tautologically have a lot of compute to exploit it with.

Hopefully I am just being paranoid (I am definitely more of a squib than a wizard in these domains), but I am seeing overhangs everywhere!

Open & Welcome Thread - June 2020
What would be a good exit plan? If you've thought about this, can you share your plan and/or discuss (privately) my specific situation?'

+1 for this. Would love to talk to other people seriously considering exit. Maybe we could start a Telegram or something.

[This comment is no longer endorsed by its author]Reply
human psycholinguists: a critical appraisal

They already assigned >90% probability that GPT-2 models something like how speech production works.

Is that truly the case? I recall reading Corey Washington a former linguist (who left the field for neuroscience in frustration with its culture and methods) claim that when he was a linguist the general attitude was there was no way in hell something like GPT-2 would ever work even close to the degree that it does.

Found it:

Steve: Corey’s background is in philosophy of language and linguistics, and also neuroscience, and I have always felt that he’s a little bit more pessimistic than I am about AGI. So I’m curious — and answer honestly, Corey, no revisionist thinking — before the results of this GPT-2 paper were available to you, would you not have bet very strongly against the procedure that they went through working?

Corey: Yes, I would’ve said no way in hell actually, to be honest with you.

Steve: Yes. So it’s an event that caused you to update your priors.

Corey: Absolutely. Just to be honest, when I was coming up, I was at MIT in the mid ’80s in linguistics, and there was this general talk about how machine translation just would never happen and how it was just lunacy, and maybe if they listened to us at MIT and took a little linguistics class they might actually figure out how to get this thing to work, but as it is they’re going off and doing this stuff which is just destined to fail. It’s a complete falsification of that basic outlook, which I think, — looking back, of course — had very little evidence — it had a lot of hubris behind it, but very little evidence behind it.

I was just recently reading a paper in Dutch, and I just simply… First of all, the OCR recognized the Dutch language and it gave me a little text version of the page. I simply copied the page, pasted it into Google Translate, and got a translation that allowed me to basically read this article without much difficulty. That would’ve been thought to be impossible 20, 30 years ago — and it’s not even close to predicting the next word, or writing in the style that is typical of the corpus.