jsteinhardt

Sequences

More Is Different for AI

Comments

Hiring Programmers in Academia

I think this might be an overstatement. It's true that NSF tends not to fund developers, but in ML the NSF is only one of many funders (lots of faculty have grants from industry partnerships, for instance).

Personal forecasting retrospective: 2020-2022

Thanks for writing this!

Regarding how surprise on current forecasts should factor into AI timelines, two takes I have:

 * Given that all the forecasts seem to be wrong in the "things happened faster than we expected" direction, we should probably expect HLAI to happen faster than expected as well.

 * It also seems like we should retreat more to outside views about general rates of technological progress, rather than forming a specific inside view (since the inside view seems to mostly end up being wrong).

I think a pure outside view would give a median of something like 35 years (based on my very sketchy attempt at forming a dataset of when technical grand challenges were solved), and then ML progress seems to be happening quite quickly, so you should probably adjust down from that.

I'm actually pretty interested in how you get to medians of 40 years. That seems longer than I'd predict without looking at any field-specific facts about ML, and the field-specific facts mostly push towards shorter timelines.

How fast can we perform a forward pass?

Thanks! I just read over it and, assuming I understood correctly, this bottleneck primarily happens for "small" operations like layer normalization and softmax, and not for large matrix multiplies. In addition, these small operations are still the minority of runtime (40% in their case). So I think this is still consistent with my analysis, which assumes various things will creep in to keep GPU utilization around 40%, but that they won't ever drive it to (say) 10%. Is this correct, or have I misunderstood the nature of the bottleneck?

Edit: also maybe we're just miscommunicating--I definitely don't think CPU->HBM is a bottleneck; it's instead the time to load from HBM, which sounds the same as what you said. Unless I misread the A100 specs, that comes out to 1.5 TB/s, which is the number I use throughout.
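
To make the distinction concrete, here is a rough sketch of the comparison between time spent loading from HBM and time spent on arithmetic. The 1.5 TB/s bandwidth is the number quoted above; the 312 TFLOP/s peak (A100 dense FP16 tensor-core throughput), the cost formulas, and all function names are illustrative assumptions, not taken from the post or the paper under discussion.

```python
# Rough roofline-style check: is an operation limited by HBM reads or by FLOPs?
HBM_BANDWIDTH_BYTES_PER_S = 1.5e12  # A100 HBM bandwidth, as quoted in the comment
PEAK_FLOPS_PER_S = 312e12           # assumption: A100 dense FP16 tensor-core peak


def memory_time_s(num_bytes):
    """Lower bound on time spent moving data to/from HBM."""
    return num_bytes / HBM_BANDWIDTH_BYTES_PER_S


def compute_time_s(flops):
    """Lower bound on time spent on arithmetic at peak throughput."""
    return flops / PEAK_FLOPS_PER_S


def is_memory_bound(flops, num_bytes):
    """True when data movement, not arithmetic, sets the lower bound."""
    return memory_time_s(num_bytes) > compute_time_s(flops)


def matmul_cost(m, k, n, bytes_per_elem=2):
    """(flops, bytes) for an (m x k) @ (k x n) multiply, each matrix touched once."""
    flops = 2 * m * k * n
    num_bytes = bytes_per_elem * (m * k + k * n + m * n)
    return flops, num_bytes


def elementwise_cost(num_elems, flops_per_elem=5, bytes_per_elem=2):
    """(flops, bytes) for a layernorm/softmax-like op: one read, one write.

    flops_per_elem=5 is a made-up placeholder for "a handful of ops per element".
    """
    flops = flops_per_elem * num_elems
    num_bytes = 2 * bytes_per_elem * num_elems
    return flops, num_bytes


if __name__ == "__main__":
    print("large matmul memory-bound?", is_memory_bound(*matmul_cost(4096, 4096, 4096)))
    print("elementwise op memory-bound?", is_memory_bound(*elementwise_cost(4096 * 4096)))
```

Under these assumptions the large matrix multiply comes out compute-bound and the elementwise operation memory-bound, which matches the picture in which only the "small" operations hit the HBM bottleneck.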

How fast can we perform a forward pass?

Short answer: If future AI systems are doing R&D, it matters how quickly the R&D is happening.

How fast can we perform a forward pass?

Okay, thanks! The posts actually are written in markdown, at least on the backend, in case that helps you.

How fast can we perform a forward pass?

Question for mods (sorry if I asked this before): Is there a way to make the LaTeX render?

In theory MathJax should be enough, e.g. that's all I use for the original post: https://bounded-regret.ghost.io/how-fast-can-we-perform-a-forward-pass/

Why I'm Optimistic About Near-Term AI Risk

I was surprised by this claim. To be concrete, what's your probability of xrisk conditional on 10-year timelines? Mine is something like 25% I think, and higher than my unconditional probability of xrisk.

Early 2022 Paper Round-up

Fortunately (?), I think the jury is still out on whether phase transitions happen in practice for large-scale systems. It could be that once a system is complex and large enough, it's hard for a single factor to dominate and you get smoother changes. But I think it could go either way.

Early 2022 Paper Round-up

Thanks! I pretty much agree with everything you said. This is also largely why I am excited about the work, and I think what you wrote captures it more crisply than I could have.

Buck's Shortform

Yup, I agree with this, and think the argument generalizes to most alignment work (which is why I'm relatively optimistic about our chances compared to some other people, e.g. something like 85% p(success), mostly because most things one can think of doing will probably be done).

It's possibly an argument that work is most valuable in cases of unexpectedly short timelines, although I'm not sure how much weight I actually place on that.
