Comments

Thanks for posting this. I am a bit surprised that the forecasts for hardware-related restrictions are so low. Are there any notes or details available on what led the group to those numbers?

In particular, the spread between firmware-based monitoring (7%) and compute capacity restrictions (15%) seems too small to me. I would have expected either a higher chance of restrictions or a lower chance of on-chip monitoring: both are predicated on similar decision-making steps, but implementing and operating an end-to-end firmware monitoring system has many technical hurdles.

Yeah, pretty much. If you think about mapping something like matrix-multiply to a specific hardware device, details like how the data is laid out in memory, how effectively the cache hierarchy is used, and how efficiently data moves around the system are all important for performance.
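To make that a bit more concrete, here is a minimal sketch of loop tiling (blocking), one common way of restructuring a matrix multiply so that small sub-blocks of the operands get reused while they are still in cache. The function names, the flat row-major layout, and the tile size are all assumptions chosen for illustration, not anything from a particular library. In pure Python the interpreter overhead swamps any cache effect, so this only shows the loop structure, not a real speedup.

```python
def matmul_naive(a, b, n, k, m):
    """Reference multiply: a is n x k, b is k x m, both flat row-major lists."""
    c = [0] * (n * m)
    for i in range(n):
        for j in range(m):
            for p in range(k):
                c[i * m + j] += a[i * k + p] * b[p * m + j]
    return c


def matmul_tiled(a, b, n, k, m, tile=4):
    """Same result as matmul_naive, but iterating over tile x tile blocks.

    The three outer loops pick a block of each matrix; the three inner
    loops do the multiply within that block. The tile size (hypothetical
    default of 4) would normally be tuned to the cache sizes of the device.
    """
    c = [0] * (n * m)
    for i0 in range(0, n, tile):
        for p0 in range(0, k, tile):
            for j0 in range(0, m, tile):
                for i in range(i0, min(i0 + tile, n)):
                    for p in range(p0, min(p0 + tile, k)):
                        a_ip = a[i * k + p]  # reused across the inner j loop
                        # Innermost loop walks b and c along a row, i.e.
                        # contiguously in a row-major layout.
                        for j in range(j0, min(j0 + tile, m)):
                            c[i * m + j] += a_ip * b[p * m + j]
    return c


# Small example: a 5 x 7 matrix times a 7 x 3 matrix, with integer entries.
n, k, m = 5, 7, 3
a = list(range(n * k))
b = list(range(k * m))
tiled = matmul_tiled(a, b, n, k, m, tile=2)
naive = matmul_naive(a, b, n, k, m)
```

Both functions do exactly the same arithmetic; only the order in which memory is touched differs, which is the kind of detail that dominates performance on real hardware.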

This is a really nice analysis, thank you for posting it! The part I wonder about is what kind of "tricks" may become economically feasible for commercialization once shrinking the transistors hits physical limits. While that kind of physical design research isn't my area, I've been led to believe there are some interesting possibilities that just haven't been explored much because they cost a lot and "let's just make it smaller next year" has traditionally been an easier R&D task.

Yep, I think you're right that both views are compatible. In terms of performance comparison, the architectures are quite different, so while raw floating-point performance gives you a rough idea of a device's capabilities, performance on specific benchmarks can be quite different. Optimization adds another dimension entirely. For example, NVIDIA has highly optimized DNN libraries that achieve very impressive performance (as a fraction of raw floating-point performance) on their GPU hardware. AFAIK nobody is spending that much effort (e.g. teams of engineers x several months) to optimize deep learning models on CPU these days, because it isn't worth the return on investment.

I’m finishing my PhD in hardware/ML and I’ve been thinking vaguely about hardware approaches for AI safety recently, so it’s great to see other people are thinking about this too! I hope to have more free time once I finish my thesis in a few weeks, and I’d love to talk more to anyone else who is interested in this approach and perhaps help out if I can.

I think this is a really nice write-up! As someone relatively new to the idea of AI Safety, having a summary of all the approaches people are working on is really helpful as it would have taken me weeks to put this together on my own.

Obviously this would be a lot of work, but I think it would be really great to post this as a living document on GitHub where you can update and (potentially) expand it over time, perhaps by curating contributions from folks. In particular it would be interesting to see three arguments for each approach: a “best argument for”, “best argument against” and “what I think is the most realistic outcome”, along with uncertainties for each.