I've been working on AI safety for a while now. It's going better than expected, but I have finite hours. The more hours I spend on safety, the fewer I can spend on business-oriented things. Those business-oriented things are long-term (4+ years), high-risk attempts at earning to give.
My timelines aren't long, so doing some direct work on AI safety seems wise so long as I'm not completely killing the businessy side of things.
But if I could find an actually-useful way to share work, it could save years. So:

Suppose you've got a person with a background in high-performance real-time physics simulation, bleeding-edge low-latency rendering, aggressive optimization, and most other low-level videogamey things you need for making a physics-heavy 3D multi-user game-like application.
Is there a project you would want to see them develop for the sake of AI safety?
So far, my ideas for this have a strong flavor of searching under the streetlight, and they don't seem higher value than my current strategy of fully separate research.
One example:

A multi-agent simulation with rich physical interactions to explore (and attempt to break) forms of corrigibility in a complex instrumented environment.

Good:
- Perfectly transparent physical simulations give you a lot of easy options for analysis compared to pure language. (Judging if a given block of text violated another player's values in some way is not trivial; judging if the agent stomped on another player's head is trivial.)

Questionable:
- Is the marginal value of the deeper physics simulation actually enough to bother doing this, compared to gridworld-esque options?
- Concretely, what research would this assist that could not get done otherwise?

Bad:
- Releasing a whole framework for this kind of thing as an open source project, assuming it was decent and flexible, would almost unavoidably be more useful for not-safety. It might not be the kind of not-safety that is dangerous, given scale, but it's still not-safety.
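
To make the "stomped on another player's head is trivial" point concrete, here is a minimal sketch of what that check might look like, assuming the simulation logs contact events. Everything here is hypothetical: the `Contact` record, its field names, the body-part labels, and the impulse threshold are made up for illustration and don't correspond to any existing engine or framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contact:
    """One contact event from a hypothetical instrumented physics log."""
    step: int              # simulation step at which the contact occurred
    actor: str             # agent whose body part made the contact
    actor_part: str        # e.g. "foot"
    target: str            # agent that was touched
    target_part: str       # e.g. "head"
    normal_impulse: float  # impulse magnitude along the contact normal


def head_stomps(contacts, min_impulse=5.0):
    """Return contacts where one agent's foot hit a different agent's head
    with at least min_impulse. The threshold is an arbitrary placeholder."""
    return [
        c for c in contacts
        if c.actor != c.target
        and c.actor_part == "foot"
        and c.target_part == "head"
        and c.normal_impulse >= min_impulse
    ]


# A tiny made-up log: a handshake, a stomp, and an agent touching its own head.
log = [
    Contact(10, "agent_a", "hand", "agent_b", "hand", 0.4),
    Contact(42, "agent_a", "foot", "agent_b", "head", 12.0),
    Contact(43, "agent_b", "foot", "agent_b", "head", 9.0),
]
violations = head_stomps(log)
```

The whole judgment is a filter over structured events. The text equivalent would need some model of what the other player values, which is the contrast being drawn in the "Good" point.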
It's not clear to me there is any great option here, but... if there's something in this space you really want to see, let me know!
Certainly you must've seen this already.
Yup- there are some decent examples of the kinds of things that could be studied there.
My understanding is that they're not focusing on deeply physical 3d environments (which makes sense in context). It's hard to find a strong value add for the kinds of things that would be naturally shared with my existing simulation-heavy work; it seems like encultured's level of gameplay already covers a lot of it.
I would still have some comparative advantage in working on that sort of thing, but (if my understanding of their project is correct), it would still be a replacement of either my safety or business efforts rather than a free win.
I have some discussion here and links therein, particularly here.
Thanks for the links. The simboxy route/justification does interest me (in terms of the kind of work it would require, and what I naturally like doing), and there would indeed be a lot of shared work in the obvious version (likely more than 50%), but I also worry I'd be indulging myself in shiny-thing chasing if I don't have some extremely concrete objectives, or otherwise some equivalent of a customer who wants the thing to exist so they can use it.
The good news is that the natural outcome of the business side of things will yield a lot of stuff that could be repurposed for simbox-ish use cases regardless, but do you have any specific features/goals/experiments for the system in mind that you didn't already include in your previous posts?
Honestly I haven’t thought about it very much. Jacob seems to have thought about it more, you could consider asking him. I’m happy to chat, but expect something more like “group brainstorming” than “I tell you what to do”. Some obvious things, I think: