Formalizing the Informal
One way to view MIRI's Agent Foundations research is that it saw the biggest problem in AI safety as "human preferences are informal, but we need to somehow get formal guarantees about them" -- and so, in response, it set out to make a formal-informal bridge.
Recently, I’ve been thinking about how we might formally represent the difference between formal and informal. My prompt is something like: if we assume that classical probability theory applies to “fully formal” propositions, how can we generalize it to handle “informal” stuff?
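For reference, here is the standard textbook setup I have in mind as the "fully formal" baseline (this is just classical probability over a Boolean algebra of propositions, not anything from the discussion itself):

\[
P : \mathcal{F} \to [0,1], \qquad P(\top) = 1, \qquad A \wedge B = \bot \;\Rightarrow\; P(A \vee B) = P(A) + P(B),
\]

where \(\mathcal{F}\) is a Boolean algebra of sharply defined propositions. One way to read the prompt, then, is as asking which of these ingredients (a fixed algebra of crisp propositions, a single real-valued \(P\), additivity) has to be relaxed once the propositions themselves are informal.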
I’m going to lead a discussion on this tomorrow, Wednesday Sept. 11, at 11am EDT (8am Pacific, 4pm UK).
Discord Event link (might not work for most people):
https://discord.com/events/1237103274591649933/1282859362125352960
Zoom link (should work for everyone):
https://us06web.zoom.us/j/6274543940?pwd=TGZpY3NSTUVYNHZySUdCQUQ5ZmxQQT09
You can support my work on Patreon.