Exploring safe exploration

(disclaimer: I worked on Safety Gym)

I think I largely agree with the comments here, and I don't really have any attachment to specific semantics around what exactly these terms mean. Here I'll try to use my understanding of evhub's meanings:

First: a disagreement on the separation.

A particular prediction I have now, though it is weakly held, is that episode boundaries are weak and permeable, and will probably be obsolete at some point. There are a bunch of reasons I think this, but maybe the easiest to explain is that humans learn and are generall...

rohinmshah (1y): My main reason for making the separation is that in every deep RL algorithm I know of there is exploration-that-is-incentivized-by-gradient-descent and exploration-that-is-not-incentivized-by-gradient-descent, and it seems like these should be distinguished. Currently, due to episode boundaries, these cleanly correspond to within-episode and across-episode exploration respectively, but even if episode boundaries become obsolete I expect the question of "is this exploration incentivized by the (outer) optimizer" to remain relevant. (Perhaps we could call this outer and inner exploration, where outer exploration is the exploration that is not incentivized by the outer optimizer.) I don't have a strong opinion on whether "safe exploration" should refer to just outer exploration or both outer and inner exploration, since both options seem compatible with the existing ML definition.

Hey Aray!

Given this, I think the "within-episode exploration" and "across-episode exploration" relax into each other, and (as the distinction of episode boundaries fades) turn into the same thing, which I think is fine to call "safe exploration".

I agree with this. I jumped the gun a bit by not making the distinction clear in my earlier post “Safe exploration and corrigibility,” which I think made it a bit confusing, so I went heavy on the distinction here, perhaps heavier than I actually think is warranted.

The problem I have with relaxi

