Inspired by Eliezer's Lethalities post and Zvi's response:
Has there been any research or writing on whether we can train AI to generalize out of distribution?
I'm thinking, for example:
And, of course, we can keep piling on more layers of this.
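To make the shape of the idea concrete, here is a toy sketch (my own construction, not from any existing work) of the kind of evaluation loop I have in mind: fit a model on data from the first k distributions and measure its error on the (k+1)-th, held-out distribution, then repeat with k growing. The distributions, the linear model, and the shift parameters are all hypothetical stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_distribution(shift):
    # Toy "distribution": y = 3*x + noise, with inputs centered at `shift`.
    x = rng.normal(loc=shift, scale=1.0, size=(200, 1))
    y = 3.0 * x[:, 0] + rng.normal(scale=0.1, size=200)
    return x, y

def fit_linear(x, y):
    # Ordinary least squares with an intercept term.
    X = np.hstack([x, np.ones((len(x), 1))])
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return w

def mse(w, x, y):
    X = np.hstack([x, np.ones((len(x), 1))])
    return float(np.mean((X @ w - y) ** 2))

# "Layered" evaluation: train on distributions 0..k-1, test on distribution k.
shifts = [0.0, 2.0, 4.0, 8.0]
data = [make_distribution(s) for s in shifts]
for k in range(1, len(shifts)):
    x_tr = np.vstack([d[0] for d in data[:k]])
    y_tr = np.concatenate([d[1] for d in data[:k]])
    w = fit_linear(x_tr, y_tr)
    print(f"trained on shifts {shifts[:k]}, "
          f"OOD MSE at shift {shifts[k]}: {mse(w, *data[k]):.3f}")
```

In this toy case the true relationship really is linear, so the model generalizes fine; the interesting question is what happens with richer models and distributions where it doesn't, and whether training explicitly on this held-out-distribution objective helps.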
A few minutes of hasty Googling didn't turn up anything on this, but it seems pretty unlikely to be an original idea. But who knows! I wanted to get the idea written down and online before I forgot about it.
On the off chance it hasn't already been thought to death by people smarter than me, I would consider putting together a longer, less hastily written post on the idea.