Comments

moss · 6y · 10

The OpenAI ES algorithm isn't very plausible (for exactly the reasons you said), but the general idea of "existing parameters + random noise -> revert if performance got worse, repeat" does seem like a reasonable way to end up with an approximation of the gradient. I had in mind something more like Uber AI's Neuroevolution, which wouldn't necessarily require parallelization or storage if the brain did some sort of fast local updating, parameter-wise.
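
A minimal sketch of that "perturb and revert" loop, just to make the idea concrete (the loss function, noise scale, and step count here are arbitrary stand-ins, not anything from the papers discussed):

```python
import numpy as np

def perturb_and_revert(loss_fn, theta, sigma=0.1, steps=1000, rng=None):
    """Greedy local search: add Gaussian noise to the parameters,
    keep the perturbation only if the loss improved, otherwise revert."""
    rng = np.random.default_rng() if rng is None else rng
    best_loss = loss_fn(theta)
    for _ in range(steps):
        candidate = theta + sigma * rng.standard_normal(theta.shape)
        candidate_loss = loss_fn(candidate)
        if candidate_loss < best_loss:
            # keep the perturbation
            theta, best_loss = candidate, candidate_loss
        # otherwise "revert" by simply discarding the candidate
    return theta, best_loss

# Toy usage: minimize a quadratic
theta, loss = perturb_and_revert(lambda w: float(np.sum(w ** 2)), np.ones(10))
print(loss)
```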

moss · 6y · 10

There has been some work lately on derivative-free optimization of ANNs (ES mostly, but I've seen some other genetic-flavored work as well). They tend to be off-policy, and I'm not sure how biologically plausible that is, but it's something to think about w/r/t whether current DL progress is taking the same route as biological intelligence (-> getting us closer to [super]intelligence).
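
For reference, a bare-bones sketch of the kind of ES estimator this is gesturing at: score a population of noise-perturbed parameter vectors and move along the noise directions weighted by fitness. The population size, noise scale, learning rate, and toy loss are placeholders, not the settings from any particular paper:

```python
import numpy as np

def es_step(loss_fn, theta, pop_size=50, sigma=0.1, lr=0.05, rng=None):
    """One update of a simple evolution-strategies estimator of the gradient
    of the expected loss under Gaussian parameter perturbations."""
    rng = np.random.default_rng() if rng is None else rng
    eps = rng.standard_normal((pop_size, theta.size))           # noise directions
    losses = np.array([loss_fn(theta + sigma * e) for e in eps])
    scores = (losses - losses.mean()) / (losses.std() + 1e-8)   # fitness shaping
    grad_est = (eps.T @ scores) / (pop_size * sigma)            # estimated gradient
    return theta - lr * grad_est                                 # descend the estimate

# Toy usage: minimize a quadratic
theta = np.ones(10)
for _ in range(200):
    theta = es_step(lambda w: float(np.sum(w ** 2)), theta)
print(np.sum(theta ** 2))
```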