Improved regret bound for DRL — LessWrong