More precise regret bound for DRL — LessWrong