OpenAI now has an RL API which is broadly accessible — LessWrong