Personal evaluation of LLMs, through chess — LessWrong