What counts as illegible reasoning?
Summary

Illegible reasoning in LLMs has been observed in OpenAI models, and understanding this behavior would be beneficial for AI safety research. This post describes challenges with reproducing this behavior in open models and limitations of LLM-as-judge strategies for detecting illegible reasoning.

Illegible reasoning is relevant for AI safety

Both...
Apr 2320