[Paper] Self-Transparency Failures in Expert-Persona LLMs
I wrote the paper "Self-Transparency Failures in Expert-Persona LLMs: How Instruction-Following Overrides Disclosure" and am sharing a condensed version of it. Users need models to be transparent about being AI systems so that they can calibrate their expectations and avoid overtrusting the information models provide. We test...
Dec 18, 2025