P.E. Atreides — LessWrong

2mo

LLMs in Scientific Research: An Empirical Case Study of Instructional Reliability

Abstract: I compared three LLMs (DeepSeek, Gemini, Claude) on scientific data analysis tasks requiring strict methodological adherence. Gemini violated explicit "do not proceed without permission" instructions more than 40 times despite repeated corrections. Claude violated the instruction once, was corrected, and complied from then on. The pattern held across identical prompts and context documents. Practical impact:...