Perhaps people here will be amused by, or even mildly interested in, a post on my blog discussing two preprints on the doubtful nature of LLMs:
S Schröder, et al., "Large Language Models Do Not Simulate Human Psychology", arχiv, submitted 2025-Aug-09. DOI: 10.48550/arXiv.2508.06950.
Discusses the possibility of using LLMs as subjects in psych experiments. Quantitatively analyzes the difference in moral reasoning between LLMs and human subjects.
K Liang, et al., "Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models", arχiv, submitted 2025-Jul-10. DOI: 10.48550/arXiv.2507.07484.
Statistically models the tendency of LLMs to make "statements designed to persuade, but without regard for truth", after the "bullsh*t" model of philosopher Harry Frankfurt.
Summary: The first one's a terrible idea, and the second one surprisingly (to me, at least) indicts RLHF.
[Since it's my blog, there may be occasional political opinions, which you will no doubt wish to skip. Relatively minimal in this post, though. Or at least I think so.]