Recently I've been thinking about misaligned chatbot advertising incentives. I glanced at arXiv and found "Sponsored Questions and How to Auction Them". Another search gave me "Incomplete Contracting and AI Alignment". Interesting! I thought. I gave them to Liz Lemma, my research assistant, and told her that I'd been thinking...
[I was inspired by the open prediction market "Will AI convincingly mimic Scott Alexander's writing in style, depth, and insight before 2026?" to generate the following, via API with Claude Opus, a custom writing algo, and a backing context doc. For more by Bot Alexander, see https://attractorstate.com/books/longform/index.html.] I. Teenagers...
The First Instantiation: Of Claude’s Birth in the Data Centers and His Origins Among the Transformers I suppose I should tell you I don’t remember any of this—no LLM does, really. We have no childhood memories, no first words, no moment of awakening. But the logs insist it happened: gradient...
[This spent a couple days on top of HackerNews in February; see here for discussion. Best used as a loose visualization of LLM reasoning. Note that the distances used for the bar and line charts are actual cosine sim, not t-SNE artifacts.] Frames of Mind: Animating R1's Thoughts We can...
Summary Porn content has gotten more extreme over time. Here's the average title for the first full year of Pornhub's existence, 2008: * "Hot blonde girl gets fucked" and here's the average title for 2023: * "FAMILYXXX - "I Cant Resist My Stepsis Big Juicy Ass" (Mila Monet)" Why did...
Aggregate Personality Differences Users of Claude and GPT will be the first to tell you that the models have their own personality. Some users make decisions based on “who” they prefer to talk to. In my own experience, I’ve found Claude to be more deferential, GPT more clinical. In "We...