LESSWRONG

Fun Theory is the study of questions such as "How much fun is there in the universe?",
"Will we ever run out of fun?", "Are we having fun yet?" and "Could we be having
more fun?". It's relevant to designing utopias and AIs, among other things.


Quick Takes

eggsyntax16h8813

Thomas Kwa, Karl Krueger

A small but noteworthy initial step toward soft nationalization of AI companies

Axios[1]:

Some points of note[2]:

1. Unless I'm badly mistaken, these are limitations from Anthropic's standard terms of service, not something recently introduced. So 'pay a price for forcing our hand like this' seems misleading at best -- presumably the Pentagon read the terms before choosing to sign the contract and incorporating Claude into its processes[3].
2. This conflict was ostensibly triggered by Anthropic asking about the use of Claude in the Maduro raid[4]. Senior administration official: "Anthropic asked whether their software was used for the raid to capture Maduro, which caused real concerns across the Department of War indicating that they might not approve if it was...Any company that would jeopardize the operational success of our warfighters in the field is one we need to reevaluate our partnership with going forward."
3. Axios is correct that designating Anthropic a supply chain risk would be an extremely unusual step; all previous uses of that designation have been for foreign companies with suspected ties to foreign intelligence services (eg Huawei).
4. Estimates from Opus-4.6 and ChatGPT-5.2 suggest that designating Anthropic a supply chain risk would cost them something like 5% of revenue. This does not include a couple of potential major additional impacts: a) spread from DoD contractors to all federal contractors (I haven't tried to estimate what this would cost them) and b) impact on Anthropic's valuation for their intended IPO later this year, which would be ~30x whatever the direct revenue impact turns out to be.

So this looks like a small but real step in the direction of soft nationalization as described by Cheng & Katzke, where the government uses a range of levers to exercise control over the choices of private companies. To be clear: the military choosing not to use Anthropic would not qualify in my view, but taking the highly unusual step of

Tom Smith10h374

Nisan

The government implied that OpenAI, GDM, and xAI will allow their models to be used for mass surveillance of Americans. Are they right?

The Department of War is trying to pressure Anthropic to allow their models to be used "to spy on Americans en masse, or to develop weapons that fire with no human involvement". Secretary of War Pete Hegseth is reportedly "close" to having the military refuse to do business with any company that doesn't cut ties with Anthropic. A senior Pentagon official says he wants to "make sure they pay a price for forcing our hand like this." (Source: Axios)

Right now Claude is the only model that the military trusts for use in classified systems, but soon they'll presumably switch to another company if Anthropic doesn't back down. The article states So it sounds like the government is, as a pressure tactic, implying OpenAI, Google, and xAI will roll over and let their models be used to surveil Americans and autonomously kill people. Is this true? I assume OpenAI, Google, and xAI employees wouldn't stand for this.

Can OpenAI, Google, and xAI comment on whether they will allow their models to be used to surveil Americans en masse or to autonomously kill people without safeguards (esp. measures to ensure they're not used against Americans)?

leogao1d4718

Steveot, Shankar Sivarajan, and 4 more

people generally talk about food preservatives in a negative way. certainly, some of them are not great for you. but I want to take a moment to appreciate how wonderful food preservatives (and refrigeration and pasteurization and canning) are as well. it's crazy how fast most normal food goes bad. like a loaf of real old fashioned bread will go stale after a day and then become moldy after a few more days. for almost all of human history, people just sort of lived with this, and if they wanted to make foods last they had to dry them out and/or drown them in salt or vinegar or alcohol. pickles and beef jerky are great, but it would suck if you had to eat them all the time.

leogao4h80

Kongo Landwalker

my life would often be better if I exercised more agency. why don't I do so more often? here is a taxonomy of reasons I've noticed:

* energy: I'm often very fatigued, which makes it much harder for me to do anything, which includes anything new.
* decision fatigue: a related thing is even for a given amount of energy, I have a limited number of decisions I can make, and a limited number of things I can focus on and think carefully about.
* emotional avoidance: sometimes, exercising agency requires admitting that I've been doing things wrong all this time, or that part of my identity is not what I want it to be, or confronting some past trauma. sometimes I identify as being bad at X, which makes it hard to improve at X.
* conformism: there's a critic inside my head that yells at me when I do or even consider things that could be considered "crazy" by others. I ignore it more than most people, but it still has nonzero say.
* uncreativity: in certain domains I've spent so much time thinking about a specific kind of idea that it becomes genuinely hard for me to even imagine other ideas.
* cowardice: sometimes the ideas are obvious but they require large irreversible actions, and/or are likely to have unpredictable consequences.

faul_sname11h120

Thomas Kwa, AlphaAndOmega, and 1 more

Preregistration: I am seeing if Claude can oneshot a slightly novel interpretability technique to decompose a soft prompt into a weighted average over normal prompts.

Background: A "soft prompt" is a sequence of continuous vectors prepended to a model's input at the embedding layer. Unlike a normal prompt, the vectors don't correspond to any token in the vocabulary. Soft prompts are interesting because you can do gradient descent over them - that is, you can take a model and a bunch of outputs, and then gradient descend over a fixed-length prompt to find prefix embeddings on the base model which produce similar outputs to the fine-tuned or prompt-tuned model[0][3]. If you can do this with low KL, that means you can convert from weight changes to prefix changes (and you already could convert the other direction via context distillation).

The downside, of course, is that soft prompts aren't interpretable (or at least haven't been interpreted, as far as I'm aware). You can even construct a soft prompt for any behavior which projects to any arbitrary tokens you choose[^1]. And running the projected hard prompt will not produce similar outputs to the soft prompt (obviously, as the projected hard prompt can be anything).

Maybe it's possible to decompose a soft prompt into a weighted average over several hard prompts though. The idea is that if you have k normal prompts, you can combine their next-token distributions as a weighted geometric mixture:

log p_combined(t) = sum_i w_i * log p(t | prompt_i)

This appears to already exist and be a well-known technique called "product-of-experts" -- same mechanism as classifier-free guidance and contrastive decoding [^2]. Weights can be negative ("avoid what this prompt would predict"). With prompts fixed, solving for weights is convex optimization over the logits -- not quite standard least squares because of the normalization, but I think standard enough that off-the-shelf solvers should Just Work™.

So: take a soft prom
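To make the weighted geometric mixture above concrete, here is a minimal sketch on synthetic data. It is not the experiment from the post. The function names (`combine`, `fit_weights`, `kl`), the choice of KL(target || combined) as the objective, and plain gradient descent as the solver are all assumptions made for illustration, and random distributions stand in for real model outputs. The combined logits are renormalized with a log-softmax, which is the normalization step that makes this differ from ordinary least squares.

```python
import numpy as np

def log_softmax(z):
    """Normalize a vector of logits into log-probabilities."""
    z = z - z.max()  # subtract the max for numerical stability
    return z - np.log(np.exp(z).sum())

def combine(log_probs, weights):
    """Weighted geometric mixture of k next-token distributions.

    log_probs: (k, V) array, row i holds log p(t | prompt_i).
    weights:   (k,) array; entries may be negative.
    Returns the renormalized log p_combined(t) as a (V,) array.
    """
    return log_softmax(weights @ log_probs)

def kl(target_probs, log_q):
    """KL(target || q), with the target as probabilities and q as log-probs."""
    mask = target_probs > 0
    return float(np.sum(target_probs[mask] * (np.log(target_probs[mask]) - log_q[mask])))

def fit_weights(log_probs, target_probs, steps=20000, lr=0.05):
    """Gradient descent on KL(target || combined), which is convex in the weights."""
    w = np.zeros(log_probs.shape[0])
    for _ in range(steps):
        p = np.exp(combine(log_probs, w))
        # Gradient w.r.t. w_i is E_p[log p_i] - E_target[log p_i].
        w -= lr * (log_probs @ (p - target_probs))
    return w

# Toy demo: synthetic distributions stand in for real model outputs (assumption).
rng = np.random.default_rng(0)
k, vocab = 3, 8
log_probs = np.stack([log_softmax(rng.normal(size=vocab)) for _ in range(k)])
true_w = np.array([1.0, -0.5, 0.5])  # hypothetical ground-truth weights
target = np.exp(combine(log_probs, true_w))

w_fit = fit_weights(log_probs, target)
kl_before = kl(target, combine(log_probs, np.zeros(k)))
kl_after = kl(target, combine(log_probs, w_fit))
print("fitted weights:", np.round(w_fit, 3))
print("KL before / after:", kl_before, kl_after)
```

In a real run, each row of `log_probs` would presumably come from the base model conditioned on one hard prompt, `target` from the same model with the soft prompt, and the loss would be summed over many contexts rather than a single next-token distribution.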

Erich_Grunewald16h160

I wrote a post forecasting Chinese compute acquisition in 2026. The very short summary is that I expect about 60% to be legally imported NVIDIA H200s, with domestically produced Huawei Ascends accounting for about 25%, and the remainder being smuggled AI chips and Ascends illegally fabricated outside China via proxies. While China likely produces GPU dies in quite large quantities, it is likely bottlenecked by an HBM shortage, which limits the total number of Ascend 910Cs and other AI chips that can actually be assembled. I do expect domestic production to grow substantially in 2027 and 2028, as CXMT ramps up HBM production. In total, I expect China to acquire about 320,000 B300-equivalents (90% CI: 150,000 to 600,000) in 2026, enough to train about six Grok-4-scale models simultaneously. By comparison, the Stargate campus that Oracle has been building for OpenAI in Abilene, Texas will alone house over 300,000 B300-equivalents. Some Chinese companies are also renting AI chips from non-Chinese cloud providers. For example, according to SemiAnalysis, ByteDance is Oracle’s largest customer; their largest joint cluster, located in Southeast Asia, will perhaps reach about 250,000 B300-equivalents this summer. (I don’t count remote access as “acquisition” since there is no ownership.) NB. These estimates are quite rough, so take them with a grain of salt. But I think they give a good sense of the general size of these different pathways.
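As a rough illustration of what those shares imply in absolute terms, the sketch below applies the quoted percentages to the 320,000 point estimate. Two things here are assumptions rather than claims from the post: that the percentages are shares of the B300-equivalent total, and that the unstated remainder is 15% (100 minus 60 minus 25). The results are as rough as the inputs.

```python
# Point estimate quoted in the post above.
TOTAL_B300_EQUIVALENTS = 320_000

# Percentage shares by pathway. The 15% remainder is inferred as 100 - 60 - 25;
# applying the shares to the B300-equivalent total is an assumption.
SHARES_PCT = {
    "legally imported NVIDIA H200s": 60,
    "domestically produced Huawei Ascends": 25,
    "smuggled chips and proxy-fabricated Ascends": 15,
}

def breakdown(total, shares_pct):
    """Split a total into per-pathway amounts using integer percentages."""
    return {name: total * pct // 100 for name, pct in shares_pct.items()}

by_pathway = breakdown(TOTAL_B300_EQUIVALENTS, SHARES_PCT)
for name, amount in by_pathway.items():
    print(f"{name}: ~{amount:,} B300-equivalents")
```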

leogao6h60

i'm thinking of starting a new blog. it would be about some amount of AI/alignment stuff of course, but also about lots of random other things. for instance, some blog post ideas:

* hiring pollsters to run deranged survey questions about transhumanism on the average american
* rediscovering all of physics since 300 BC by submitting experiments to a grad student who simulates how the experiment would have gone
* book review of the lyndon b johnson biography
* miscellaneous short fiction about AI but also other things
* hosting and then doing postmortems of weird experimental house party ideas (example idea)

thing i need your help with:

* i'm looking for ideas of what i could name my blog. by default i'm just going to go with something lame like "lg blog" but I feel like I could do a lot better.
* i'm curious to hear which of the ideas you would be most excited to read, so i can prioritize them

Your Feed