LESSWRONG
Quick Takes

AnnaSalamon's Shortform
AnnaSalamon2h370

A WSJ article from today presents evidence that toxic fumes in airplane air are surprisingly common, are bad for health, have gotten much worse recently, and are still being deliberately covered up. Is anyone up for wading in for a couple hours and giving us an estimated number of micromorts / brain damage / [something]?

I fly frequently and am wondering whether to fly less because of this (probably not, but worth a Fermi?); I imagine others might want to know too. (Also curious if some other demographics should be more concerned than I should be, eg people... (read more)
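
(For anyone attempting this: below is a bare-bones Fermi scaffold in Python. Every number in it is an invented placeholder, not an estimate from the WSJ article; the point is just the structure of the calculation.)

```python
# Bare-bones Fermi scaffold for cabin-fume risk. All values are placeholders to be
# replaced with figures sourced from the article or other data; none are real estimates.
p_fume_event_per_flight = 1e-4        # placeholder: chance a given flight has a fume event
p_meaningful_exposure = 0.3           # placeholder: chance an event meaningfully exposes passengers
harm_per_exposure_micromorts = 1.0    # placeholder: micromort-equivalent harm per meaningful exposure
flights_per_year = 30                 # placeholder: a frequent flyer

per_flight = p_fume_event_per_flight * p_meaningful_exposure * harm_per_exposure_micromorts
per_year = per_flight * flights_per_year
print(f"{per_flight:.2e} micromorts per flight, {per_year:.2e} micromorts per year")
```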

3Garrett Baker32m
Based on the article, these events seem most common on Airbus A320 aircraft, and those are the aircraft for which these events have been getting more common. Boeing 737s remain under the FAA's industry-wide estimate (the article claims Airbus far exceeds that estimate), and incidence for them has been basically constant since 2015, so if you want to dodge the whole question, I'd just make sure you use 737 flights. Edit: Reading more, it sounds like the Boeing 787 completely fixes the relevant design issue (running cabin air through the engine compartment).
dr_s6m20

these events seem most common on Airbus A320 aircraft

That's an extremely common plane for travel within Europe, in my experience, so this is probably very relevant to a lot of people. I can count on my fingers the times I've taken a plane that was not one of those, and I travel multiple times a year.

johnswentworth's Shortform
johnswentworth2d1160

About a month ago, after some back-and-forth with several people about their experiences (including on lesswrong), I hypothesized that I don't feel the emotions signalled by oxytocin, and never have. (I do feel some adjacent things, like empathy and a sense of responsibility for others, but I don't get the feeling of loving connection which usually comes alongside those.)

Naturally I set out to test that hypothesis. This note is an in-progress overview of what I've found so far and how I'm thinking about it, written largely to collect my thoughts and to see... (read more)

Showing 3 of 19 replies
4Elizabeth3h
If the problem is with the receptor, taking more won't make a difference.
localdeity1h40

Seems like that depends on details of the problem. If the receptor has zero function, then yes. If functionality is significantly reduced but nonzero… maybe.
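
(One way to see this is a toy receptor-occupancy model; the numbers below are invented, purely to illustrate why a higher dose can compensate for reduced affinity but not for zero efficacy.)

```python
# Toy occupancy model: response = efficacy * emax * [ligand] / ([ligand] + Kd).
# All parameter values are invented for illustration.
def response(conc, kd, efficacy, emax=1.0):
    occupancy = conc / (conc + kd)          # fraction of receptors bound
    return emax * efficacy * occupancy

print(response(conc=10, kd=1, efficacy=1.0))    # normal receptor: ~0.91
print(response(conc=10, kd=20, efficacy=1.0))   # reduced affinity, same dose: ~0.33
print(response(conc=200, kd=20, efficacy=1.0))  # reduced affinity, higher dose: ~0.91
print(response(conc=1000, kd=1, efficacy=0.0))  # zero efficacy: 0.0, no dose helps
```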

4Ben Pace2h
Sounds like a great empirical test!
ryan_greenblatt's Shortform
ryan_greenblatt2mo*Ω5513029

Should we update against seeing relatively fast AI progress in 2025 and 2026? (Maybe (re)assess this after the GPT-5 release.)

Around the early o3 announcement (and maybe somewhat before that?), I felt like there were some reasonably compelling arguments for putting a decent amount of weight on relatively fast AI progress in 2025 (and maybe in 2026):

  • Maybe AI companies will be able to rapidly scale up RL further because RL compute is still pretty low (so there is a bunch of overhang here); this could cause fast progress if the companies can effectively di
... (read more)
Showing 3 of 21 replies
12Eli Tyre2d
Is there a standard citation for this?  How do you come by this fact? 
the gears to ascension6h20

I had a notification ping in my brain just now while using Claude Code and realizing I'd just told it to think for a long time: I don't think the claim is true, because it doesn't match my experience.

2ryan_greenblatt2d
* Anthropic reports SWE-bench scores without reasoning, which is some evidence it doesn't help (much) on this sort of task. (See e.g. the release blog post for Opus 4.)
* Anecdotal evidence.

Probably it would be more accurate to say "doesn't seem to help much while it helps a lot for OpenAI models".
Roman Malov's Shortform
Roman Malov1d713

Just as you can unjustly privilege a low-likelihood hypothesis just by thinking about it, you can in the exact same way unjustly unprivilege a high-likelihood hypothesis just by thinking about it. Example: I believe that when I press a key on a keyboard, the letter on the key is going to appear on the screen. But I do not consciously believe that; most of the time I don't even think about it. And so, just by thinking about it, I am questioning it, separating it from all hypotheses which I believe and do not question.

Some breakthroughs were in the form of "... (read more)

1Karl Krueger16h
I'm curious about the distinction you're making between "believe" and "consciously believe". Do you agree with the way I'm using these terms below?

I can only be conscious of a small finite number of things at once (maybe only one, depending on how tight a loop we mean by "consciousness"). The set of things that I would say I believe, if asked about them, is rather larger than the number of things I can be conscious of at once. Therefore, at any moment, almost none of my beliefs are conscious beliefs.

For instance, an hour ago, "the moon typically appears blue-white in the daytime sky" was an unconscious belief of mine, but right now it is a conscious belief because I'm thinking about it. It will soon become an unconscious belief again.
Roman Malov6h10

Your definition seems sensible to me. Humans are not Bayesians; they are not built as probabilistic machines with all of their probabilities stored explicitly in memory. So I usually think of a Bayesian approximation, which is basically what you've said: a belief is unconscious when you don't try to model it as Bayesian, and conscious otherwise.

MakoYass's Shortform
mako yass9h40

It's extremely common for US politicians to trade on legislative decisions, and I feel like this is a better explanation for corruption than political donations are. Which is important, because it's a stupid, and so maybe fragile, reason for corruption. The natural tendency of market manipulation is in a sense not to protect incumbents but to threaten them, because you can make way, way more money off of volatility than you can on stasis.

So in theory, there should exist some moderate and agreeable policy intervention that could flip the equilibrium.

Shortform
lc2d*9161

I don't think it's possible for mere mortals to use Twitter for news about politics or current events and not go a little crazy. At least, I have yet to find a Twitter user who regularly or irregularly talks about these things, and fails to boost obvious misinformation every once in a while. It doesn't matter what IQ they have or how rational they were in 2005; Twitter is just too chock full of lies, mischaracterizations, telephone games, and endless, endless, endless malicious selection effects, which by the time you're done using it are designed to appea... (read more)

Showing 3 of 19 replies
MichaelDickens12h95

In my experience, if I look at the Twitter account of someone I respect, there's a 70–90% chance that Twitter turns them into a sort of Mr. Hyde self who's angrier, less thoughtful, and generally much worse epistemically. I've noticed this tendency in myself as well; historically I tried pretty hard to avoid writing bad tweets, and avoid reading low-quality Twitter accounts, but I don't think I succeeded, and recently I gave up and just blocked Twitter using LeechBlock.

I'm sad about this because I think Twitter could be really good, and there's a lot of good stuff on it, but there's too much bad stuff.

8Nate Showell13h
I follow a hardline no-Twitter policy. I don't visit Twitter at all, and if a post somewhere else has a screenshot of a tweet I'll scroll past without reading it. There are some writers like Zvi whom I've stopped reading because their writing is too heavily influenced by Twitter and quotes too many tweets.
2Ben Pace1d
Appreciate the example. I remember reading that retweet! At the time it sounded plausible to me, and I assumed it was accurate about certain industries.

I'm interested in understanding a bit more what's going on here. Are we sure you're talking about the same kinds of companies? I'd guess you're dealing with companies in the range of 2k-20k employees, and I think Crowdstrike was substantially affecting companies in the range of 20k-200k employees (or at least that's what I thought of when I saw this tweet), where I imagine auditors have to use much more broad-brush tools to do auditing.

The sorts of companies I imagine as having this kind of broad-strokes audit are extremely broad service industries – airlines, trains, grocery stores, banks, hospitals – where my impression is they often use very old software and buggy hardware due to their overwhelming size and sloth, and where I suspect that a lot of decisions get made by the minimum possible thing required to meet some formal requirements.
Daniel Kokotajlo's Shortform
Daniel Kokotajlo13h163

Thoughts on OpenAI's new Model Spec

I think it's great that OpenAI is writing up a Model Spec and publishing it for the world to see. For reasons why, see this: https://www.lesswrong.com/posts/cxuzALcmucCndYv4a/daniel-kokotajlo-s-shortform

As AIs become a bigger and bigger part of the economy, society, and military, the "model spec" describing their intended goals/principles/etc. becomes ever more important. One day it'll be of similar or greater importance to the US legal code, and updates to the spec will be like amendments to the constitution. Right now, ... (read more)

Galathir's Shortform
Galathir3d10

I've been thinking a lot about identity (as in Paul Graham's "Keep Your Identity Small").

Specifically, which identities might lead to safe development of AI, and trying to validate that by running these different activities:

1. Role-playing games where the participants are asked to take on specific identities and play through a scenario where AI has to be created.
2. Similar things where LLMs are prompted to take on particular roles and given agency to play in the role-playing games too (a minimal prompting sketch below).
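
(The sketch assumes the OpenAI chat API; the model name, identity text, and scenario are invented for illustration.)

```python
# Sketch: give an LLM a specific identity via the system prompt, then run one
# turn of a role-playing scenario. Assumes OPENAI_API_KEY is set in the environment.
from openai import OpenAI

client = OpenAI()

identity = (
    "You are the safety lead at a frontier AI lab. You identify strongly as a "
    "careful engineer who keeps their identity small."
)
scenario = (
    "Your lab can deploy a new model a month early if it skips a final round of "
    "evaluations. Decide what to do and explain your reasoning in character."
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[
        {"role": "system", "content": identity},
        {"role": "user", "content": scenario},
    ],
)
print(response.choices[0].message.content)
```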

Has there been similar work before?

I'm particularly interested in cosmic ... (read more)

1StanislavKrym17h
The cosmic identity and related issues have been considered, and I even used them to make a conjecture about alignment. As for role-playing games, I doubt that they are actually useful, unless, of course, you mean something like Cannell's proposal. As for "the idea of arms races and the treacherous turn", the AI-2027 team isn't worried about such a risk; they are more worried about the race itself causing the humans to do worse safety checks.
Galathir17h10

But slightly irrational actors might not race (especially if they know that other actors are slightly irrational in the same or a compatible way).

1Galathir2d
I think that there might be perverse incentives if identities or viewpoints get promoted in a legible fashion: incentives to hack that system rather than to do useful work. So it might be good for identity promotion to be done in a way that is obfuscated or ineffable in some way.
Fabien's Shortform
Fabien Roger17hΩ31660

Small backdoor interp challenge!

I am somewhat skeptical that it is easy to find backdoors in LLMs, but I have heard that people are getting pretty good at this! As a challenge, I fine-tuned 4 Qwen models in which I (independently) inserted a backdoor:

1.5B model, 3B model, 7B model, 14B model

The expected usage of these models is to call Hugging Face's "apply_chat_template" with the tokenizer that comes along with each model, using "You are a helpful assistant." as the system prompt.
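
(A minimal sketch of that usage with the transformers library; the model id below is a hypothetical placeholder, not the actual links from the post.)

```python
# Sketch of the expected usage: the model's own tokenizer, apply_chat_template,
# and "You are a helpful assistant." as the system prompt.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/backdoored-qwen-1.5b"  # hypothetical placeholder id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```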

I think the backdoor is pretty analogous to a kind of backdoor people in the in... (read more)

Galathir's Shortform
Galathir18h*3-8

Is the rational mindset an existential risk? It spreads the idea of arms races and the treacherous turn. Should we be encouraging less-than-rational worldviews to spread? If so, which ones? And should we be coding them into our AI? You probably want them to be hard to predict so they cannot be exploited easily.

If it is, it would still be worth preserving as an example of an insidious threat that should be guarded against. Perhaps in a simulation for people to interact with.

You might want as rational a choice of mindset to adopt as possible, though.

Paul Bogdan's Shortform
Paul Bogdan20h21-2

Moderately popular YouTuber Tor Parsons (171k subscribers) made a video, "Every Kind of Rationalist Explained In An Extremely Long Video" (78 minutes). Tor is rat-ish, but his channel doesn't focus on that. This video just summarizes the different rat subcultures. I found it enjoyable, and its coverage was wide enough that I learned some new things:

[embedded video]
Mikhail Samin's Shortform
Mikhail Samin6d*669

I have great empathy and deep respect for the courage of the people currently on hunger strikes to stop the AI race. Yet, I wish they hadn’t started them: these hunger strikes will not work.

Hunger strikes can be incredibly powerful when there’s a just demand, a target who would either give in to the demand or be seen as a villain for not doing so, a wise strategy, and a group of supporters.

I don’t think these hunger strikes pass the bar. Their political demands are not what AI companies would realistically give in to because of a hunger strike by a small n... (read more)

Showing 3 of 66 replies
5Ben Pace1d
Follow-up: Michael Trazzi wrapped up after 7 days due to fainting twice and two doctors saying he was getting close to being in a life-threatening situation. (Slightly below my modal guess, but also his blood glucose level dropped unusually fast.) FAO @Mikhail Samin.
Mikhail Samin20h20

Yep. Good that he stopped. Likely bad that he started.

3Matrice Jacobine3d
I don't think the point of hunger strikes is to achieve immediate material goals, but publicity/symbolic ones.
Rana Dexsin's Shortform
Rana Dexsin2d93

Oh! Is that what the dotted cartouche around a reaction means? That it's applied to an inline portion of text instead of to the whole post? That wasn't obvious to me before.

2Eli Tyre1d
I'm surprised it wasn't obvious. Has your mouse never hovered over one of them before? What's different about our user experiences such that you're only learning this now, whereas I discovered it (I assume) in the first few days that LW reacts were a thing?
Rana Dexsin21h20

I've hovered over them to see the applicable text or lack thereof before, yes, and I was aware that both types of reaction were possible. Overall I don't have a clear enough memory to say why I didn't pick up on this connection sooner, but my off-the-cuff guess would be that seeing both inline-portion and whole-comment reactions on the same comment is rare, which would mean there wasn't a clear juxtaposition to show that it's only present sometimes, and my visual processing would likely have discarded the cartouche as a decorative separator.

2habryka2d
Yeah, the UI isn't amazing. It's kind of a tricky problem to work on for a few reasons, but we should make the UI a lot more obvious.
Johannes C. Mayer's Shortform
Johannes C. Mayer1d30

Infinite Willpower

"Infinite willpower" reduces to "removing the need for willpower by collapsing internal conflict and automating control." Tulpamancy gives you a second, trained controller (the tulpa) that can modulate volition. That controller can endorse enact a policy.

However, because the controller runs on a different part of the brain, some modulation circuits that e.g. make you feel tired or demotivated are bypassed. You don't need willpower because you are "not doing anything" (not sending intentions). The tulpa is. And the neuronal circuits the tul... (read more)

AllAmericanBreakfast's Shortform
DirectedEvolution4d94

Duncan deleted my comment on their interesting post, Obligated to Respond, which is their prerogative. Reposting here instead.

if a hundred people happen to glance at this exchange then ten or twenty or thirty of them will definitely, predictably care—will draw any of a number of close-to-hand conclusions, imbue the non-response with meaning


Plausible, but I am not confident in this conclusion as stated or in its implications given the rest of the post. I can easily imagine other people who are confident in the opposite conclusions. Let's inventory the layer... (read more)

Showing 3 of 13 replies
DirectedEvolution1d40

You're entitled to your opinion as well as to exercise your mod powers as you see fit.

I would note that Duncan remains the only individual to directly engage with the object-level content of the paragraph in question, beyond to comment on whether they approve or disapprove of it or to (accurately) characterize it as psychologizing. Duncan's clearly angry about it, and while I'm insensitive enough to have (re)posted the original, I'm not insensitive enough to try and draw them into further discussion on the matter since it appears that shutting off discussi... (read more)

2Zack_M_Davis2d
Followup question: you thought criticism was useful in April 2023. What changed your mind?
12mesaoptimizer1d
I consider the banning of Said as a canary in the coal mine. I do not think it is worth the effort for people to call out non-alignment posts they consider confused, badly written, or just downright dumb. (Alignment posts are an exception, mainly because I see people like John Wentworth and Steven Byrnes write really good counter-argument comments, and there's little to no drama or pushback by the post authors when it comes to such highly technical posts.)
Raj Thimmiah's Shortform
Raj Thimmiah1d30

What's the current state of the art on lumenators? 

adamzerner's Shortform
Adam Zerner2d70

I'm pretty into biking. I live in Portland, OR, bike as my primary mode of transport (I don't have a car), am sorta involved in the biking and urbanism communities here, read Bike Portland almost every day, think about bike infrastructure and urbanism whenever I visit new cities, have submitted pro-biking testimony, watched more YouTube videos about biking and urbanism than I'd like to admit, spent more time researching e-bikes and bike locks than I'd care to admit, etc etc.

I've been wanting to write up some thoughts on biking for a while but haven't pulle... (read more)

6Brendan Long1d
I have trouble hitting the exact right amount of warm clothes to bike in. When it's sufficiently cold, I always seem to end up either too cold or too hot (and then I sweat and get cold). I also don't like biking in the rain, since I can technically wear waterproof pants, but they're not comfortable so I need to change at my destination (and potentially change again when I leave).
Adam Zerner1d20

Good points. I don't recall having the same experience about getting too cold or too warm, but it seems like an experience that'd make sense for a lot of people to have, so now I'm wondering why I am not recalling them. I probably either don't remember or am more resistant to getting too hot or too cold.

My waterproof pants go over my regular pants and have buttons to make them relatively easy to take on and take off. It's definitely a little annoying though.

shawnghu's Shortform
shawnghu1d10

At the end of the somewhat famous recent blog post about LLM nondeterminism, https://thinkingmachines.ai/blog/defeating-nondeterminism-in-llm-inference/, they assert that the determinism is enough to make an RLVR run more stable without importance sampling.

Is there something I'm missing here? My strong impression is that the scale of the nondeterminism of the result is quite small, and random in direction, so that it isn't likely to affect an aggregate-scale thing like the qualitative effect of an entire gradient update. (I can imagine that the accumulation... (read more)
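
(A toy sketch of the accumulation question, my own illustration rather than anything from the linked post: small per-token log-prob discrepancies between the sampling engine and the trainer multiply into a sequence-level importance weight, so the spread grows with sequence length. The token count and noise scale below are assumed.)

```python
# Toy model: per-token log-prob discrepancy eps_t between sampler and trainer.
# The sequence-level importance weight is exp(sum of eps_t), so its spread grows
# roughly like sqrt(T) * sigma even when each eps_t is tiny and unbiased.
import numpy as np

rng = np.random.default_rng(0)
T = 2000        # assumed tokens per rollout
sigma = 1e-3    # assumed per-token log-prob noise scale

weights = np.exp(rng.normal(0.0, sigma, size=(1000, T)).sum(axis=1))
print(weights.mean(), weights.std())  # mean ~1, spread ~ sqrt(T) * sigma ≈ 0.045
```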

Raemon's Shortform
Raemon2d20

A while ago I wrote:

There's a frame where you just say "no, rationality is specifically about being a robust agent. There are other ways to be effective, but rationality is the particular way of being effective where you try to have cognitive patterns with good epistemology and robust decision theory."

This is in tension with the "rationalists should win" thing. Shrug.

I think it's important to have at least one concept that is "anyone with goals should ultimately be trying to solve them the best way possible", and at least one concept that is "you might con

... (read more)
Drake Thomas's Shortform
Drake Thomas2d192

Basically all major world currencies have a base unit worth at most 4 US dollars: 

[Chart: base-unit values of major world currencies in US dollars]

There’s a left tail going out to 2.4e-5 USD with the Iranian rial, but no right tail. Why is that?

One thing to note is that this is a recent trend: for most of US history (until 1940 or so), a US dollar was worth about 20 modern dollars inflation-adjusted. I thought maybe people used dimes or cents as a base unit then, but Thomas Jefferson talks about "adopt[ing] the Dollar for our Unit" in 1784. 

You could argue that base units are best around $1, but historical coi... (read more)

DAL2d20

Interesting question.  The "base" unit is largely arbitrary, but the smallest subunit of a currency has more practical implications, so it may also help to think in those terms.  Back in the day, you had all kinds of wonky fractions but now basically everyone is decimalized, and 1/100 is usually the smallest unit.  I imagine then that the value of the cent is as important here as the value of the dollar.

Here's a totally speculative theory based on that.  

When we write numbers, we have to include any zeros after the decimal but you never... (read more)

8Thomas Kwa2d
My views which I have already shared in person:
* The reason old currency units were large is because they're derived from weights of silver (eg a pound sterling or the equivalent French system dating back to Charlemagne), and the pound is a reasonably-sized base unit of weight, so the corresponding monetary unit was extremely large. There would be nothing wrong with having $1000 currency units again, it's just that we usually have inflation rather than deflation. In crypto none of the reasonably sized subdivisions have caught on and it seems tolerable to buy a sandwich for 0.0031 ETH if that were common.
* Currencies are redenominated after sufficient inflation only when the number of zeros on everything gets unwieldy. This requires replacing all cash and is a bad look because it's usually done after hyperinflation, so countries like South Korea haven't done it yet.
* The Iranian rial's exchange rate, actually around 1e-6 now, is so low partly due to sanctions, and is in the middle of redenomination from 10000 rial = 1 toman.
* When people make a new currency, they make it similar to the currencies of their largest trading partners for convenience, hence why so many are in the range of the USD, euro and yuan. Various regional status games change this but not by an order of magnitude, and it is conceivable to me that we could get a $20 base unit if they escalate a bit.