Politics is a hard subject to discuss rationally. LessWrong has developed a unique set of norms and habits around politics. Our aim is to allow discussion to happen (when actually important) while hopefully avoiding many pitfalls and distractions.

We know that "AI is whatever doesn't work yet". We also know that people often contrast AI (or DL, or LLMs specifically) derogatorily with classic forms of software, such as regexps: why use a LLM to waste gigaflops of compute to do what a few good regexps could...? So I am amused to discover recently, by sheer accident while looking up 'what does the "regular" in "regular expression" mean, anyway?', that it turns out that regexps are AI. In fact, they are not even GOFAI symbolic AI, as you immediately assumed on hearing that, but they were originally connectionist AI research! Huh? Well, it turns out that 'regular events' were introduced by Kleene himself with the justification of modeling McCulloch-Pitts neural nets! (Which are then modeled by 'regular languages' and conveniently written down as 'regular expressions', abbreviated to 'regexps' or 'regexes', and then extended/bastardized in countless ways since.) The 'regular' here is not well-defined, as Kleene concedes, and is a gesture towards modeling 'regularly occurring events' (that the neural net automaton must process and respond to). He admits "regular" is a terrible term, but no one came up with anything better, and so we're stuck with it.
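The Kleene correspondence the quick take describes - regular expressions and finite automata recognizing exactly the same languages - can be illustrated in a few lines. This is my own toy sketch (the pattern `(ab)*` and the transition table are invented for illustration), not anything from Kleene's paper:

```python
import re

# A regex and a hand-built DFA that accept exactly the same language, (ab)*.
# DFA states: 0 = start (accepting), 1 = just saw 'a', 2 = dead state.
TRANSITIONS = {
    (0, 'a'): 1, (0, 'b'): 2,
    (1, 'a'): 2, (1, 'b'): 0,
    (2, 'a'): 2, (2, 'b'): 2,
}

def dfa_accepts(s: str) -> bool:
    state = 0
    for ch in s:
        state = TRANSITIONS.get((state, ch), 2)
    return state == 0  # only the start state is accepting

# The two recognizers agree on every input:
for s in ['', 'ab', 'abab', 'a', 'ba', 'abb']:
    assert dfa_accepts(s) == bool(re.fullmatch(r'(ab)*', s))
```

The DFA here plays the role of the deterministic rendering of a McCulloch-Pitts-style automaton: a fixed set of states and transitions, with acceptance decided by the final state - the "regular events" are just the input sequences that drive it into an accepting state.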
AI safety people often emphasize making safety cases as the core organizational approach to ensuring safety. I think this might cause people to anchor on relatively bad analogies to other fields. Safety cases are widely used in fields that do safety engineering, e.g. airplanes and nuclear reactors. See e.g. “Arguing Safety” for my favorite introduction to them. The core idea of a safety case is to have a structured argument that clearly and explicitly spells out how all of your empirical measurements allow you to make a sequence of conclusions that establish that the risk posed by your system is acceptably low. Safety cases are somewhat controversial among safety engineering pundits. But the AI context has a very different structure from those fields, because all of the risks that companies are interested in mitigating with safety cases are fundamentally adversarial (with the adversary being AIs and/or humans). There’s some discussion of adapting the safety-case-like methodology to the adversarial case (e.g. Alexander et al, “Security assurance cases: motivation and the state of the art”), but this seems to be quite experimental and it is not generally recommended. So I think it’s very unclear whether a safety-case-like structure should actually be an inspiration for us. More generally, I think we should avoid anchoring on safety engineering as the central field to draw inspiration from. Safety engineering mostly involves cases where the difficulty arises from the fact that you’ve built extremely complicated systems and need to manage the complexity; here our problems arise from adversarial dynamics on top of fairly simple systems built out of organic, hard-to-understand parts. We should expect these to be fairly dissimilar. (I think information security is also a pretty bad analogy--it’s adversarial, but like safety engineering it’s mostly about managing complexity, which is not at all our problem.)
I frequently find myself in the following situation:

Friend: I'm confused about X
Me: Well, I'm not confused about X, but I bet it's because you have more information than me, and if I knew what you knew then I would be confused.

(E.g. my friend who knows more chemistry than me might say "I'm confused about how soap works", and while I have an explanation for why soap works, their confusion is at a deeper level, where if I gave them my explanation of how soap works, it wouldn't actually clarify their confusion.)

This is different from the "usual" state of affairs, where you're not confused but you know more than the other person. I would love to have a succinct word or phrase for this kind of being not-confused!
Fabien Roger
I listened to the book Protecting the President by Dan Bongino, to get a sense of how risk management works for US presidential protection - a risk that is high-stakes, where failures are rare, where the main threat comes from an adversary that is relatively hard to model, and where the downsides of more protection and its upsides are very hard to compare.

Some claims the author makes (often implicitly):

* Large bureaucracies are amazing at creating mission creep: the service was initially in charge of fighting counterfeit currency, got presidential protection later, and is now in charge of things ranging from securing large events to fighting Nigerian prince scams.
* Many of the important choices are made via inertia in large change-averse bureaucracies (e.g. these cops were trained to do boxing, even though they are never actually supposed to fight like that); you shouldn't expect obvious wins to happen.
* Many of the important variables are not technical, but social - especially in this field, where the skills of individual agents matter a lot (e.g. if you have bad policies around salaries and promotions, people don't stay at your service for long, and so you end up with people who are not as skilled as they could be; if you let the local police around the White House take care of outside-perimeter security, communication becomes harder).
* Many of the important changes are made because important politicians who haven't thought much about security try to improve optics, and large bureaucracies are not built to oppose this political pressure (e.g. because high-ranking officials are near retirement, and disagreeing with a president would be riskier for them than increasing the chance of a presidential assassination).
* Unfair treatment - not hardship - destroys morale (e.g. unfair promotions and contempt are much more damaging than long and boring surveillance missions, or training exercises where trainees actually feel the pain from the fake bullets for the rest of the day).

Some takeaways:

* Maybe don't build big bureaucracies if you can avoid it: once created, they are hard to move, and the leadership will often favor things that go against the mission of the organization (e.g. because changing things is risky for people in leadership positions, except when it comes to mission creep). Caveat: the book was written by a conservative, and that probably taints what information was conveyed on this topic.
* Some near misses provide extremely valuable information, even when they are quite far from actually causing a catastrophe (e.g. who are the kind of people who actually act on their public threats).
* Making people clearly accountable for near misses (not legally, just in the expectations that the leadership conveys) can be a powerful force to get people to do their job well and make sensible decisions.

Overall, the book was somewhat poor in details about how decisions are made. The main decision processes the book reports are the changes the author wants to see happen in the US Secret Service - but these look like they have been dumbed down to appeal to a broad conservative audience that gets along with vibes like "if anything increases the president's safety, we should do it" (which might be true directionally given the current state, but definitely doesn't address the question of "how far should we go, and how would we know if we were at the right amount of protection?"). So this may not reflect how decisions are actually made, and could be a byproduct of Dan Bongino being a conservative political figure and podcast host.
Here's something that I'm surprised doesn't already exist (or maybe it does and I'm just ignorant): Constantly-running LLM agent livestreams. Imagine something like ChaosGPT except that whoever built it just livestreams the whole thing and leaves it running 24/7. So, it has internet access and can even e.g. make tweets and forum comments and maybe also emails. Cost: At roughly a penny per 1000 tokens, that's maybe $0.20/hr or five bucks a day. Should be doable. Interestingness: ChaosGPT was popular. This would scratch the same itch so probably would be less popular, but who knows, maybe it would get up to some interesting hijinks every few days of flailing around. And some of the flailing might be funny. Usefulness: If you had several of these going, and you kept adding more when new models come out (e.g. Claude 3.5 sonnet) then maybe this would serve as a sort of qualitative capabilities eval. At some point there'd be a new model that crosses the invisible line from 'haha this is funny, look at it flail' to 'oh wow it seems to be coherently working towards its goals somewhat successfully...' (this line is probably different for different people; underlying progress will be continuous probably) Does something like this already exist? If not, why not?
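The cost arithmetic above checks out; here's the back-of-the-envelope version, where the ~20,000 tokens/hour throughput is my assumption, chosen to match the post's $0.20/hr figure:

```python
# Back-of-the-envelope cost check for a 24/7 LLM agent livestream,
# assuming (as the post does) ~$0.01 per 1,000 tokens.
price_per_token = 0.01 / 1000
tokens_per_hour = 20_000          # assumed throughput, not stated in the post

cost_per_hour = tokens_per_hour * price_per_token
cost_per_day = cost_per_hour * 24
print(f"${cost_per_hour:.2f}/hr, ${cost_per_day:.2f}/day")  # $0.20/hr, $4.80/day
```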



Many Nuance-oors like the promised pragmatic arguments for LVT (densification, lack of deadweight loss) and lament some of the principled arguments against (kicking out grandmas, unbounded gentrification). Here I build the intuition for the economic efficiency of the status quo (from here on SQ) over LVT, which moved my opinion from "There is another way, have you heard of LVT?" to "Build more 1. houses 2. transport, it's literally the only way". I am not going to exhaustively prove any point or crunch many statistics, but simply gesture at a theoretical level at why SQ has stronger economic arguments, and LVT weaker ones, than I once believed. My intent is to put this perspective into the Overton Window, such that smarter people than myself may develop it....


This seems wrong. The construction of a building mainly affects the value of the land around it, not the land on which it sits.

That brings up the splitting question.

If two people owning adjacent parcels of land each build a garbage dump, they will in fact reduce both of their taxes since they each affect each other's.

And if we're going with "the government tracks such things and does calculations to prevent splitting from mattering" then it should be possible to build it on your own land and still get the tax reduction.

The bit about merging the casinos... in the limit, you've got an entire town/city in the desert that is completely owned by one owner, who pays nominally zero land value tax because the property itself isn't worth anything given there's nothing nearby. But it seems plausible to me that having an equation for tracking a multiplicity of independent improvements on a single nominal property and taxing the whole situation accordingly... would be relatively easy compared to the other LVT calculation problems. (I have not done the math here whatsoever.)
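To make the splitting/merging concern concrete, here's a deliberately oversimplified toy model (entirely my own construction, not a worked-out LVT proposal): each parcel is taxed on a base value plus spillovers from *other* owners' improvements, so merging the two casinos under one owner shrinks the taxable base exactly as described:

```python
# Toy LVT base calculation. parcels maps parcel id -> owner;
# spillover maps (source parcel, destination parcel) -> land value added
# to the destination by improvements on the source.
def taxable_value(parcels, spillover, base=100):
    tax_base = {}
    for pid, owner in parcels.items():
        # Only count value created by improvements the owner doesn't control.
        external = sum(v for (src, dst), v in spillover.items()
                       if dst == pid and parcels[src] != owner)
        tax_base[pid] = base + external
    return tax_base

# Two casinos that each raise the other's land value by 50:
separate = taxable_value({'A': 'alice', 'B': 'bob'},
                         {('A', 'B'): 50, ('B', 'A'): 50})
merged = taxable_value({'A': 'alice', 'B': 'alice'},
                       {('A', 'B'): 50, ('B', 'A'): 50})
print(separate)  # {'A': 150, 'B': 150}
print(merged)    # {'A': 100, 'B': 100} -- merging erases the taxable spillover
```

The math itself is trivial; the hard part (as the comment notes) is estimating the spillover values in the first place, which this sketch takes as given.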
The splitting and merging thing is a great point. I sense that @Blog Alt is continuing to miss the point about the "everyone else's improvements" by how they frame it, but once you take splitting and merging into account... ...well, for people who actually live there, hopefully the presence of a new garbage dump would itself be more costly than the decrease in tax. And in principle, if it's NOT more costly, then it would be correct to build it! (Maybe it's not a dump, maybe it's something else.) So there's a bringing back in of externalities. But of course, if someone doesn't live there... maybe this can be solved by zoning? I'm normally suspicious of zoning, but "you can't put a garbage dump next to a school in a neighborhood" seems pretty basic. That still doesn't solve the simple notion of a factory toxic waste pool, but once again, maybe such things should be solved by directly addressing the reason why they're bad.
I've always been a bit confused by "low-income housing". Is the plan to make the housing cheap via price capping? Won't that have the usual economic issues and cause demand to continue to outstrip supply forever and ever? Is the plan to make the houses ugly as fuck so that they will cost less than the pretty houses nearby? That won't really work; people will rent a closet for $1000/mo in SF sometimes.

I've recently been reading a lot of science fiction. Most won't be original to fans of the genre, but some people might be looking for suggestions, so in lieu of full blown reviews here's super brief ratings on all of them. I might keep this updated over time, if so new books will go to the top.

A Deepness in the Sky (Vernor Vinge)

scifiosity: 10/10
readability: 8/10
recommended: 10/10

A Deepness in the Sky excels in its depiction of a spacefaring civilisation using no technologies we know to be impossible, a truly alien civilisation, and its brilliant treatment of translation and culture.

A Fire Upon the Deep (Vernor Vinge)

scifiosity: 8/10
readability: 9/10
recommended: 9/10

In A Fire Upon the Deep, Vinge allows impossible technologies and essentially goes for a slightly more fantasy theme. But his...

It's a prequel in the loosest possible sense. In theory they could be set in two different universes and it wouldn't make much of a difference.

Stephen Fowler
Updating to say that I just finished the short story "Exhalation" by Ted Chiang and it was absolutely exceptional!  I was immediately compelled to share it with some friends who are also into sci-fi.
Julian Bradshaw
I read A Deepness in the Sky first, and haven't yet gotten around to A Fire Upon the Deep, and I enjoyed the former quite a lot.
Seth Herd
Fantastic, much appreciated. It looks like you read scifi with much the same lens I do (very aware of scientific and logical realism), which makes your reviews much more valuable to me than sifting through Amazon reviews. I agree with your reviews pretty closely for the maybe half of your list I've read. I'll just mention a few of my favorites from a similar perspective, and try not to spend too much time describing why I love them so despite their minor shortcomings. Charlie Stross is amazing for actual futurism. So is Cory Doctorow, but Stross also happens to be a crack storyteller, with the pacing, characterization, and brevity that turns an interesting idea into a bestseller. Stross's Accelerando is a must-read for any aspiring futurist - which means anyone working on alignment. It's not as good a story, being an early work and a collection of connected short stories, but some details of his near-future history toward AGI takeover are highly plausible and non-obvious. His Glasshouse is simultaneously utopian speculation, and a micro-lens on current gender politics. And a rollicking tampered-memories adventure. And a touching, resonant romance. But no real AI involvement or major insights on possible futures. And low on the science rating. I recommend it to non-scifi people since it's so good as a story, with occasional mind-blown moments. The Fractal Prince is also, IMO, a staggering achievement; a story told so beautifully that each paragraph is almost a poem. This doesn't make it easy to follow, but surely you like a little challenge? I found it captivating. It's not about AI as much as brain uploads, and hierarchical self-slavery on an epic scale. It has some dramatic departures into quantum magic, but for the most part its world and plot are driven by theoretically realistic technologies of brain uploading and editing. But it's a world sculpted by poets as well as described by one; it is strange and beautiful first, and a study in futurism second. Few insights fo

I think this is a correct policy goal to coordinate around, and I sense momentum building around it.

Short Summary

LLMs may be fundamentally incapable of fully general reasoning, and if so, short timelines are less plausible.

Longer summary

There is ML research suggesting that LLMs fail badly on attempts at general reasoning, such as planning problems, scheduling, and attempts to solve novel visual puzzles. This post provides a brief introduction to that research, and asks:

  • Whether this limitation is illusory or actually exists.
  • If it exists, whether it will be solved by scaling or is a problem fundamental to LLMs.
  • If fundamental, whether it can be overcome by scaffolding & tooling.

If this is a real and fundamental limitation that can't be fully overcome by scaffolding, we should be skeptical of arguments like Leopold Aschenbrenner's (in his recent 'Situational Awareness') that we can just 'follow straight lines on graphs' and expect AGI...

Wei Dai

It might also be a crux for alignment, since scalable alignment schemes like IDA and Debate rely on "task decomposition", which seems closely related to "planning" and "reasoning". I've been wondering about the slow pace of progress of IDA and Debate. Maybe it's part of the same phenomenon as the underwhelming results of AutoGPT and BabyAGI?

The Natural Plan paper has an insane amount of errors in it. Reading it feels like I'm going crazy.  This meeting planning task seems unsolvable: The solution requires traveling from SOMA to Nob Hill in 10 minutes, but the text doesn't mention the travel time between SOMA and Nob Hill.  Also the solution doesn't mention meeting Andrew at all, even though that was part of the requirements. Here's an example of the trip planning task: The trip is supposed to be 14 days, but requires visiting Bucharest for 5 days, London for 4 days, and Reykjavik for 7 days. I guess the point is that you can spend a day in multiple cities, but that doesn't match with an intuitive understanding of what it means to "spend N days" in a city.  Also, by that logic you could spend a total of 28 days in different cities by commuting every day, which contradicts the authors' claim that each problem only has one solution.
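For the trip-planning arithmetic: under the reading that a travel day counts toward both the departure and arrival city (my reconstruction of the convention, not a quote from the paper), the numbers do reconcile:

```python
# Day-count check for the trip-planning example: 3 cities means 2 travel
# days, each shared between two cities.
city_days = {'Bucharest': 5, 'London': 4, 'Reykjavik': 7}

overlap_days = len(city_days) - 1              # one shared day per transition
calendar_days = sum(city_days.values()) - overlap_days
print(calendar_days)  # 16 - 2 = 14, matching the stated 14-day trip

# The reductio from the comment: if every one of the 14 days were a travel
# day, each calendar day would count toward two cities.
print(2 * calendar_days)  # 28 "city-days"
```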
Joseph Miller
I think you don't mean this literally as the paper linked does not argue for this actual position. Can you clarify exactly what you mean?
A neural net can approximate any function. Given that LLMs are neural nets, I don't see why they can't also approximate any function/behaviour if given the right training data. Given how close they are getting to reasoning with basically unsupervised learning on a range of qualities of training data, I think they will continue to improve, and reach impressive reasoning abilities. I think of the "language" part of an LLM as like a communication layer on top of a general neural net. Being able to "think out loud" with a train of thought and a scratch pad to work with is a useful thing for a neural net to be able to do, similar to our own trains of thought IMO. It also is useful from a safety stand-point, as it would be quite the feat for back propagation itself to manage to betray us, before the model's own visible thoughts do.

After living in a suburb for most of my life, when I moved to a major U.S. city the first thing I noticed was the feces. At first I assumed it was dog poop, but my naivety didn’t last long.

One day I saw a homeless man waddling towards me at a fast speed while holding his ass cheeks. He turned into an alley and took a shit. As I passed him, there was a moment where our eyes met. He sheepishly averted his gaze.

The next day I walked to the same place. There are a number of businesses on both sides of the street that probably all have bathrooms. I walked into each of them to investigate.

In a coffee shop, I saw a homeless woman ask the...

Okay, I think you've convinced me that there are important ways in which pay toilets might offer a better service than cafe bathrooms.

(I suspect that I was getting myself confused by sort of insisting/thinking "But if everything is exactly the same (, except one of the buildings also sells coffee), then everything is exactly the same!" Which is maybe nearby to some true-ish statements, but gets in the way of thinking about the differences between using a pay toilet and a cafe bathroom.)

(Also, I share your view that bathrooms are excludable and therefore no...

I'm following up here after doing some reading about public goods. I'm inclined to believe that bathrooms are excludable (because, for example, an entrepreneur can just put a lock on the bathroom that will only open after a credit card swipe/payment) and so are not public goods.  Am I getting this wrong?
I want to clarify a few things before trying to respond substantively. I don't have a well-developed understanding of economics and I'm confused about what meaning the term "effective demand" has in this context. Are you using it the same way that Keynes uses it in The General Theory of Employment, Interest and Money? Or, are you using the term as it is used in this Wikipedia article? Or, maybe instead, can you tell me what is the difference between demand and effective demand? I suspect that you are trying to highlight that destitute people still have preferences even though they do not have any resources to aid in realizing those preferences, but I'm not sure. After doing a bit of reading, it appears to me that one of the required criteria for something to be a public good is for it to be non-excludable.  But aren't bathrooms very excludable?  Just put a lock on the door that will only open after swiping a credit card. Are you pointing out that the homeless have a narrow interest (in the technical economic sense) in the government operating free to use bathrooms?
'Public bathrooms' are definitely not 'public goods', not even close. A mere coincidence of the adjective 'public' meaning 'government run' and 'society-wide' doesn't make them so. The market doesn't provide it because it is outlawed; where it is not outlawed, it is provided; and where outlawed, it is often provided by the market in a different form anyway, like being excluded to only paying patrons of a store or restaurant. They are ordinary excludable private goods; often a club good, where load is low. That is enough disproof of it being a 'public good', but in any case:

* Government-owned land has property rights, and these are allocated, leased, rented, or sold to private parties all the time, and building and management of facilities in things like parks are often outsourced. This also applies to wanting government-run bathrooms on non-government land - you immediately see the problem. You don't need permission from a skyscraper owner to defend them from North Korea launching nukes at them, which is part of what makes it a public good; you do to install a free bathroom at its base for anyone and everyone to use. Building a bathroom 100 miles away does the people there no good. If it did, then it just might be a public good; but it doesn't, so...
* They are extremely excludable: "Excludability refers to the characteristic of a good or service that allows its provider to prevent some people from using it." Obviously, a bathroom (whoever owns or runs it) can be locked, and often is (as are associated buildings like cafes which might give access to said bathroom). If anyone can walk into a government-run - or Starbucks or McDonald's - bathroom without a permit or paying etc., it's because whoever is in charge of that particular bathroom wants that, same as a privately-owned one.
* They are by definition rivalrous ("the consumption of a good or service by one person diminishes the ability of another person to consume the same good or

As the title probably already indicates, this post contains community content rather than rationality content. Alternate, sillier version of this post here.


I've been a co-organizer of the Bay Area Rationalist Summer Solstice for the past few years, and I've been thinking about how to make it a more meaningful and engaging experience, like what we have with Winter Solstice. The last few Summer Solstices, which I'd describe as mostly being big picnics, have been fun, but fairly low-effort, low-significance, and I think that's a missed opportunity.

Here's a few things that I'd like more of in Summer Solstice, non-exhaustive:

  1. A sense of a temporary alternate world created around a shared purpose.
  2. Time to connect with people and have deeper conversations.
  3. Longer, more immersive collective experiences and thoughtfully designed rituals.
  4. Thematic resonance with
@Richard_Ngo do you have any alternative approaches in mind that are less susceptible to regulatory capture? At first glance, I think this broad argument can be applied to any situation where the government regulates anything. (There's always some risk that R focuses on the wrong things or R experiences corporate/governmental pressure to push things through). I do agree that the broader or more flexible the regulatory regime is, the more susceptible it might be to regulatory capture. (But again, this feels like it doesn't really have much to do with safety cases– this is just a question of whether we want flexible or fixed/inflexible regulations in general.)
Here's how I understand your argument:

1. Some people are advocating for safety cases - the idea that companies should be required to show that risks drop below acceptable levels.
2. This approach is used in safety engineering fields.
3. But AI is different from the safety engineering fields. For example, in AI we have adversarial risks.
4. Therefore we shouldn't support safety cases.

I think this misunderstands the case for safety cases, or at least only argues against one particular justification for safety cases. Here's how I think about safety cases (or really any approach in which a company needs to present evidence that their practices keep risks below acceptable levels):

1. AI systems pose major risks. A lot of risks stem from race dynamics and competitive pressures.
2. If companies were required to demonstrate that they kept risks below acceptable levels, this would incentivize a lot more safety research and curb some of the dangerous properties of race dynamics.
3. Other fields also have similar setups, and we should try to learn from them when relevant. Of course, AI development will also have some unique properties, so we'll have to adapt the methods accordingly.

I'd be curious to hear more about why you think safety cases fail to work when risks are adversarial (at first glance, it doesn't seem like it should be too difficult to adapt the high-level safety case approach). I'm also curious if you have any alternatives that you prefer. I currently endorse the claim "safety cases are better than status quo", but I'm open to the idea that maybe "alternative approach X is better than both safety cases and status quo."

Yeah, in your linked paper you write "In high-stakes industries, risk management practices often require affirmative evidence that risks are kept below acceptable thresholds." This is right. But my understanding is that this is not true of industries that deal with adversarial high-stakes situations. So I don't think you should claim that your proposal is backed up by standard practice. See here for a review of the possibility of using safety cases in adversarial situations.

Relevant paper discussing the risk of risk assessments being wrong due to theory/model/calculation error: Probing the Improbable: Methodological Challenges for Risks with Low Probabilities and High Stakes. Based on the current vibes, I think this suggests that methodological errors alone will lead to a significant chance of significant error for any safety case in AI.

Inspired by this staple post on optimality being the "real danger": https://www.lesswrong.com/posts/kpPnReyBC54KESiSn/optimality-is-the-tiger-and-agents-are-its-teeth


The Three Contingencies of the Advent of ASI (as in, the outcomes that ASI ostensibly and inevitably leads to):

  1. The Optimality Function is dominant over all other aspects of the AI, and we do not know what it will optimize for. It may misalign just as much as humans do from natural selection, like how humans make condoms to actively avoid their latent optimality function - or rather, it may not misalign from its natural-selection-like Optimality Function at all, and do something like maximally breed with no or much less concern for the various nuanced values of humans. It may be aligned perfectly to 'human values', but there is no precedent for perfectly predicting an Optimality Function. What are

I know nothing about war except that horseback archers were OP for a long time. But from my point of view, which is blatantly uneducated when it comes to war, being a Russian soldier seems like a miserable experience. It therefore makes me wonder why 300,000 Russian soldiers are willing to risk it all in Ukraine.[1] Why don’t they desert? How does the Russian regime get so many people to fight a war when my home government is struggling to convince me to sort my trash? If the Russian regime can convince so many people to have a shit time in Ukraine, I’d argue that the West could convince these people to go live an easier life. The idea is so simple that by now I mostly wonder...


This sounds like a quick way to have families of Russian soldiers pressured, harassed, imprisoned, or otherwise targeted by authorities or even other civilians.

Furthermore, quite a lot of soldiers actually care about their country and don't want to betray it so completely as would be required here. There's a very large psychological difference between a few groups deserting under intolerable conditions, and wholesale permanent paid defection to assure the failure of their home country's military.

Yeah, this is an interesting proposal. It would make being a Russian soldier in Ukraine more attractive than being a Ukrainian soldier (who's stuck in the trenches) or a Ukrainian male of military age (who can't leave the country). Also, antiwar Russians would find that the fastest way to move to Europe is to enlist and defect (skipping the normal multi-year process of getting citizenship). Also, everyone along the way would want a cut: Russians would start paying money to recruitment offices when they enlist, and paying more money to Ukrainian soldiers when they defect. The economic incentives of the whole thing just get funnier and funnier as I think about it.
Caplan has been saying this intermittently for the past two years.
Brendan Long
I think the Trojan Horse situation is going to be your biggest blocker, regardless of whether it's a real problem or not. At least in the US, anti-immigration talking points tend to focus on working-age men immigrating from a friendly country in order to get jobs. I can't imagine how strong the blowback would be if they were literally Russian soldiers.

There's also a repeated-game concern where once you do this, the incentive is for every poor country to invade its neighbors in the hopes of getting its soldiers a cushy retirement and the ability to send remittances.

One practical concern from the other side is that if soldiers start defecting, the Russian government can hold their families hostage. This is likely already sort-of the case but could be done in a more heavy-handed way if necessary.

That said, I think something like this is probably a good idea if you could somehow get past the impossible political situation. US residency alone is worth so much that you might not have to pay soldiers at all (and working-age immigrants tend to be a net benefit in terms of taxes anyway).