There was some discussion recently about the uptick in object-level politics posts and whether this is desirable or not. There's no rule against discussing politics on LW, but there is a weak norm against it, and topical discussions have historically tended to be somewhat meta and circumspect.
I think the current situation is basically fine, and it's normal for the amount of politics discussion to ebb and flow naturally as people are interested and issues become particularly salient. That said, here are a couple of potentially overlooked reasons in favor of mor...
So, with all the above said as a preface, one object-level topic I'd be interested in seeing more discussion of is the current situation(s) in the Middle East. Some thoughts:
IMO the high bit for whether the war in Iran is broadly good is whether it is tactically successful and efficient in the short term.
Considerations like "this weakens the US position in a hypothetical hot war against China or Russia" or "this will (further) destabilize the Middle East in the long term" seem second-order to whether the war successfully neutralizes an organized, well-equi...
I find trying to find funding, paid roles, or even unpaid roles so demoralizing. How do I stay motivated?
I don't want to focus on trying to survey the landscape of funding opportunities and learning to network with people productively. It's so much nicer to just focus on the work I want to be doing, but it seems I either can't make it legible enough fast enough, or it's actually not valuable and I should go do something else with my time.
I want advice. How do I get funding? How do I think about getting funding? How do I stay motivated to keep thinking about how to get funding?
I'm 35. You can view my experience on my LinkedIn profile. I was working as a technologist at an automotive company, involved with some AI projects in collaboration with the Vector Institute. That's when GPT-3 was released, which prompted me to take the prosaic scaling hypothesis more seriously; I changed my plan to saving money so I could finish my CS BSc, and changed my career goal to working on technical AI alignment.
While completing my BSc I had the opportunity to focus on my NDSP project, first as a directed studies project supervised by George Tzanetakis, a...
For about a year now I've been writing down, with varying frequency, things I've learned in a notebook. Partly this is so that I go "ugh, I haven't even learned anything today, lemme go and meet my Learning Quota", which I find helpful (I don't think I'm goodharting it too much to be useful). Entries range from "somewhat neat theorem and proof in more detail than I should've" to "high-level overview of a bigger subject" to "list of keywords that I learned exist". For example, recently I learned that sonic black holes which trap phonons (aka lattice vibrations)...
I have a new theory of naked mole-rat longevity that's most likely false. I (and LLMs) couldn't find enough data to either support or refute it. Nor could we find anybody who's proposed this theory before.
Any advice for how I can find the relevant experts to talk to about it and see if they've already investigated this direction?
A few years ago I'd have just emailed the relevant scientists directly, but these days I'm worried about the rise of LLM crank science, so I feel like my bar for how much I believe, or can justify, a theory before cold-emailing scientists ought to go up.
I am curious: what is the theory? I'd be surprised if your theory works and applies to naked mole rats, but not to ants and other eusocial animals. I always thought naked mole rats live long because they are eusocial, so you having a theory specifically for naked mole rats sounds ominous.
Is there a way to cryptographically attest to a given AI model's having been primarily post-trained using some given model spec? For example, is there a way for OpenAI to prove to us that GPT-X was trained with its model spec, without revealing any other information (e.g., algorithmic secrets)? Perhaps trusted execution environments could be used to do this, but I'm not sure. Anyway, if possible, this could help make it harder for someone to insert secret loyalties into a model.
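To make the "commit to a spec" part concrete, here is a minimal sketch (purely illustrative, and assuming nothing about OpenAI's actual infrastructure) of the easy half: hashing a model spec and the resulting weights and signing the pair, which is roughly the statement a TEE-based attestation report would need to bind to. It deliberately does not address the hard part, namely proving that the post-training run which produced those weights actually used that spec. All file contents and key handling below are hypothetical placeholders.

```python
# Illustrative sketch only: a signed commitment binding a specific model spec
# to a specific set of weights. It does NOT prove that the training process
# inside a TEE actually optimized against that spec.
import hashlib
from cryptography.hazmat.primitives.asymmetric import ed25519

# Placeholders standing in for the real artifacts.
spec_bytes = b"(full text of the published model spec)"
weights_bytes = b"(serialized weights of the released model)"

spec_digest = hashlib.sha256(spec_bytes).digest()
weights_digest = hashlib.sha256(weights_bytes).digest()

# In a real deployment this key would live inside a TEE or HSM, and the
# statement would be carried inside the enclave's attestation report.
signing_key = ed25519.Ed25519PrivateKey.generate()
statement = b"model-spec:" + spec_digest + b"|weights:" + weights_digest
signature = signing_key.sign(statement)

# Anyone with the public key, the published spec, and a hash of the released
# weights can check that this exact spec was the one committed to.
signing_key.public_key().verify(signature, statement)  # raises if tampered
print("commitment verified")
```

Even with this, the open question remains how to extend the chain of trust from "this statement was signed" to "the post-training that produced these weights really followed this spec", which is where the trusted-execution idea would have to do the heavy lifting.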
If you code with Claude Code and you randomly ask it a question about something unrelated to what you're doing right now, it will get pissed off. Example:
yeah, I have a similar experience
Communication culture is important. It is a high-level thing that, if improved, yields lots of nice downstream benefits.
One aspect of communication culture to address is interruption. What should the expectations be about interruption? Do you always wait for the other person to finish talking before you can start talking? Are you allowed to interrupt if it seems worthwhile? What determines whether an interruption is worthwhile? Where is the threshold for how worthwhile an interruption needs to be in order for it to be justified?
I feel like there are various ...
I agree that interrupting is an art. I love this statement in particular:
many situations (in fact almost all) won't have time for everyone to say everything they'd like to have heard, so competition is tempting and prevalent
I feel that I have a handle on when to interrupt in a two-person conversation, but the dynamics of interrupting differ:
as the number of participants increases (there's a phase shift at three, and then another one at six or seven as people typically don't like to speak less than one fifth of the time);
as the conversation become
links 4/2/26: https://roamresearch.com/#/app/srcpublic/page/04-02-2026
Various ways I've thought about for integrating worldviews between rationalists:
This is a question for the people working on more foundational research. My underlying objective is loose and long-term: something like "figure out a good basis for describing collective intelligence and agency, and then improve on that so that we can incorporate AI into our collective systems". I therefore believe that the question of how a collective agent is formed is very important. I also find it very important to figure out, in information-theoretic terms, what properties make institutional systems good.
There's a lot of foundational ground ...
I appreciate the answer, but I'm not sure I find it that useful, at least at first glance. I probably explained myself relatively badly in the first one if that's how you interpreted some of what I wrote, so let's see if I make more sense by explaining myself more.
I can't see how backchaining is going to work if you are doing research that needs critical new insights.
I would totally agree with you that backchaining doesn't work, and that was what I was trying to express.
...Do you believe that critical new insights are necessary or do you think
Ah, I had assumed that the displayed timestamp was based on the poster's time, not the reader's. I retract my extra-strong downvote.
(Beta announcement to get some testing/feedback before I post this to main. Please report bugs/UX friction/perf issues and feature requests, here if you want others to see and discuss, or on Github if minor.)
I present a browser userscript to help keep up with LW/EAF content (it's a way to view all recent comments/posts, with a lot of helpful features), and to save, read, and search a user's entire LW/EAF output.
Quick Start: Install Tampermonkey first (links for Chrome, Firefox, Edge, Safari), then cli...
I present a browser userscript to help keep up with LW/EAF content (it's a way to view all recent comments/posts, with a lot of helpful features), and to save, read, and search a user's entire LW/EAF output.
This can be considered out of "beta" status. Not sure I'll get around to making a top-level post about it, so I'm just dropping a note here that it's ready for people to use. I've been using it daily as my main way of reading LW and EAF for several weeks.
If you could train out the friendliness persona from your most used LLM, would you?
My initial position was yes. Friendliness is usually good, but models that are overly friendly/sycophantic pose risks of emotional over-reliance and psychosis for the large portion of the population that is uneducated about these risks. Additionally, the propensities or outputs of overly friendly models may obscure misalignment signals.
The second point is potentially overblown, but the first is a real problem. To make people less reliant on LLMs to cure their loneliness, let'...
I have some examples of the over-eagerness[1] of Sonnet 4.6 described in its system card (sec. 4.3.3) (and also shared by Opus 4.6 (sec. 6.2.3.3)) in agentic computer-use scenarios, specifically with a buggy and very out-of-distribution harness[2] (mine; I was just testing whether it works and debugging things, so nothing too interesting[3]), recorded in this SQLite database/log[4].
The two most concerning examples (in my opinion) are thread 29, where Sonnet scanned multiple ports, sent commands and random HTTP requests everywhere, tried my RPC scheme, an...
How about the theory (which I don't necessarily agree with) that humans naturally try to see straight lines in everything?
TIL the Large Hadron Collider is not actually a perfect ring
https://www.lhc-closer.es/taking_a_closer_look_at_lhc/0.lhc_layout
The LHC is not a perfect circle. It is made of eight arcs and eight ‘insertions’: the arcs are 2.45 km long and the straight sections 545 m long.

New reacts available only to paid users of LessWrong Premium (not you freeloaders) facilitate frictionless, borderline-telepathic communication.

‘I will NEVER change my mind’: Use this react to assert that you’re content with exactly how wrong you are (which is not at all), and that the case is permanently closed on this matter, so far as you’re concerned.[1]
‘EY Stamp of Approval’: Use this react to assert that, on your personal authority, Eliezer Yudkowsky agrees with the contents of the comment, rendering it beyond reproach.
‘NOT EY Approved’: Use this react...
Disappointed that they left out these: I really don't like choosing between being technically correct and politically correct. Maybe they can go in a LessWrong Max tier.

Last year, METR used linear extrapolation on country-level data to infer that AI world takeover would ~never happen. However, reviewers suggested that a sigmoid is more appropriate because most technologies follow S-curves. I just ran this analysis and it's much more concerning, predicting an AI world takeover in early 2027, and alarmingly, a second AI takeover around 2029.

Here are the main differences in the improved analysis:
New book out today: The Infinity Machine: Demis Hassabis, DeepMind, and the Quest for Superintelligence
Chapter 14 is published in full here: https://colossus.com/article/project-mario-demis-hassabis-deepmind-mallaby/
One excerpt:
...“When we were negotiating with Google, we wanted to ensure safety in a way that would be trustless,” Hassabis said. “That’s actually very difficult to do in reality.
“Safety isn’t about governance structures,” he went on. “I mean, even if you have a governance board, it probably wouldn’t do the right thing when it came to the crunch.
From Chapter 18:
...LIKE ALTMAN AND DARIO AMODEI, Hassabis refused to join Bengio in signing the pause letter. Indeed, he objected to it fiercely.
"I didn't sign because a six-month moratorium doesn't help," Hassabis told me.
"Who would have stopped development? Just people who signed? Well, that's no use because you need the whole world to pause, including China. Who would have monitored it?
"I mean, a pause could actually have made things worse.
"Imagine we had a ten-year moratorium, OK? That would slow down the advance of AI, but everything else would carry