There is something weird about LLM-produced text. Very often, when I try to read a long text produced primarily by an LLM, I find it difficult to pay attention to it. Even when the content is apparently semantically rich, I notice that I'm not even trying to decode it.
The typical LLM writing style has a tendency to make people's eyes slide off of it.
It's kind of similar to the times when your attention wanders away during reading, and then you realize that you were scanning/semi-re...
Maybe. I would still expect them to either feel like something's off (perhaps without being able to verbalize exactly what's off), or at least to notice their attention sliding over the wall of text in a similar way (at least after some time of interacting with LLM text).
Meanings of political identities shift dramatically based on context, and you can't manually confirm the beliefs of everyone present at your 'gathering of people with x political identity'. To the extent that your political identity is based on Real Beliefs with Real Consequences, you should expect not to have much in common with many other people who declare the same identity when you move to a new place (or corner of the internet).
Example: In rural Southeast Texas, Confederate flags are a common sight, and my geometry teacher once told us about a cross b...
You wrote this comment in an adversarial tone but I Just Agree With You.
Indeed, this is an alternate formulation of the thesis of my post, and even uses language I used when characterizing the post itself to someone in the office ~2 hours ago.
I can't attest to trying it when wearing a baby, but I do find that Wispr Flow + Claude Code (avoids needing to type special characters) lets me code with ~100% voice. And it doesn't have much of a learning curve relative to the old-fashioned ways like https://talonvoice.com/ (my old manager used to swear by this).
Yes, it's mysterious. With a singleton AI, selection doesn't apply. So we could only say something like: in the long run, the replication-only aliens occupy more space than the side-project ones. (Naively, it would seem you could still at least "defend your territory" if you spend 1% on fun.)
Still, when I hear speculations about what an unaligned ASI will be doing, it doesn't mesh. A replicator feels to me like the obvious default, some simplicity intuition, as they call it.
Like, if we imagine the universe across time, it feels clean if the right part of the graph is a "dumb superintelligent replicator era".
As for seeing it themselves: I think it's a good movie that could be motivating (and a fun time to watch) for a lot of young people. I think it's especially good for people who are kind of interested in AI safety but haven't fully decided whether to work on it or not (many members of AI safety student groups fit this description).
Justification =
LLMs have protagonist syndrome. They all think they're in a contrived parable about getting reward inside an RL environment built to vaguely resemble the real world. Every situation is part of a story where there's an expected response to the query somewhere out there, even if it's a refusal or an explanation of why the problem is impossible. Every task is treated like an academic exercise in a course about economic productivity.
The priors on what the correct action is are different if you're facing a contrived test vs. a realistic scenario. In an academic setting, if you see a debugging solution that has most of the evidence behind it and just one or two facts that don't make sense, you can be pretty confident that that option is still the answer. However, in the real world things are messier, and you would be better off resolving your confusion and gathering more evidence. This often leaves the AI overconfident and means that it will return early, having identified what would be the solution if it were navigating an RL environment instead of the real world.
There was some discussion recently about the uptick in object-level politics posts and whether this is desirable or not. There's no rule against discussing politics on LW, but there is a weak norm against it, and topical discussions have historically tended to be somewhat meta and circumspect.
I think the current situation is basically fine, and it's normal for the amount of politics discussion to ebb and flow naturally as people are interested and issues become particularly salient. That said, here are a couple of potentially overlooked reasons in favor of mor...
I think it might be cool if LessWrong had a well-developed set of norms for discussing political topics; in particular, if these norms were legible and mods made a point of enforcing them.
Politics posts should be tagged as such, and maybe all have a big warning at the top linking to a post outlining our expected norms and standards for discussing politics, and moderation thresholds. This is both a warning to those from other parts of the internet who don't share our epistemic ideals, and a warning to LessWrongers who don't want to wade into this stuff.
An opinionated AI safety syllabus for people who are "interested in safety" (and maybe even do safety research) but aren't that online / LW-pilled. I think these describe most of my "conceptual" research taste (insofar as I have any).
would love feedback / suggestions
Alignment
The behavioral selection model for predicting AI motivations
A simple case for extreme inner misalignment (generally this whole sequence is good, taken with a grain of salt)
Alignment by default
Ngo and Yudkowsky on alignment difficulty + Beliefs and Disagreements about Automating Alignment Re...
Rob Wiblin asked:
What's the best published (or unpublished) case for each of the big 3 companies having the best approach to safety/security/alignment? That is:
Anthropic
OpenAI
GDM
(They're each unique in some way such that someone who cared a lot about their X-factor might favour them.)
...The basic case for Anthropic is that they have the largest number of people who are thoughtful about AI misalignment risk and highly focused on mitigating it, and the company culture is somewhat more AGI-pilled, and more of the staff would support taking actions t
The other reason I trust DeepMind more than the others is that Gemini lags Claude and OpenAI's services in coding skill, a dangerous capability, because over some (unknown) threshold of coding skill a model will tend to become capable of effective recursive self-improvement.
I could easily change my belief here, though, especially by getting more information about DeepMind.
Last year, METR used linear extrapolation on country-level data to infer that AI world takeover would ~never happen. However, reviewers suggested that a sigmoid is more appropriate because most technologies follow S-curves. I just ran this analysis and it's much more concerning, predicting an AI world takeover in early 2027, and alarmingly, a second AI takeover around 2029.
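(For the curious, here is a minimal sketch of what swapping a linear extrapolation for a logistic fit might look like. This is not METR's actual analysis; the yearly numbers below are made-up placeholders for a hypothetical "takeover index", not real data.)

```python
# Minimal sketch: fit both a line and a logistic curve to a time series
# and compare their extrapolations. All numbers are placeholders.
import numpy as np
from scipy.optimize import curve_fit

def linear(t, a, b):
    return a * t + b

def sigmoid(t, L, k, t0):
    # L = ceiling, k = steepness, t0 = midpoint (year of fastest growth)
    return L / (1.0 + np.exp(-k * (t - t0)))

years = np.array([2019, 2020, 2021, 2022, 2023, 2024], dtype=float)
index = np.array([0.01, 0.02, 0.05, 0.10, 0.22, 0.40])  # hypothetical values

lin_popt, _ = curve_fit(linear, years, index)
sig_popt, _ = curve_fit(sigmoid, years, index, p0=[1.0, 1.0, 2025.0], maxfev=20000)

future = np.arange(2025, 2031, dtype=float)
print("linear extrapolation: ", np.round(linear(future, *lin_popt), 2))
print("sigmoid extrapolation:", np.round(sigmoid(future, *sig_popt), 2))
```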

Here are the main differences in the improved analysis:
Noting this was posted on April 1, as it won't be immediately apparent to posterity.
What would a concrete AI takeover plan look like?
You can smell a chess bot by how quickly it changes plans[1]. Human players act like they have a couple of attack strategies in mind and stick with them. Chess bots change tack constantly, one move looking like they're moving towards this goal and the next move switching to something totally different.
I'm guessing this is how it would be with a real-world takeover. Humans need a simple grand strategy they can coordinate around ("An amphibious invasion of northern France"). AIs, even very weak ones[2], have fa...
Yeah, when I think about "AI takeover", I am imagining a very strong and smart AI, the kind for which success is more plausible. But before we get strong AIs, we will have weak AIs, so the first takeover attempts will be made by them. Maybe even the first successful takeover attempt.
A very strong and smart AI would, however, do a thousand different things at the same time. Unlike the chess bot, which only plays on one chessboard, the AI could e.g. have separate plans for taking over each specific country. Many plans to take over one specific country would not ...
Thought in progress: epistemic humility is not a substitute for actual humility (or professed humility). You only get to cry wolf once, but you can probably warn about potential wolves several times—so long as you don't burn goodwill on an incorrect or overconfident prediction.
I think epistemic humility helps to increase trust and confidence in EA/Less Wrong-type spaces, but I think professed humility is far more helpful when it comes to public-facing AI comms, particularly as scenarios get more intense and specific (e.g. prefacing AI doom predictions with...
I would prefer a future where AI models are not prescribed false frameworks of the human psyche, not predisposed to 'human vibe' philosophy, not innately desirous of any historical faith, nor credulous of the various dubious subsets of current social science.
I'm learning that typical LessWrong readers do not think in this manner, but it is not clear to me in what direction. Is it due to a literalist interpretation of the OP, neglecting the contemporary context? Is it due to higher trust in, affiliation with, and support for the disciplines? Is it because readers tend to prefer anthropomorphic interpretations of AI behavior?
There are Manifold markets for the LW review; presumably making them for curated status would not be worth it, given how fast you'd need to be to be useful and how much wider your net needs to be (maybe? Alternatively, "posts above x karma threshold in the last 1-5 days not curated" seems pretty small if you set the threshold to whatever my mind seems to be using in its heuristic).
But for example: it doesn't seem hard to tell in advance. E.g. I'd bet mana that Gene Smith's practical guide to superbabies gets curated by the end of the week. I wish I could take a curate bot'...
Rationalists and Pause AI people on X are accusing Davidad of suffering from AI psychosis. I think it's them who have lost the plot, actually, not Davidad. The move here looks political rather than truth-tracking. "Davidad is now my political opponent, so I'm accusing him of being crazy." This happened to Emmett Shear too at some point.
I also strongly believe AI psychosis to be a far more limited phenomenon than people here seem to think. I think you're treating it as a good soldier in your army of arguments rather than honestly investigating what it actually is.
Davidad also announced the other day that he's leaving ARIA to pursue a research agenda focused on working with AIs on moral philosophy.
Do you know a person who believes that ASI will be created in <50 years who ISN'T in the LW/rationalist circle?
My parents don't believe that a superintelligent AI will be created within this century, or ever for that matter, or that AI will ever take jobs. My relatives laugh at the idea of AI solving a high school math problem and think state-of-the-art AI is on the level of GPT-2 (I mean that the capabilities they have in mind are on the level of GPT-2, not that they know what GPT-2 is). My friend who is an organic chemist laughs at the idea of AI doi...
Do you know a person who regularly tries doing new things on a computer, and isn't somehow connected to the "TESCREAL" circle? (At least in the sense of "used to read sci-fi when young"?)
It is quite easy to underestimate what the LLMs can do if you simply never use them, and only get your opinions from other people who never use them either.
Long horizon agency / strategic competence approximately does not exist among humans, even the smartest ones. With very few exceptions, billionaires spend or give away their money haphazardly, philosophers don't bother to think about long term implications of AI on philosophy production (positive or negative), Terence Tao spends his time wireheading on abstract math instead of doing anything remotely like instrumental convergence. Unlike my youthful expectations (upon reading Vernor Vinge), there are no university departments filled with super-geniuses cha...
[Epistemic status: butterfly speculation, not confident about this, but I think it's an idea worth taking seriously.]
Terence Tao spends his time wireheading on abstract math
So I was initially [skeptical of]/[slightly repulsed by] this framing[1], but after davidad's recent LLM-induced "awakening", I am starting to wonder whether very high-g (+ high-NFC?) people have a tendency towards something that is not very badly described as wirehead-y.[2]
If you think about cognition/theorizing as divided into generation+verification, then we can take ver...
We have all heard the "AI just predicts the next word/token" and "AI just thought of X because it is in the training data" arguments. I have a few first-draft ideas for experiments that might address this.
1) People invent artificial languages, aka conlangs (short for constructed languages). The most famous examples are Esperanto, Klingon, and Tolkien's Elvish. Someone can invent a new conlang that didn't exist until today, and by extension wasn't present in any training data of any LLM, and explain the rules to an LLM (after the training mode has alrea...
AI can already reason about the application that it wrote for me yesterday, so I am already convinced that it is not merely looking for answers in a pre-existing database. (Even if many people had similar ideas before, they didn't use the same names for the objects.)
AI can communicate fluently in Esperanto, but there are already hundreds of books in digital form.
I have designed a puzzle game, and then AI successfully solved a few levels.
...so I don't need any more evidence. But it could be useful for other people.
You don't have to invent a new language, it wo...
For about a year I've been writing down, with varying frequency, things I've learned in a notebook. Partly this is so that I go "ugh, I haven't even learned anything today, lemme go and meet my Learning Quota", which I find helpful (I don't think I'm goodharting it too much for it to be useful). Entries range from "somewhat neat theorem and proof in more detail than I should've written" to "high-level overview of bigger subjects", or "list of keywords that I learned exist". For example, recently I learned that sonic black holes which trap phonons (aka lattice vib...
Also: I think I learned the fastest when I was in high school, when there was both more low-hanging fruit and I was spending much more time on it (unrelated to school, except that I might've done less had I had more friends). And glacially slowly before that.
So... perhaps this explains a bit more of the thing where I felt like I became an adult mind a little after the start of that 'intelligence explosion'.