Politics is a hard subject to discuss rationally. LessWrong has developed a unique set of norms and habits around politics. Our aim is to allow discussion to happen (when actually important) while hopefully avoiding many pitfalls and distractions.

Recent Discussion

I saw Eliezer Yudkowsky at a grocery store in Los Angeles yesterday. I told him how cool it was to meet him in person, but I didn’t want to be a douche and bother him and ask him for photos or anything.
He said, “Oh, like you’re doing now?”
I was taken aback, and all I could say was "Huh?" but he kept cutting me off and going "huh? huh? huh?" and closing his hand shut in front of my face. I walked away and continued with my shopping, and I heard him chuckle as I walked off. When I came to pay for my stuff up front I saw him trying to walk out the doors with ...

I was thinking about the reaction to my posts over the past 48 hours, and it occurred to me that maybe they would receive a better reaction if I did a better job of demonstrating the epistemic virtues championed by the rationality community.

The all-time greatest hits for demonstrating epistemic virtue are, of course, falsifiable predictions about the outcome of an experiment that has not been conducted, could not be conducted using the resources available to the predictor, and which matter a great deal to the community assessing the epistemic virtue of the predictor.

So I thought I would post my theory of the etiology of Alzheimer's disease, along with a description of a low-cost treatment that this theoretical etiology would suggest would treat and even reverse the symptoms...

Yeah this sounds good to me.
How will we handle the file drawer effect, where insignificant results are quietly shelved? I guess if the trial is preregistered this won't happen...

Yes, we could require the study to be preregistered. OR to have significant-enough results - say, effect sizes greater than RCTs of the current standard treatment? (Unless the current treatments really suck, I haven't looked into it)
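To make the effect-size requirement concrete, here is a rough sample-size sketch using the standard normal approximation for a two-arm comparison of means. The effect sizes below are hypothetical placeholders, not figures from any actual Alzheimer's trial:

```python
import math
from statistics import NormalDist


def sample_size_per_arm(effect_size, alpha=0.05, power=0.8):
    """Approximate per-arm n for a two-sample comparison of means
    (normal approximation to the two-sided t-test)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # two-sided significance threshold
    z_beta = z.inv_cdf(power)           # power requirement
    return math.ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)


# Hypothetical numbers: a "medium" Cohen's d of 0.5 vs a small d of 0.2.
print(sample_size_per_arm(0.5))  # 63 per arm
print(sample_size_per_arm(0.2))  # 393 per arm
```

The point being that the bar we set ("effect sizes greater than RCTs of the current standard treatment") directly determines how cheap the study can be: a small required effect size roughly quadruples the enrollment for each halving of the detectable effect.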

https://chat.openai.com/share/068f5311-f11a-43fe-a2da-cbfc2227de8e Here are ChatGPT's speculations on how much it would cost to run this study. I invite any interested reader to work on designing this study. I can also write up my theories as to why this etiology is plausible in arbitrary detail if that is decision-relevant to someone with either grant money or interest in helping to code up the smartphone app we would need to collect the relevant measurements cheaply. (Intuitively, it would be something like a Dual N-Back app, but more user-friendly for Alzheimer's patients.)

Ok, so we both had some feelings about the recent Conjecture post on "lots of people in AI Alignment are lying", and the associated marketing campaign and stuff

I would appreciate some context in which I can think through that, and also to share info we have in the space that might help us figure out what's going on. 

I expect this will pretty quickly cause us to end up on some broader questions about how to do advocacy, how much the current social network around AI Alignment should coordinate as a group, how to balance advocacy with research, etc.


Feelings about Conjecture post:

  • Lots of good points about how people not stating their full beliefs messes with the epistemic environment and makes it costlier for others to be honest.
  • The lying and
  • I think that the AI safety community in general (including myself) was too pessimistic about OpenAI's strategy of gradually releasing models (COI: I work at OpenAI), and should update more on that mistake.

I agree with this!

I thought it was obviously dumb, and in retrospect, I don't know.

Eli Tyre (1h):
I think it remains to be seen what the right level of pessimism was. It still seems pretty likely that we'll see not just useless, but actively catastrophically counter-productive interventions from governments in the next handful of years. But you're absolutely right that I was generally pessimistic about policy interventions from 2018ish through to 2021 or so.

My main objection was that I wasn't aware of any policies that seemed like they helped, and I was unenthusiastic about the way that EAs seemed to be optimistic about getting into positions of power without (it seemed to me) being clear with themselves that they didn't have policy ideas to implement.

I felt better about people going into policy to the extent that those people had clarity for themselves: "I don't know what to recommend if I have power. I'm trying to execute one part of a two-part plan that involves getting power and then using that to advocate for x-risk-mitigating policies. I'm intentionally punting that question to my future self / hoping that other EAs thinking full time about this come up with good ideas." I think I still basically stand by this take. [1]

My main update is that it turns out the basic idea of this post was false. There were developments that were more alarming than "this is business as usual" to a good number of people, and that really changed the landscape.

One procedural update that I've made from that and similar mistakes is: "I shouldn't put as much trust in Eliezer's rhetoric about how the world works when it isn't backed up by clearly articulated models. I should treat those ideas as plausible hypotheses, and mostly be much more attentive to evidence that I can see directly."

1. ^ Also, I think that this is one instance of the general EA failure mode of pursuing a plan which entails accruing more resources for EA (community building to bring in more people, marketing to bring in more money, politics to acquire power), without a clear persona
Eli Tyre (3h):
I think it would be a very valuable public service to the community to have someone whose job it is to read a journalist's corpus and check if it seems fair and honest. I think we could, as a community, have a policy of only talking with journalists who are honest. This seems like a good move pragmatically, because it means coverage of our stuff will be better on average, and it also universalizes really well, so long as "honest" doesn't morph into "agrees with us about what's important." It seems good and cooperative to disproportionately help high-integrity journalists get sources, and it helps us directly.
Eli Tyre (3h):
My impression is that this was driven by developments in AI, which created enough of a shared sense that people could expect others to take the concern seriously, because they could all just see ChatGPT. And this emboldened people: they had more of a sense of tractability. And Eliezer, in particular, went on a podcast, and it went better than he anticipated, so he decided to do more outreach. My impression is that this has basically nothing to do with FTX?

Short version: In a saner world, AI labs would have to purchase some sort of "apocalypse insurance", with premiums dependent on their behavior in ways that make reckless behavior monetarily infeasible. I don't expect the Earth to implement such a policy, but it seems worth saying the correct answer aloud anyway.



Is advocating for AI shutdown contrary to libertarianism? Is advocating for AI shutdown like arguing for markets that are free except when I'm personally uncomfortable about the solution?

Consider the old adage "your right to swing your fists ends where my nose begins". Does a libertarian who wishes not to be punched need to add an asterisk to their libertarianism, because they sometimes wish to restrict their neighbor's ability to swing their fists?

Not necessarily! There are many theoretical methods available...

I think there isn't an issue as long as you ensure property rights for the entire universe now. Like if every human is randomly assigned a sliver of the universe (and then can trade accordingly), then I think the rising tide situation can be handled reasonably. We'd need to ensure that AIs as a class can't get away with violating our existing property rights to the universe, but the situation is analogous to other rights.

This is a bit of an insane notion of property rights and randomly giving a chunk to every currently living human is pretty arbitrary, but I think everything works fine if we ensure these rights now.

Christopher King (12h):
Human labor becomes worthless but you can still get returns from investments. For example, if you have land, you should rent the land to the AGI instead of selling it.
People who have been outcompeted won't keep owning a lot of property for long. Something or other will happen to make them lose it. Maybe some individuals will find ways to stay afloat, but as a class, no.
Kabir Kumar (14h):
What about regulations against implementations of known faulty architectures?

Status: Vague, sorry. The point seems almost tautological to me, and yet also seems like the correct answer to the people going around saying “LLMs turned out to be not very want-y, when are the people who expected 'agents' going to update?”, so, here we are.


Okay, so you know how AI today isn't great at certain... let's say "long-horizon" tasks? Like novel large-scale engineering projects, or writing a long book series with lots of foreshadowing?

(Modulo the fact that it can play chess pretty well, which is longer-horizon than some things; this distinction is quantitative rather than qualitative and it’s being eroded, etc.)

And you know how the AI doesn't seem to have all that much "want"- or "desire"-like behavior?

(Modulo, e.g., the fact that it can play chess pretty...

An oracle doesn't have to have hidden goals. But when you ask it what actions would be needed to do the long-term task, it chooses the actions that would lead to that task being completed. If you phrase that carefully enough, maybe you can get away with it. But maybe it calculates that the best output to achieve result X is an output that tricks you into rewriting it into an agent, etc.

In general, asking an oracle AI any question whose answers depend on the future effects in the real world of those answers would be very dangerous.

On the other ha...

I agree with the main point of the post. But I specifically disagree with what I see as an implied assumption of this remark about a "quantitative gap": that the gap is such that the ability to play chess better would correlate with being higher in the most relevant quantity.

Something that chooses good chess moves can be seen as "wanting" its side to do well within the chess game context. But that does not imply anything at all outside of that context. If it's going to be turned off if it doesn't play a particular next move, it doesn't have to take that into account. It can just play the best chess move regardless, and ignore the out-of-context info about being shut down.

LLMs aren't trained directly to achieve results in a real-world context. They're trained:

1. to emit outputs that look like the output of entities that achieve results (humans)
2. to emit outputs that humans think are useful, probably typically with the humans not thinking all that deeply about it

To be sure, at least item 1 above would eventually result in selecting outputs to achieve results if taken to the limit of infinite computing power, etc., and in the same limit item 2 would result in humans being mind-controlled. But both these items naturally reward the LLM more for appearing agentic than for actually being agentic (being agentic = actually choosing outputs based on their effect on the future of the real world). The reward for actually being agentic, up to the point that it is agentic enough to subvert the training regime, is entirely downstream of the reward for the appearance of agency.

Thus, I tend to expect the appearance of agency in LLMs to be Goodharted, and discount apparent evidence accordingly. Other people look at the same evidence and think it might, by contrast, be even more agentic than the apparent evidence suggests, due to strategic deception. And to be sure, at some agency level you might get consistent strategic deception to
If it's true, why is the shutdown problem not solved? Even if it's true that any behavior can be represented as an EUM, it's at least not trivial.
This is actually a partially solved issue; see here: https://www.lesswrong.com/posts/sHGxvJrBag7nhTQvb/invulnerable-incomplete-preferences-a-formal-statement-1

Also, this:

It is trivial, since everything is an EUM for a utility function under the behaviorist definition.
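To spell out the triviality claim with a minimal sketch (illustrative, not from the thread): under a behaviorist reading, for any policy whatsoever you can construct a utility function that the policy maximizes, so "is an EUM" places no constraint on behavior by itself.

```python
def utility_for(policy):
    """Given an arbitrary policy (state -> action), construct a utility
    function the policy maximizes -- any behavior counts as expected
    utility maximization under this behaviorist construction."""
    def u(state, action):
        return 1.0 if action == policy(state) else 0.0
    return u


# Arbitrary, clearly "non-agentic" behavior still maximizes its
# constructed utility function:
policy = lambda state: state % 3
u = utility_for(policy)
best = max(range(5), key=lambda a: u(7, a))
assert best == policy(7)
```

This is why the "everything is an EUM" observation is compatible with the shutdown problem remaining hard: the construction says nothing about which utility function a system pursues, only that some utility function can always be fitted after the fact.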

You know it must be out there, but you mostly never see it.

Author's Note 1: In something like 75% of possible futures, this will be the last essay that I publish on LessWrong.  Future content will be available on my substack, where I'm hoping people will be willing to chip in a little commensurate with the value of the writing, and (after a delay) on my personal site (not yet live).  I decided to post this final essay here rather than silently switching over because many LessWrong readers would otherwise never find out that they could still get new Duncanthoughts elsewhere.

Author’s Note 2: This essay is not intended to be revelatory.  Instead, it’s attempting to get the consequences of a few very obvious things lodged into your brain, such...

I think your posts have been among the very best I have seen on LessWrong or elsewhere. Thank you for your contribution. I understand, dimly from the position of an outsider but still, I understand your decision, and am looking forward to reading your posts on your substack.

Any community which ever adds new people will need to either routinely teach the new and (to established members) blindingly obvious information to those who genuinely haven’t heard it before, or accept that over time community members will only know the simplest basics by accident of osmosis or selection bias. There isn’t another way out of that. You don’t get to stop doing it. If you have a vibrant and popular group full of people really interested in the subject of the group, and you run it for ten years straight, you will still sometimes run across people who have only fuzzy and incorrect ideas about the subject unless you are making an active effort to make Something Which Is Not That happen.

Or in other words; I...

What would you say to the academic solution to needing 101 spaces? AKA opening posts with a list of prerequisites and recommended prior readings, and setting a norm that pointing people to that list is acceptable if they make a comment that demonstrates lack of familiarity with same?
Just checking we're on the same page: academic programs usually have a clear track students move through, with a graph of prerequisites. The idiom "101 space" originates with the way university programs are numbered, with the 100s place denoting what year a uni student expects to take a class. If you don't have the prereqs or aren't able to handle a course, you don't take it / drop out and do the lower material. We're talking about that solution, right?

Seems decent for them, though obviously students sometimes wind up above or below their capacity. You can kind of approximate this with cohorts in some places.

I like the norm of putting required/suggested readings at the top of posts expanding on material, or pointing someone to a short, specific post that walks them through a mistake or gap. I kind of don't like a norm of "read this two thousand page tome", even as it can be tempting sometimes, mostly because I don't think people are going to take me up on it very often.

There's an is/ought distinction around here. Sometimes someone makes a reply that indicates they didn't read the whole tweet, you know? Whether or not your 101 space works (do people use it, do they come out knowing what you want) is relevant to whether it is functioning, even if we think they ought to use it.
  1. Yes, we're talking about the same solution. However, academic institutions usually also have many options for elective courses, which still have prerequisites. That seems like a closer analogy than required courses for a major. Universities also have lectures/seminars/colloquiums that are nominally open to anyone, but that doesn't mean anyone will be able to actively participate in practice, though usually they'll be welcome as long as they're making an effort to learn and aren't disruptive.
  2. I agree very few people will take up the suggestion to read 2000 pa
...

It has been brutal out there for someone on my beat. Everyone extremely hostile, even more than usual. Extreme positions taken, asserted as if obviously true. Not symmetrically, but from all sides nonetheless. Constant assertions of what happened in the last two weeks that are, as far as I can tell, flat out wrong, largely the result of a well-implemented media campaign. Repeating flawed logic more often and louder.

The bright spot was offered by Vitalik Buterin, who offers a piece entitled ‘My techno-optimism,’ proposing what he calls d/acc for defensive (or decentralized, or differential) accelerationism. He brings enough nuance and careful thinking, and clear statements about existential risk and various troubles ahead, to get strong positive reactions from the worried. He brings enough credibility and track record,...

The quote from Schmidhuber literally says nothing about human extinction being good. I'm disappointed that Critch glosses it that way, because in the past he has been more level-headed than many, but he's wrong here.

Humans not being the "last stepping stone" towards greater complexity does not imply that we'll go extinct. I'd be happy to live in a world where there are things more complex than humans. It's not a weird interpretation at all: "AI will be more complex than humans" or "humans are not the final form of complexity in the universe" simply says nothing at all about "humans will go extinct." You could spin it into that meaning if you tried really hard. But, for instance, the statement could also be about how AI will do science better than humans in the future, which was (astonishingly) the substance of the talk in which this statement took place, and also what Schmidhuber has been on about for years, so it probably is what he's actually talking about.

I note what you say in your section on tribalism. It would be great if people were a tad more hesitant to accuse others of wanting omnicide.

Ehhh, I get the impression that Schmidhuber doesn't think of human extinction as specifically "part of the plan", but he also doesn't appear to consider human survival to be something particularly important relative to his priority of creating ASI. He wants "to build something smarter than myself, which will build something even smarter, et cetera, et cetera, and eventually colonize and transform the universe", and thinks that "Generally speaking, our best protection will be their lack of interest in us, because most species’ biggest enemy is their own kind...