The thing I'm pretty worried about here is people running around saying 'Eliezer advocated violence', and people hearing 'unilaterally bomb data centers' rather than 'build an international coalition around a treaty, similar to how we treat nuclear weapons and bioweapons, and enforce it.'
I hear you saying (and agree with) "guys, you should not be oblivious to the fact that this involves willingness to use nuclear weapons." Yes, I agree very much that it's important to stare that in the face.
But “a call for willingness to use violence by state actors” is just...
people hearing 'unilaterally bomb data centers' rather than 'build an international coalition around a treaty, similar to how we treat nuclear weapons and bioweapons, and enforce it.'
It is rare to start wars over arms treaty violations. The proposal considered here -- if taken seriously -- would not be an ordinary enforcement action but rather a significant breach of sovereignty, almost without precedent in this context. I think it's reasonable to take calls for preemptive war extremely seriously, and to treat them very differently than if one had proposed, e.g., an ordinary federal law.
It seems like this makes any proposed criminalization of an activity punishable by the death penalty a call for violence?
Yes! Particularly if it's an activity people currently do. Promoting the death penalty for women who get abortions is calling for violence against women; promoting the death penalty for apostasy from Islam is calling for violence against apostates. I think if a country is contemplating passing a law to kill rapists, and someone says "yeah, that would be a great fuckin law", they are calling for violence against rapists, whether or not it is justified.
I don't really care whether something occurs under the auspices of supposed international law. Saying "this co...
In the past few weeks I've noticed a significant change in the Overton window of what seems possible to talk about. I think the broad strokes of this article seem basically right, and I agree with most of the details.
I don't expect this to immediately cause AI labs or world governments to join hands and execute a sensibly-executed-moratorium. But I'm hopeful about it paving the way for the next steps towards it. I like that this article, while making an extremely huge ask of the world, spells out exactly how huge an ask is actually needed.
Many people...
Yeah, this comment seems technically true but misleading with regard to how people actually use words.
It is advocating that we treat it as the same class of treaty as nuclear treaties, and yes, that involves violence, but "calls for violence" just means something else.
The use of violence in response to violations of the NPT has been fairly limited and highly questionable under international law. And, in fact, calls for such violence are very much frowned upon, out of fear that they have a tendency to lead to full-scale war.
No one has ever seriously suggested violence as a response to potential violation of the various other nuclear arms control treaties.
No one has ever seriously suggested running a risk of nuclear exchange to prevent a potential treaty violation. So, what Yudkowsky is suggesting i...
These measures let us talk about things like bottlecaps as optimizers much more precisely.
I'm a bit surprised this line came up under counterfactual optimization rather than robustness of optimization. I think the reason a bottlecap isn't an optimizer is that if you change the environment around it, it doesn't keep the water in the bottle. I felt like I understood the counterfactual optimization consideration, but I don't see how it applies here.
fyi it looks like you have a lot of background reading to do before contributing to the conversation here. You should at least be able to summarize the major reasons why people on LW frequently think AI is likely to kill everyone, and explain where you disagree.
I'd start reading here: https://www.lesswrong.com/posts/LTtNXM9shNM9AC2mp/superintelligence-faq
(apologies both to julie and romeo for this being kinda blunt. I'm not sure what norms romeo prefers on his shortform. The LessWrong mod team is trying to figure out what to do about the increa...
I agree. But the point is, in order to do the thing that the CEO actually wants, the AI needs to understand goodness at least as well as the CEO. And this isn't, like, maximal goodness for sure. But to hold up under superintelligent optimization levels, it needs a pretty significantly nuanced understanding of goodness.
I think there is some disagreement between AI camps about how difficult it is to get to the level-of-goodness the CEO's judgment represents, when implemented in an AI system powerful enough to automate scientific research.
I think the "a...
What do you think will actually happen with the term notkilleveryoneism?
When you say you're worried about "notkilleveryoneism" as a meme, do you mean that this meme (compared to other descriptions of "existential risk from AI is important to think about") is unusually likely to cause this foot-in-mouth-quietly-stop reaction, or that the nature of the foot-in-mouth-quietly-stop dynamic just makes it hard to talk about at all?
It was useful to me to know this was happening, thanks.
This is a bit later-than-usual, but, curated.
I've continued to appreciate Natalia digging into the details here. The spot check on "did lab and/or wild animals get more obese" seemed pretty significant. I also liked tying everything in at the end to a concrete metaculus prediction.
I'm not sure how to best engage with individual posts, but I had thoughts on the "Alignment" !== "Good" post.
I agree it's useful for alignment not to be a fully-general-goodness word, and to have specific terminology that makes it clear what you're talking about.
But I think there are desiderata the word was originally meant to capture in this context, and I think it's an important technical point that the line between "generically good" and "outer aligned" is kinda vague. I do think there are important differences between them but I think some of the confusion li...
I still claim this should be three paragraphs. In this case, breaking at section 4 and section 6 seems to carve it at reasonable joints.
I predict most people will have an easier time reading the second one than the first one, holding their jargon-familiarity constant. (The jargon basically isn't a crux for me at all.)
(I bet if we arranged some kind of reading comprehension test you would turn out to do better at reading-comprehension for paragraph-broken abstracts vs single-block abstracts. I'd bet this at like 70% confidence for you-specifically, and... like 97% confidence for most college-educated people)
A few reasons I expect this to be true (other than just generalizing from my e...
However, if your post doesn't look like a research article, you might have to format it more like one (and even then it's not guaranteed to get in, see this comment thread).
I interpreted this as saying something superficial about style, rather than "if your post does not represent 100+ hours of research work, it's probably not a good fit for arXiv." If that's what you meant, I think the post could be edited to make that more clear.
If the opening section of your essay made it more clear which posts it was talking about, I'd probably endorse it (although I'm not super familiar with the nuances of arXiv gatekeeping, so I'm mostly going off the collective response in the comment section).
Oh yay. Thanks! Yeah that's much better.
Another possibility is that LessWrong is swamped with AI safety writing, and so people don't want any more of it unless it's really good. They're craving variety.
I think this is a big part of it.
I'm also confused about the degree of downvotes. (It's not really new content for LessWrong but I'm happy to see more rationality content on the margin, even if it's re-covering the basics)
(I do think opening with "you have 'zero' chance of being intellectually wise without this" is some combination of "not necessarily true" and "sure sounds like you need to have resolved the ambiguity of what counts as intellectually wise to be sure of that", and wish that line was different)
Yeah this seems plausibly good
Yeah, I do think writing a post that actually-tabooed-frame-control would be good. (The historical reason this post doesn't do that is in large part because I initially wrote a different post, called "Distinctions in Frame Control", realized that post didn't quite have enough of a purpose, and sort of clarified my goal at the last minute and then hastily retrofitted the post to make it work.)
Indeed, I found myself sufficiently impatient to read such a post that I wrote it myself…
FWIW I did quite appreciate that comment. I may have more to say about it later...
So, to recap, I think the word "frame" is used metaphorically for three different things:
For "everything is coordination + cryptography" guy, I'm thinking mostly in terms of "framework" (although frameworks tend to also imply which-parts-of-reality-to-pay-attention-to).
The way they model society routes through a structure...
FYI, I updated this post somewhat in response to some of your comments here (as well as some other commenters in other venues like FB and my workplace slack). The current set of updates is fairly small (adding a couple sentences and changing wordings). But there's a higher level problem that I think requires reworking the post significantly. I'm probably just going to write a followup post optimized a bit differently.
In this post I was deliberately trying not to be too opinionated about which things "count as frame control", "is frame control bad?" or what...
I buy that people who read abstracts all day get better at reading them, but I'm... pretty sure they're just kinda objectively badly formatted and this'd at least save time learning to scan it.
Like looking at the one you just linked
...The ATLAS Fast TracKer (FTK) was designed to provide full tracking for the ATLAS high-level trigger by using pattern recognition based on Associative Memory (AM) chips and fitting in high-speed field programmable gate arrays. The tracks found by the FTK are based on inputs from all modules of the pixel and silicon microstr
I mean the control group here is "not doing evals", which eventually autofails.
Do you have a link to a specific part of the gwern site highlighting this, and/or a screenshot?
I... kinda want to ping @Jeffrey Ladish about how this post uses "play to your outs", which is exactly the reason I pushed against that phrasing a year ago in Don't die with dignity; instead play to your outs.
A high level thing about LessWrong is that we're primarily focused on sharing information, not advocacy. There may be a later step where you advocate for something, but on LessWrong the dominant mode is discussing / explaining it, so that we can think clearly about what's true.
Advocacy pushes you toward simplifying ideas rather than clearly articulating what's true, and toward pushing for consensus for the sake of coordination, regardless of whether you've actually found the right thing to coordinate on.
"What is the first step towards alignment" isn'...
Relevant:
Mod note. (LW mods are trying out moderating in public rather than via PMs. This may feel a bit harsh if you're not used to this sort of thing, but we're aiming for a culture where feedback feels more natural. I think it's important to do this publicly for a) accountability and b) so people can form a better model of how the LW moderators operate.)
I do think globally banning autonomous weapons is a reasonable idea, but the framing of this post feels pretty off.
I downvoted for the first paragraph, which makes an (IMO wrong) assumption that this is the first step to...
Yeah. I had a goal with the "Keep your beliefs cruxy and your frames explicit" sequence to eventually suggest people do this for this reason (among others), but hadn't gotten around to that yet. I guess this new post is maybe building towards a post on that.
Actual answer is that Eliezer has tried a bunch of different things to lose weight and it's just pretty hard. (He also did a quite high-effort thing in 2019 which did work. I don't know how well he kept the pounds off in the subsequent time)
You can watch a fun video where he discusses it after the 2019 Solstice here.
(I'm not really sure how I feel about this post. It seems like it's coming from an earnest place, and I kinda expect other people to have this question, but it's in a genre that feels pretty off to be picking on individual people about and I ...
and/or exert pressure to fall in line with that frame
This line makes me realize I was missing one subcomponent of frame control. We have
But then there's "pressure/threaten someone into adopting a frame". The line between pressure and "merely expressing confidence" might feel blurry in some cases, but the difference is intended to be "there's an implication that if you don't adopt the frame, you will be socially punished".
Yeah, basically agreed that this is what's going on.
I agree that listening in a collaborative way is a good thing to do when you have a friend/colleague in this situation.
I'm not sure what to do in the context of this post, if the problem comes up organically. The collaborative listening thing seems to work best in a two-person pair, not an internet forum. I guess "wait for it to come up" is fine.
I had a discussion on Facebook about this post, where someone felt my examples pointed at a different definition of frame control than theirs. After some back-and-forth and some confusion on my part, it seemed like their conception of frame control was something more like 'someone is trying to control you, and they happen to be using frames to do it', whereas my conception here was more like 'someone is trying to control your frame.'
I'm not actually sure how different these turn out to be in practice. If someone is controlling your frame, they're al...
Note: this was tagged 'effective altruism', but on LessWrong the 'effective altruism' tag is used to talk about the movement at a meta level, and this post should be classified as 'world optimization'.
A thing that occurs to me, as I started engaging with some comments here as well as on a FB thread about this:
Coercion/Abuse/Manipulation/Gaslighting* often feel traumatic and triggering, which makes talking about them hard.
One of the particular problems with manipulation is that it's deliberately hard to talk about or find words to explain what's wrong about it. (If you could easily point to the manipulation, it wouldn't be very successful manipulation.) Successful manipulators tailor their manipulations toward precisely the areas where their marks don't...
What I think is problematic is that some people are able to make genuine threats to get their way, enforcing compliance with their values and language and preferences and norms
One of my main points here is that I think we probably should call threatening behavior "threatening" and maybe "coercive" or "abusive" or whatever seems appropriate for the situation, and only use the phrase 'frame control' when the most relevant thing is that someone is controlling a frame. (And, maybe, even then try to say a more specific thing about what they're doing, if y...
The adjective “manipulatively” here seems like it is not justified by the preceding description.
The intended justification is the previous sentence:
Years later looking back, you might notice that they always changed the topic, or used various logical fallacies/equivocations, or took some assumptions for granted without ever explaining them.
I'm surprised you don't consider that sort of thing manipulative. Do you not?
Yeah this variant does feel more like explicit frame control (I think "frame manipulation", although it feels like it strains a bit with the cluster I'd originally been thinking of when I described it)
Next I asked it:
It responded with this image:
code:
<svg width="300" height="300" viewBox="0 0 300 300" xmlns="http://www.w3.org/2000/svg">
<!-- Background circle -->
<circle cx="150" cy="150" r="140" fill="none" stroke="black" stroke-width="2"/>
<!-- Body -->
<ellipse cx="150" cy="100" rx="30" ry="40" fill="none" stroke="black" stroke-width="2"/>
<rect x="140" y="140" width="20" height="60" fill="none" stroke="black" stroke-width="2"/>
<line x1="100" y1="140" x2="200" y2="140" stroke="black" stroke-w
... The lecturer talks about how objects move, without reference to the emotions of people around them or what spirits think.
Something I like about this is that "without reference to the emotions of people around them" is actually legitimately a contender for "meaningful frame." Like, cars move because people decide to drive them, soil gets moved around because humans wanted nicer landscaping, dams get built because beavers decided to do it.
Eventually Jupiter might get disassembled because powerful AI decided to. This will not necessarily route through...
Here was the final one:
<svg viewBox="0 0 800 600" xmlns="http://www.w3.org/2000/svg">
<!-- Field -->
<rect x="0" y="0" width="100%" height="100%" fill="#8BC34A"/>
<!-- Sky and sun -->
<rect x="0" y="0" width="100%" height="40%" fill="#90CAF9"/>
<circle cx="700" cy="100" r="50" fill="#FFEB3B"/>
<!-- Mountains -->
<polygon points="100,300 300,100 500,300" fill="#BDBDBD"/>
<polygon points="350,400 550,200 750,400" fill="#9E9E9E"/>
<!-- Castle -->
<rect x="200" y="150" width="2
... I tried again, accidentally using GPT3.5 this time, which initially gave something really lame, but when I said "more realistic please", it gave me:
Note that ASCII art isn't the only kind of art. I just asked GPT4 and Claude to both make SVGs of a knight fighting a dragon.
Here's Claude's attempt:
And GPT4's:
I asked them both to make it more realistic. Claude responded with the exact same thing with some extra text, GPT4 returned:
I followed up asking it for more muted colors and a simple background, and it returned:
Do you have particular examples of non-profound ideas you think are being underexplored?
I wanna flag the distinction between "deep" and "profound". They might both be subject to the same bias you articulate here, but I think they have different connotations, and I think important ideas are systematically more likely to be "deep" than they are likely to be "profound." (i.e. deep ideas have a lot of implications and are entangled with more things than 'shallow' ideas. I think profound tends to imply something like 'changing your conception of something that was fairly important in your worldview.')
i.e. profound is maybe "deep + contrarian"
This post was oriented around the goal of "be ready to safely train and deploy a powerful AI". I felt like I could make the case for that fairly straightforwardly, mostly within the paradigm that I expect many AI labs are operating under.
But one of the reasons I think it's important to have a strong culture of safety/carefulness is in the leadup to strong AI. I think the world is going to be changing rapidly, and that means your organization may need to change strategies quickly, and track your various effects on society.
Some examples of problem...
I'm specifically talking about the reference class of nuclear and bioweapons, which do sometimes involve invasion or threat-of-invasion of sovereign states. I agree that's really rare, and something we should not do lightly.
But I don't think you even need Eliezer-levels-of-P(doom) to think the situation warrants that sort of treatment. The most optimistic people I know of who seem to understand the core arguments say things like "10% x-risk this century", which I think is greater than the likelihood of x-risk from nuclear war.