Wiki Contributions


>At first maybe you try to argue with them about it. But over time, a) you find yourself not bothering to argue with them

>Whose fault is that, exactly…?

>b) even when you do argue with them, they’re the ones choosing the terms of the argument.


>If they think X is important, you find yourself focused on arguing whether or not X is true, and ignoring all the different Ys and Zs that maybe you should have been thinking about.



I agree that nothing about the examples you quote is unacceptably bad – all these things are "socially permissible." 

At the same time, your "Whose fault is that, exactly...?" makes it seem like there's nothing the guru in question could be doing differently. That's false.

Sure, some people are okay with seeing all social interactions as something where everyone is in it for themselves. However, in close(r) relationship contexts (e.g. friendships, romantic relationships, probably also spiritual mentoring from a guru?), many operate on the assumption that people care about each other and want to preserve each other's agency and help each other flourish. In that context, it's perfectly okay to have an expectation that others will (1) help me notice and speak up if something doesn't quite feel right to me (as opposed to keeping quiet) and (2) help me arrive at informed/balanced views after carefully considering alternatives, as opposed to only presenting me their terms of the argument.

If the guru never says "I care about you as a person," it's fine for him to operate as he does. But once he starts to reassure his followers that he always has their best interest in mind – that's when he crosses the line into immoral, exploitative behavior. 

You can't have it both ways. If your answer to people getting hurt is always "well, whose fault was that?", then don't ever fucking reassure them that you care about them!

In reality, I'm pretty sure "gurus" almost always go to great lengths to convince their followers that they care about them more than almost anyone else does. That's where things become indefensible.

Compare the two: 

(1) The difference between "bad frame control" and "good frame control" is that, in the latter, the frame matches physical reality and social reality.

Here, I use "social reality" in the sense of "insights about what types of actions or norms help people flourish."

(2) The difference between lying and telling the truth is that, when someone doesn't lie, what they say matches physical reality and social reality.

I feel like there's a sense in which (1) is true, but it misses the point if someone thinks that this is the only difference. If you lie a lot around some subject matter, or if you manipulate someone with what aella calls frame control, you introduce friction around that subject matter or frame – friction that wouldn't be there if you were going with the truth. The original post points out how frame controllers try to hide that sort of friction or bring down your defenses against it. Those subtleties are what's bad about the bad kind of frame control; noticing them is what it's all about.

Someone might object as follows: 

"Friction" can mean many things. If you try to push people to accomplish extraordinary feats with your weird-seeming startup, you have to motivate them and push against various types of "friction" – motivate your co-workers, make them okay with being seen as weird as long as your idea hasn't succeeded, etc.

I agree with all that. Good leaders have to craft motivating frames and inspire others with their vision. But I still feel like that's not the same thing as what happens in (the bad kind of) frame control. The word "control" is a clue about where the difference lies. It's hard to pin down the exact difference. Maybe it's something like this: 

Good leadership is about offering frames to your followers that create win-win situations (for them and for the world!) by appealing to virtues that they already endorse, deliberately drawing attention to all the places where there's friction from social conventions or inertia/laziness, but presenting a convincing vision about why it's worth it to push against that friction. 

By contrast, frame control (the bad, sneaky/coercive kind) is about guilting people into thinking it's their fault if they struggle because of the friction, or trying to not have them notice that there are alternative frames for them. 

Hm, maybe. I can see that frame control comes in handy when you're a general in a war, or a CEO of a startup (and probably at least some generals or CEOs are good people with good effects on the world). However, in wartime, it feels like a necessary evil to have to convince your soldiers to march to their deaths. And in startups – I don't know, cultishness can have its advantages, but I feel like the best leadership is NOT turning your underlings into people who look cultish to outsiders. So, I think the good version of frame control is generally weaker than the bad version, for instance because good leaders don't have anything to fear in terms of their followers becoming better at passing Ideological Turing tests for opposing views. But I guess that's just expressing your point in different words: we can say that, if our frame is aligned with physical reality and avoids negative social outcomes, it shouldn't look like the people who buy into it are cultists.

I also think it's informative to think about the context of a romantic relationship. In that context, I'm not sure there's a version of "good frame control" that's necessary. Except maybe for frames like "good communication is important" – if one person so far struggled to express their needs because they weren't taken seriously in their past life, it can be good for both individuals if the more securely attached person pushes that kind of frame. However, the way you would do that isn't by repeating "good communication is important" as a mantra or weapon to shame the other person for not communicating the way you want! Instead, you try showing them the benefits of good communication, convincing them through evidence of how nice it feels when it works. That's very different from the bad type of frame control in relationships. Also, let's say you have two people who already understand that good communication is important. Then no one is exerting any frame control – you simply have two happy people who live in the same healthy frame. And insofar as they craft features of their personal "relationship frame," it's a mutual sort of thing, so no one is exactly exerting any sort of control.

These examples, and the fact that you can have relationships (not just romantic ones) where something feels mutual rather than "control exerted by one party," make me think that there's more to it than "good frame control differs from bad frame control merely in terms of correspondence to physical reality (and social reality)." I guess it depends what we mean by "social reality." I think bad frame control is primarily about a lack of empathy, and that happens to leave a very distinct pattern, which you simply can't compare to "good leadership."

Edit: I saw another commenter making a good point in reply to your comment. What you call "good frame control" is done out in the open. The merits of good frames are often self-evident or at least verifiable. By contrast, the OP discusses (bad) frame control as a type of sneak attack. It tries to overcome your epistemic defenses.

Yeah, what I meant was the belief that there's no incorrect way to set up a language game.

>and some audiences have measurably better calibration.

It's not straightforward in all contexts to establish what counts as good calibration. It's straightforward for empirical forecasting, but if we were to come up with a notion like "good calibration for ethical judgments," we'd have to make some pretty subjective judgment calls. Similarly, something like "good calibration for coming up with helpful abstractions for language games" (which we might call "doing philosophy" or a subskill of it) also seems (at least somewhat) subjective. 
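To make the contrast concrete: for empirical forecasting, calibration really is mechanical to measure. Here's a minimal sketch with made-up forecast data (the function names and numbers are mine, purely illustrative) – bucket forecasts by stated probability, compare to observed frequencies, and compute a Brier score:

```python
# Minimal calibration check for empirical forecasts.
# Forecasts and outcomes below are made-up illustrative data.

def calibration_table(forecasts, outcomes, n_bins=10):
    """Group forecasts into probability bins; for each bin return
    (mean stated probability, observed frequency, count)."""
    bins = {}
    for p, o in zip(forecasts, outcomes):
        b = min(int(p * n_bins), n_bins - 1)
        bins.setdefault(b, []).append((p, o))
    return {
        b: (sum(p for p, _ in pairs) / len(pairs),
            sum(o for _, o in pairs) / len(pairs),
            len(pairs))
        for b, pairs in sorted(bins.items())
    }

def brier_score(forecasts, outcomes):
    """Mean squared error of probabilistic forecasts (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(forecasts)

forecasts = [0.9, 0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.1]
outcomes  = [1,   1,   1,   0,   0,   1,   0,   0]

print(brier_score(forecasts, outcomes))   # ≈ 0.1625
print(calibration_table(forecasts, outcomes))
```

Nothing like this exists for "calibration for ethical judgments": there's no `outcomes` list to score against, which is exactly where the subjective judgment calls come in.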

That doesn't mean "anything goes," but I don't yet see how your point about dialogue trees applies to "maybe a society of AIs would build abstractions we don't yet understand, so there'd be a translation problem between their language games and ours." 

>There are correct and incorrect ways to play language games.

That's the crux. Wittgenstein himself believed otherwise and spent most of the book arguing against it. I think he makes good points.

At one point, he argues that there's no single correct interpretation for "What comes next in the sequence: '2, 4, 6, 8, 10, 12, ...?'" 

Maybe this goes a bit too far. :) I think he's right in some nitpicky sense, but for practical purposes, sane people will say "14" every time and that works well for us.
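The "nitpicky sense" can be made concrete: any finite prefix is consistent with infinitely many rules. A tiny sketch (the rule names are mine, and `rule_b` is just one of infinitely many deviant continuations):

```python
# Two rules that both generate 2, 4, 6, 8, 10, 12 for n = 1..6
# but disagree about what "comes next" -- the finite data
# underdetermines the rule, which is the point Wittgenstein presses.

def rule_a(n):
    return 2 * n  # the "obvious" rule

def rule_b(n):
    # Agrees with rule_a on n = 1..6, because the extra product
    # term vanishes at each of those points.
    return 2 * n + (n - 1) * (n - 2) * (n - 3) * (n - 4) * (n - 5) * (n - 6)

print([rule_a(n) for n in range(1, 7)])  # [2, 4, 6, 8, 10, 12]
print([rule_b(n) for n in range(1, 7)])  # [2, 4, 6, 8, 10, 12]
print(rule_a(7), rule_b(7))              # 14 734
```

Both rules fit all the evidence; they only diverge at the next step. Of course, in practice everyone converges on `rule_a` – which is the "works well for us" part.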

We can see this as a version of the realism vs. anti-realism debates: realism vs. anti-realism about natural abstractions. As I argue in the linked post, anti-realism is probably the right way of looking at most or even all of these, but that doesn't mean "anything goes." Sometimes there's ambiguity about our interpretations of things, but reality does have structure, and "ambiguity" isn't the same as "you can just make random stuff up and expect it to be useful."

Thanks for sharing your reasoning, that was very interesting to read! I kind of agree with the worldview outlined in the quoted messages from the "Closing-Office-Reasoning" channel. Something like "unless you go to extreme lengths to cultivate integrity and your ability to reason in truth-tracking ways, you'll become a part of the incentive-gradient landscape around you, which kills all your impact."

It seems like a tough call to have to decide whether an ecosystem has failed vs. whether it's still better than starting from scratch despite its flaws. (I could imagine that there's an instinct to just not think about it.)

Sometimes we also just get unlucky, though. (I don't think FTX was just bad luck, but e.g., with some of the ways AI stuff played out, I find it hard to tell. Of course, just because I find it hard to tell doesn't mean it's objectively hard to tell. Maybe some things really were stupid also when they happened, not just in hindsight.)

I'm curious if you think there are "good EA orgs" where you think the leadership satisfies the threshold needed to predictably be a force of good in the world (my view is yes!). If yes, do you think that this isn't necessarily enough for "building the EA movement" to be net positive? E.g., maybe you think it boosts the not-so-good orgs just as much as the good ones, and "burns the brand" in the process? 

I'd say that, if there are some "good EA orgs," that's a reason for optimism. We can emulate what's good about them and their culture. (It could still make sense to be against further growth if you believe the ratio has become too skewed.) Whereas, if there aren't any, then we're already in trouble, so there's a bit of a wager against it.

>I think “Luck could be enough” should be the strong default on priors,[2] so in some sense I don’t think I owe tons of argumentation here (I think the burden is on the other side).

I agree with this being the default and the burden being on the other side. At the same time, I don't think of it as a strong default.

Here's a frame that I have that already gets me to a more pessimistic (updated) prior:

It has almost never happened that people who developed and introduced a revolutionary new technology displayed a lot of foresight about its long-term consequences. For instance, there were comparatively few efforts at major social media companies to address ways in which social media might change society for the worse. The same goes for the food industry and the obesity epidemic, or online dating and its effects on single-parenthood rates. When people invent cool new technology, it makes the world better on some metrics but creates new problems of its own. The whole thing is accelerating and feels out of control.

It feels out of control because even if we get cool new things from tech progress, we don't seem to be getting any better at fixing the messiness that comes with it (misaligned incentives/Goodharting, other Molochian forces, world-destroying tech becoming ever more accessible). Your post says "a [] story of avoiding catastrophe by luck." This framing makes it sound like things would be fine by default if it weren't for some catastrophe happening. However, humans have never seemed particularly "in control" over technological progress. For things to go well, we need the opposite of a catastrophe – a radical change towards the upside. We have to solve massive coordination problems and hope for a technology that gives us god-like power, finally putting sane and compassionate forces in control over the future. It so happens that we can tell a coherent story about how AI might do this for us. But to say that it might go right just by luck – I don't know, that seems far-fetched!

All of that said, I don't think we can get very far arguing from priors. What carries by far the most weight are arguments about alignment difficulty, takeoff speeds, etc. And I think it's a reasonable view to say that it's very unlikely that any researchers currently know enough to make highly confident statements about these variables. (Edit: So, I'm not sure we disagree too much – I think I'm more pessimistic about the future than you are, but I'm probably not as pessimistic as the position you're arguing against in this post. I mostly wanted to make the point that I think the "right" priors support at least moderate pessimism, which is a perspective I find oddly rare among EAs.) 

FWIW, it's not obvious to me that slow takeoff is best. Fast takeoff at least gives you god-like abilities early on, which are useful from a perspective of "we were never particularly in control over history; lots of underlying problems need fixing before we pass a point of no return." By contrast, with slow takeoff, coordination problems seem more difficult because (at least by default) there will be more actors using AIs in some ways or other and it's not obvious that the AIs in a slow-takeoff scenario will be all that helpful at facilitating coordination. 

>I don't understand this sentence:

Oh, yeah, I butchered that entire description. 

>It's the gap between the training compute of 'the first AGI' and what?

What I had in mind was something like the gap between how much "intelligence" humans get from the compute they first build AGI with vs. how much "intelligence" AGI will get out of that same amount of compute, once it optimizes software progress for a few iterations.

So, the "gap" is a gap of intelligence rather than compute, but it's "intelligence per specified quantity of compute." (And that specified quantity is how much compute we used to build AGI in the first place.)

>And I asked some friends what "hardware overhang" means and they had different responses (a plurality said it means sufficient hardware for human-level AI already exists, which is not a useful concept).

It's not a useful concept if we can't talk about the probability of "finding" particularly efficient AGI architectures through new insights. However, it seems intelligible and strategically important to talk about something like "the possibility that we're one/a few easy-to-find insight(s) away from suddenly being able to build AGI with a much smaller compute budget than the largest training runs to date." That's a contender for the concept of "compute overhang." (See also my other comment.) 

Maybe what you don't like about this definition is that it's inherently fuzzy: even if we knew everything about all possible AGI architectures, we'd still have uncertainty about how long it'll take AI researchers to come up with the respective insights. I agree that this makes the concept harder to reason about (and arguably less helpful).
