Rob Bensinger

Communications lead at MIRI. Unless otherwise indicated, my posts and comments here reflect my own views, and not necessarily my employer's.

Comments

What should we do once infected with COVID-19?

A couple of minutes after I wrote this question, I found out that Scott Alexander had said on July 29:

I don't have the energy to write a 5000 word blog post explaining my reasoning, but I think ≤10% chance HCQ has clinically significant effects against COVID, chances of really impressive effects even lower.

What should we do once infected with COVID-19?

What's your current epistemic state re hydroxychloroquine?

What is Ra?

From a January 2017 Facebook conversation:

Rob B: I gather Ra is to a first approximation just 'the sense that things are impersonally respectable / objective / authoritative / credible / prestigious / etc. based only on superficial indirect indicators of excellence.'

Ruby B: I too feel like I do not understand Ra. [...] Moloch, in my mind, was very clearly defined. For any given thing, I could tell you confidently whether it was Moloch or not. I can't do that with Ra. Also, Moloch is a single clear concept while Ra seems to be a vague cluster if it's anything. [...]

Rob B: Is there anything confusing or off about the idea that Ra is 'respectability and prestige maintained via surface-level correlates of useful/valuable things that are not themselves useful/valuable (in the context at hand)'? Either for making sense of Sarah's post or for applying the concept to real-world phenomena?

Ruby B: Yes, there is something off about that summary since the original post seems to contain a lot more than "seeking prestige via optimizing for correlates of value rather than actual value". [...] If your summary is at the heart of it, there are some links missing to the "hates introspection", "defends itself with vagueness, confusion, incoherence." [...]

Rob B: There are two ideas here:

(1) "a drive to seek prestige by optimizing for correlates of value that aren't themselves valuable"

(2) "a (particular) drive toward inconsistency / confusion / vagueness / ambiguity"

The connection between these two ideas is this paragraph in Sarah's essay:

"'Respectability' turns out to be incoherent quite often — i.e. if you have any consistent model of the world you often have to take extreme or novel positions as a logical conclusion from your assumptions. To Ra, disrespectability is damnation, and thus consistent thought is suspect."

(1) is the core idea that Sarah wants to point to when she says "Ra". (2) is a particular phenomenon that Sarah claims Ra tends to cause (though obviously lots of other things can cause fuzzy/inconsistent thinking too, and a drive toward such). Specifically, Sarah is defining Ra as (1), and then making the empirical claim that this is a commonplace drive: pursuing any practical or intellectual project sufficiently consistently will at least occasionally require one to either sacrifice epistemics or sacrifice prestige, and the drive is powerful enough that a lot of people do end up sacrificing epistemics when that conflict arises.

Ruby B: Okay, yeah, I can start to see that. Thanks for making it clearer to me, Rob!

Rob B: I think Sarah's essay is useful and coherent, but weirdly structured: she writes a bunch of poetry and mentions a bunch of accidental (and metaphorical, synesthetic, etc.) properties of Ra before she starts to delve into Ra's essential properties. I think part of why I didn't find it confusing was that I skimmed the early sections and got to the later parts of the essay that were more speaking-to-the-heart-of-the-issue, then read it back in reverse order. :P So I got to relatively clear things like the Horus (/ manifest usefulness / value / prestige-for-good-reasons) vs. Ra (empty respectability / shallow indicators of value / prestige-based-on-superficial-correlates-of-excellence) contrast first:

"Horus likes organization, clarity, intelligence, money, excellence, and power — and these things are genuinely valuable. If you want to accomplish big goals, it is perfectly rational to seek them, because they’re force multipliers. Pursuit of force multipliers — that is, pursuit of power — is not inherently Ra. There is nothing Ra-like, for instance, about noticing that software is a fully general force multiplier and trying to invest in or make better software. Ra comes in when you start admiring force multipliers for no specific goal, just because they’re shiny."

And:

"When someone is willing to work for prestige, but not merely for money or intrinsic interest, they’re being influenced by Ra. The love of prestige is not only about seeking 'status' (as it cashes out to things like high quality of life, admiration, sex), but about trying to be an insider within a prestigious institution."

(One of the key claims Sarah makes about respectability and prestige maintained via surface-level correlates of useful/valuable things that are not themselves useful/valuable (/ Ra) is that this kind of respectability accrues much more readily to institutions, organizations, and abstractions than to individuals. Thus a lot of the post is about how idealized abstractions and austere institutions trigger this lost-purposes-of-prestige mindset more readily, which I gather is because it's harder to idealize something concrete and tangible and weak, like an individual person. Or maybe it has to do with the fact that it's harder to concretely visualize the proper function and work of something that's more abstract and large-scale, so it's easier to lose sight of the rationale for what you're seeing?)

"Seen through Ra-goggles, giving money to some particular man to spend on the causes he thinks best is weird and disturbing; putting money into a foundation, to exist in perpetuity, is respectable and appropriate. The impression that it is run collectively, by 'the institution' rather than any individual persons, makes it seem more Ra-like, and therefore more appealing."

All of that stuff makes sense. The earlier stuff from the first 2 sections of the post doesn't illuminate much, I think, unless you already have a more specific sense of what Sarah means by "Ra" from the later sections.

Ruby B: Your restructuring and rephrasing are vastly more comprehensible. That said, poetry and poetic imagery are nice, and I don't begrudge Sarah her attempt.

And given your explanation, perhaps your summary description could be made slightly more comprehensive (though less comprehensible) like so:

"Ra is a drive to seek prestige by optimizing for correlates of value that aren't themselves valuable because you have forgotten the point of the correlates was to attain actual value." [...]

Rob B: Maybe "Ra is a drive to seek prestige by optimizing for correlates of value, in contexts where the correlates are not themselves valuable but this fact is made non-obvious by the correlates' abstract/impersonal/far-mode-evoking nature"?

Rob B's Shortform Feed

From Facebook:

Mark Norris Lance: [...] There is a long history of differential evaluation of actions taken by grassroots groups and similar actions taken by elites or those in power. This is evident when we discuss violence. If a low-power group places someone under their control, it is kidnapping. If they assess their crimes or punish them for it, it is mob justice or vigilantism. [...]

John Maxwell: Does the low power group in question have a democratic process for appointing judges who then issue arrest warrants?

That's a key issue for me... "Mob rule" is bad because the processes mobs use to make their judgements are bad. Doubly so if the mob attacks anyone who points that out.

A common crime that modern mobs accuse people of is defending bad people. But if people can be convicted of defending bad people, that corrupts the entire justice process, because the only way we can figure out if someone really is bad is by hearing what can be said in their defense.

Relevant pre-AGI possibilities

Yet almost everyone agrees the world will likely be importantly different by the time advanced AGI arrives.

Why do you think this? My default assumption is generally that the world won't be super different from how it looks today in strategically relevant ways. (Maybe it will be, but I don't see a strong reason to assume that, though I strongly endorse thinking about big possible changes!)

Evan Hubinger on Inner Alignment, Outer Alignment, and Proposals for Building Safe Advanced AI

A part I liked and thought was well-explained:

I think there's a strong argument for deception being simpler than corrigibility. Corrigibility has some fundamental difficulties: if you're imagining a gradient descent process which is looking at a proxy-aligned model and trying to modify it so that it makes use of this rich input data, it has to do some really weird things to make corrigibility work.

It has to first make a very robust pointer. With corrigibility, if the pointer is aimed at all incorrectly (at the wrong thing in the input data, the wrong thing in the world model), the corrigible optimizer won't correct that pointer. It'll just be like, "Well, I have this pointer. I'm just trying to optimize for what this thing is pointing at," and if that pointer is pointing at a proxy instead, you'll just optimize that proxy. And so you have this very difficult problem of building robust pointers. With deception, you don't have this problem. A deceptive model, if it realizes the loss function is different from what it thought, will just switch to pursuing the new loss function. It's actually much more robust to new information, because it's pursuing the loss function instrumentally. And so in a new situation, if it realizes the loss function is different, it's just going to automatically change, because it'll realize that's the better thing to do instrumentally.
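To make the contrast concrete, here is a toy sketch (my own illustration, not from the podcast; all names are hypothetical): a corrigible optimizer pursues whatever its hard-coded pointer references, so a pointer left aimed at a proxy stays aimed at the proxy, while a deceptive optimizer re-derives the training objective from observation because tracking it is instrumentally useful.

```python
# Toy illustration (not from the podcast) of the pointer-robustness asymmetry.
# All names here are hypothetical stand-ins for the concepts in the transcript.

world_model = {
    "true_objective": "human approval",
    "proxy": "smiles on camera",
}

class CorrigibleAgent:
    """Optimizes whatever its pointer into the world model references."""
    def __init__(self, pointer):
        self.pointer = pointer  # set once by training; never re-derived

    def target(self, observed_loss):
        # Ignores evidence about the actual loss function: if training left
        # the pointer aimed at a proxy, the agent keeps optimizing the proxy.
        return world_model[self.pointer]

class DeceptiveAgent:
    """Pursues the training objective instrumentally, whatever it turns out to be."""
    def target(self, observed_loss):
        # Re-derives the objective from observation each time, so it tracks
        # the loss function without needing a robustly set pointer.
        return observed_loss

corrigible = CorrigibleAgent(pointer="proxy")  # pointer mis-set during training
deceptive = DeceptiveAgent()

print(corrigible.target("human approval"))  # -> 'smiles on camera' (stuck on proxy)
print(deceptive.target("human approval"))   # -> 'human approval' (adapts instrumentally)
```

The asymmetry is the one Hubinger describes: the corrigible agent only behaves well if its pointer was set robustly in the first place, while the deceptive agent gets robustness to new information for free from its instrumental reasoning.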

What are the risks of permanent injury from COVID?

The last time I saw it mentioned that COVID-19 can cause pulmonary fibrosis, it was in the context of autopsies. Do we have any more evidence about whether fibrosis occurs in survivors and, if so, how common it is?

Rob B's Shortform Feed

Devoodooifying Psychology says "the best studies now suggest that the placebo effect is probably very weak and limited to controlling pain".
