Unnamed

Comments, sorted by newest
Diagonalization: A (slightly) more rigorous model of paranoia
Unnamed · 7h

I count 5 strategies in this post & the previous one, rather than 3:

  1. Blinding. Block information input from the adversary to you.
  2. Privacy. Block information output from you to the adversary.
  3. Disempowerment. Don't let the adversary have control over parts of the environment that you care about.
  4. Vindictiveness. Do things that are opposed to the adversary's interests.
  5. Randomness. Do things that are hard for the adversary to predict.

#3 Disempowerment was least explicitly stated in your writing but was present in how you talked about purging / removal from your environment. Examples: Don't have a joint bank account with them, don't appoint them to be in charge of a department in your organization, don't make agreements with them where they have the official legal rights but there's a handshake deal that they'll share things with you.

Truman's response to the Red Scare included all (or at least most) of the first 4 strategies. It was primarily #2 Privacy: the Soviet spies were mainly doing espionage (acquiring confidential information from the US government), and purging them blocked them from getting that information. But Truman was also worried about them doing subversion (getting the US government to make bad decisions), which would make purging them #3 Disempowerment. And executing them (rather than just firing them) makes it #4 Vindictiveness too.

The Madman Theory example in the other post is mainly about vindictiveness (it's a threat to retaliate), even though it's done in a way that involves some randomness.

#5 Randomness feels least like a single coherent thing out of these 5. I'd break it into:

5a Maximin. Do things that work out best in the worst-case scenario. This often involves a mixed strategy where you randomize across multiple possible actions (assuming you have a hidden source of randomness); there's a small sketch of the standard zero-sum version below.

5b Erraticness. Thwart their expectations. Don't do the thing that they're expecting you to do, or do something that they wouldn't have expected.

Though #5b Erraticness seems like an actively bad idea if you have been fully diagonalized, since in that case you won't actually succeed at thwarting their expectations and your erratic action will instead be just what they wanted you to do. It is instead a strategy for cat-and-mouse games where they can partially model you but you can still hope to outsmart them.

If you have been diagonalized, it's better to limit your repertoire of actions. Choose inaction where possible, stick to protocol, don't do things that are out of distribution. The smaller the set of actions that you ever do, the fewer options the diagonalizer has for what to get you to do. A hacker gets a computer system into a weird edge case, a social engineer gets someone to break protocol, a jailbreaker gets an LLM into an out-of-distribution state. An aspiring diagonalizer also wants to influence the process that you use to make decisions, and falling back on a pre-existing protocol can block that influence. I would include this on my list of strategies, maybe #6 Act Conservatively.
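
To make 5a concrete: here's a minimal sketch (my own illustration, not from the post) of computing a maximin mixed strategy for a small zero-sum game as a linear program. The payoff matrix is matching pennies, chosen because the maximin answer is the familiar one: randomize 50/50, so that an adversary who can predict your strategy (but not your random draw) gains nothing.

```python
# Maximin for a zero-sum game: maximize v subject to x >= 0, sum(x) = 1,
# and (x^T A)_j >= v for every adversary action j. Illustrative only.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, -1.0],   # your payoffs: rows = your actions,
              [-1.0, 1.0]])  # columns = the adversary's actions (matching pennies)

m, n = A.shape
c = np.zeros(m + 1)
c[-1] = -1.0                               # variables [x_1..x_m, v]; minimize -v
A_ub = np.hstack([-A.T, np.ones((n, 1))])  # v - sum_i x_i * A[i, j] <= 0 for all j
b_ub = np.zeros(n)
A_eq = np.hstack([np.ones((1, m)), np.zeros((1, 1))])  # probabilities sum to 1
b_eq = np.array([1.0])

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=[(0, None)] * m + [(None, None)])
print("maximin mixed strategy:", res.x[:m])  # -> [0.5, 0.5]
print("guaranteed value:", res.x[-1])        # -> 0.0
```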

Looking back through these, most of them aren't that specific to diagonalization scenarios. Strategies 4 (Vindictiveness) & 5a (Maximin) are standard game theory which come up in lots of contexts. I think that strategies 1-3 fall out of a fairly broad sense of what it means for someone to be an adversary - they are acting contrary to your interests, in a way that's entangled with you; they're not just off somewhere else doing things you don't like, they are in some way using you to get more of the thing that's bad for you. In what ways might they be using you to get more of the thing? Maybe they're getting information from you which they can then use for their purposes, maybe they're trying to influence what you do so you do what they want, maybe you've let them have control over something which you could have disallowed. Strategies 1 (Blinding), 2 (Privacy), and 3 (Disempowerment) just involve undoing/blocking one of those.

Paranoia: A Beginner's Guide
Unnamed · 8h

That helped give me a better sense of where you're coming from, and more of an impression of what the core thing is that you're trying to talk about. Especially helpful were the diagonalization model at the end (which I see you have now made into a separate post) and the part about "paranoia to me is centrally invoked by high-bandwidth environments that are hard to escape from" (while gesturing at a few examples, including you at CEA). Also your exchange elsewhere in the comments with Richard.

I still disagree with a lot of what you have to say, and agree with most of my original bullet points (though I'd make some modifications to #2 on your division into three strategies and #6 on selective skepticism). Not sure what the most productive direction is to go from here. I have some temptation to get into a big disagreement about covid, where I think I have pretty different models than you do, but that feels like it's mainly a tangent. Let me instead try to give my own take on the central thing:

The central topic is situations where an adversary may have compromised some of your internal processes, especially when it's not straightforward to identify what they've compromised, fix your processes, or remove their influence. There's a more theoretical angle on this, which focuses on what strategies are good responses to these sorts of situations, potentially even what the optimal response is to a sufficiently well-specified version of this kind of scenario. And there's a more empirical angle, which focuses on what people in fact do when they think they might be in this sort of situation, which could include major errors (including errors in identifying what situation you're in, e.g. how much access/influence/capability the adversary has in relation to you, or how adversarial the relationship is), though it probably often involves responses that are at least somewhat appropriate.

This is narrower than what you initially described in this post (being in an environment with competent adversaries) but broader than diagonalization (which is the extreme case of having your internal processes compromised, where you are fully pwned). Though possibly this is still too broad, since it seems like you have something more specific in mind (but I don't think that narrowing the topic to full diagonalization captures what you're going for).

Paranoia: A Beginner's Guide
Unnamed · 3d

I downvoted this post because it felt slippery. I kept running into parts that didn't fit together or otherwise seemed off.

If this were a google doc I might leave a bunch of comments quickly pointing to examples. I guess I can do that in list format here.

  • The post highlights the market for lemons model, but then the examples keep not fitting the lemons setup. Covid misinformation wasn't an adverse selection problem, nor was having spies in the government, nor was the Madman Theory situation.
  • "there are roughly three big strategies" is a kind of claim that I generally start out skeptical of, and this post failed to back that one up
  • The description of the CDC and what happened during covid seems basically inaccurate, e.g. that's not how concerns about surfaces played out
  • The Madman Theory example doesn't feel like an example of the Nixon administration being in an adversarial information situation or being paranoid. It's trying to make threats in a game theory situation.
  • From the way you talk about the 3 strategies, I get the sense that you're saying: when you're in an adversarial information scenario, here are 3 options to consider. But in the examples of each strategy, the other 2 strategies don't really make sense. They are structurally different scenarios.
  • A thing that I think of as paranoia centrally involves selective skepticism: having some target that you distrust, but potentially being very credulous towards other sources of views on that topic (such as an ingroup that also identifies that target as untrustworthy), or engaging in confirmatory reasoning about your own speculative theories that don't match the untrusted target. That's missing from your theory and your examples.
  • The thing you're calling "blinding" includes approaches that seem pretty different. A source I distrust is claiming X, and X seems like the sort of thing they'd want me to believe, so I'll (a) doubt X, (b) find some other method of figuring out whether X is true that doesn't rely on that source, or (c) go someplace else so that X doesn't matter to me. I associate paranoia mainly with (a), or with doing an epistemically atrocious job of (b), or a self-deceiving variant of (c).
  • Despite all the examples, there's a lack of examples of the core thing: here's an epistemically adversarial situation, here's a person being paranoid, here's what that looks like, here's how that's relatively appropriate/understandable even if not optimal, (perhaps also) here's how that involves a bunch of costs/badness (from the concluding paragraph this is maybe part of the core, though that wasn't apparent before that point).
8 Questions for the Future of Inkhaven
Unnamed · 4d

As a reader, I wish there were more filtering or signal boosting to help bring some Inkhaven posts to my attention.

There are a few ways that could happen. It could be something reddit-like where there's a centralized place which at least has links to all the Inkhaven posts and people can upvote them. It could be something like LW curation where some moderators pick a few posts to curate (possibly some of them could even be cross-posted and curated on LW). It could be a linkpost-style thing (as Vaniver has done some of) where people post links to some of their favorite Inkhaven posts.

I could imagine setting up Inkhaven with the intention of having the residents do linkposts. Maybe each Sunday is linkpost day, when residents are encouraged to make their daily post a linkpost (with no word requirement) where they link to 1-3 of their best posts from the past week, 3-10 other Inkhaven posts from the past week that they liked, and optionally a few things from elsewhere. Then on Monday there could be a centralized roundup post which links to all of those linkposts and all the posts which got multiple recommendations in those linkposts.

Experiment: Test your priors on Bernoulli processes.
Unnamed · 1mo

Explanation:

Hypothesis 1: The data are generated by a beta-binomial distribution, where first a probability x is drawn from a beta(a,b) distribution, and then 5 experiments are run using that probability x. I had my coding assistant write code to solve for the a,b that best fit the observed data and show the resulting distribution for that a,b. It gave (a,b) = (0.6032,0.6040) and a distribution that was close but still meaningfully off given the million experiment sample size (most notably, only .156 of draws from this model had 2 R's compared with the observed .162).

Hypothesis 2: With probability c the data points were drawn from a beta-binomial distribution, and with probability 1-c the experiment instead used p=0.5. This came to mind as a simple process that would result in more experiments with exactly 2 R's out of 4. With my coding assistant writing the code to solve for the 3 parameters a,b,c, this model came extremely close to the observed data - the largest error was .0003 and the difference was not statistically significant. This gave (a,b,c) = (0.5220,0.5227,0.9237).

I could have stopped there, since the fit was good enough that anything else I'd do would probably only differ in its predictions after a few decimal places, but instead I went on to Hypothesis 3: the beta distribution is symmetric with a=b, so the probability is 0.5 with probability 1-c and drawn from beta(a,a) with probability c. I solved for a,c with more sigfigs than my previous code used (saving the rounding till the end), and found that it was not statistically significantly worse than the asymmetric beta from Hypothesis 2. I decided to go with this one because on priors a symmetric distribution is more likely than an asymmetric distribution that is extremely close to being symmetric. Final result: draw from a beta(0.5223485278, 0.5223485278) distribution with probability 0.9237184759 and use p=0.5 with probability 0.0762815241. This yields the above conditional probabilities out to 6 digits.
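
If anyone wants to replicate this, here's a minimal sketch of the Hypothesis 2 fit (my reconstruction, not the code my assistant actually wrote). The observed frequencies below are placeholders to be swapped for the real table (only the .162 for 2 R's appears above), n = 4 is my guess at the per-experiment trial count being fit (swap in 5 if the fit used all five draws), and I've used least squares on the cell frequencies as a stand-in for whatever objective the actual fit used.

```python
# Fit the mixture: with prob c, p ~ Beta(a, b) and then k ~ Binomial(n, p);
# with prob 1-c, k ~ Binomial(n, 0.5). Illustrative reconstruction only.
import numpy as np
from scipy.special import betaln, gammaln
from scipy.optimize import minimize

n = 4
observed = np.array([0.27, 0.16, 0.162, 0.16, 0.248])  # placeholder frequencies

def log_binom(n, k):
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

def mixture_pmf(k, a, b, c):
    # Beta-binomial component plus a point mass at p = 0.5
    bb = np.exp(log_binom(n, k) + betaln(k + a, n - k + b) - betaln(a, b))
    half = np.exp(log_binom(n, k)) * 0.5 ** n
    return c * bb + (1 - c) * half

def loss(params):
    a, b, c = params
    if a <= 0 or b <= 0 or not 0 <= c <= 1:
        return np.inf
    ks = np.arange(n + 1)
    return np.sum((mixture_pmf(ks, a, b, c) - observed) ** 2)

fit = minimize(loss, x0=[0.6, 0.6, 0.9], method="Nelder-Mead")
print(fit.x)  # fitted (a, b, c)
```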

OpenAI #15: More on OpenAI’s Paranoid Lawfare Against Advocates of SB 53
Unnamed · 1mo

> Chris Lehane, the inventor of the original term ‘vast right wing conspiracy’ back in the 1990s to dismiss the (true) allegations against Bill Clinton by Monica Lewinsky

This is inaccurate in a few ways.

Lehane did not invent the term "vast right wing conspiracy", AFAICT; Hillary Clinton was the first person to use that phrase in reference to criticisms of the Clintons, in a 1998 interview. Some sources (including Lehane's Wikipedia page) attribute the term to Lehane's 1995 memo Communication Stream of Conspiracy Commerce, but I searched the memo for that phrase and it does not appear there. Lehane's Wikipedia page cites (and apparently misreads) this SFGate article, which discusses Lehane's memo in connection with Clinton's quote but does not actually attribute the phrase to Lehane.

The memo's use of the term "conspiracy" was about how the right spread conspiracy theories about the Clintons, not about how the right was engaged in a conspiracy against the Clintons. Its primary example involved claims about Vince Foster which it (like present-day Wikipedia) described as "conspiracy theories" (as you can see by searching the memo for the string "conspirac").

Also, Lehane's memo was published in July 1995 which was before the Clinton-Lewinsky sexual relationship began (Nov 1995), and so obviously wasn't a response to allegations about that relationship.

Lehane's memo did include some negative stories about the Clintons that turned out to be accurate, such as the Gennifer Flowers allegations. So there is some legitimate criticism of Lehane's memo, including how it presented all of these negative stories as part of a pipeline for spreading unreliable allegations about the Clintons, and didn't take seriously the possibility that they might be accurate. But it doesn't look like his work was mainly focused on dismissing true allegations.

The Most Common Bad Argument In These Parts
Unnamed · 1mo

> Exhaustive Free Association is a step in a chain of reasoning where the logic goes "It's not A, it's not B, it's not C, it's not D, and I can't think of any more things it could be!"[1] Once you spot it, you notice it all the damn time.

This description skips over the fallacy part of the fallacy. On its own, the sentence in quotes sounds like a potentially productive contribution to a discussion.

Experiment: Test your priors on Bernoulli processes.
Unnamed · 1mo

[0.111019, 0.324513, 0.5, 0.675487, 0.888981]
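
For anyone curious how these numbers fall out of the model described in my other comment on this post, here's a minimal sketch, assuming the task is to predict the 5th draw given the number of R's among the first 4. Within the Beta(a, a) component, the posterior after k R's in 4 trials is Beta(a + k, a + 4 - k), with predictive mean (a + k)/(2a + 4); the p = 0.5 component always predicts 0.5; and the two predictions are weighted by each component's posterior probability given k.

```python
# Reproduce the conditional probabilities from the fitted mixture:
# Beta(a, a) with prob c, p = 0.5 with prob 1 - c. Illustrative sketch.
import numpy as np
from scipy.special import betaln, gammaln

a = 0.5223485278   # fitted Beta(a, a) parameter
c = 0.9237184759   # fitted weight on the beta-binomial component

def log_binom(n, k):
    return gammaln(n + 1) - gammaln(k + 1) - gammaln(n - k + 1)

for k in range(5):
    # Likelihood of k R's in 4 trials under each component
    p_beta = np.exp(log_binom(4, k) + betaln(k + a, 4 - k + a) - betaln(a, a))
    p_half = np.exp(log_binom(4, k)) * 0.5 ** 4
    # Posterior-weighted predictive probability of R on the 5th draw
    pred = (c * p_beta * (a + k) / (2 * a + 4) + (1 - c) * p_half * 0.5) \
           / (c * p_beta + (1 - c) * p_half)
    print(k, round(float(pred), 6))
# -> 0.111019, 0.324513, 0.5, 0.675487, 0.888981
```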

The Counterfactual Quiet AGI Timeline
Unnamed · 1mo

My leading guess is that a world without Yudkowsky, Bostrom, or any direct replacement looks a lot more similar to our actual world, at least by 2025. Perhaps: the exact individuals and organizations (and corporate structures) leading the way are different, progress is a bit behind where it is in our world (perhaps by 6 months to a year at this point), there is less attention to the possibility of doom and less focus on alignment work.

One thing that Yudkowsky et al. did is to bring more attention to the possibility of superintelligence and what it might mean, especially among the sort of techy people who could play a role in advancing ML/AI. But without them, the possibility of thinking machines was already a standard topic in intro philosophy classes, the Turing test was widely known, Deep Blue was a major cultural event, AI and robot takeover were standard topics in sci-fi, Moore's law was widely known, people like Kurzweil and Moravec were projecting when computers would pass human capability levels, and various people were trying to do what they could with the tech that they had. A lot of AI stuff was already in the groundwater, especially for that same sort of techy crowd. So in nearby counterfactual worlds, as there are advances in neural nets they still have ideas like trying to get these new & improved computers to be better than humans at Go, or to be much better chatbots.

Yudkowsky was also involved in networking, e.g. helping connect founders & funders. But that seems like a kind of catalyst role that speeds up the overall process slightly, rather than summoning it where it otherwise would be absent. The specific reactions that he catalyzed might not have happened without him, but it's the sort of thing where many people were pursuing similar opportunities and so the counterfactual involves some other combination of people doing something similar, perhaps a bit later or a bit less well.

High-level actions don’t screen off intent
Unnamed · 2mo

> e.g., Betty could cause one more girl to have a mentor either by volunteering as a Big Sister or by donating money to the Big Sisters program.

In the case where she volunteers and mentors the girl directly, it takes lots of bits to describe her influence on the girl being mentored. If you try to stick to the actions->consequences framework for understanding her influence, then Betty (like a gamer) is engaging in hundreds of actions per minute in her interactions with the girl - body language, word choice, tone of voice, timing, etc. What the girl gets out of the mentoring may not depend on every single one of these actions but it probably does depend on patterns in these micro-actions. So it seems more natural to think about Betty's fine-grained influence on the girl she's mentoring in terms of Betty's personality, motivations, etc., and how well she and the girl she's mentoring click, rather than exclusively trying to track how that's mediated by specific actions. If you wanted to know how the mentoring will go for the girl, you'd probably have questions about those sorts of things - "What is Betty like?", "How is she with kids?", etc.

In the case where Betty donates the money, the girl being mentored will still experience the mentoring in full detail, but most of those details won't be coming directly from Betty so Betty's main role is describable with just a few bits (gave $X which allowed them to recruit & support one more Big Sister). e.g., For the specific girl who got a mentor thanks to Betty's donation, it probably doesn't make any difference what facial expression Betty was making as she clicked the "donate" button, or whether she's kind or bitter at the world. Though there are still some indirect paths to Betty influencing fine-grained details for girls who receive Big Sisters mentoring, as the post notes, since the organization could change its operations to try to appeal to potential donors like Betty.

Posts

  • 29 · Using smart thermometer data to estimate the number of coronavirus cases · 6y · 8 comments
  • 11 · Case Studies Highlighting CFAR’s Impact on Existential Risk · 9y · 1 comment
  • 53 · Results of a One-Year Longitudinal Study of CFAR Alumni · 10y · 35 comments
  • 23 · The effect of effectiveness information on charitable giving · 12y · 0 comments
  • 31 · Practical Benefits of Rationality (LW Census Results) · 12y · 5 comments
  • 56 · Participation in the LW Community Associated with Less Bias · 13y · 50 comments
  • 14 · [Link] Singularity Summit Talks · 13y · 3 comments
  • 26 · Take Part in CFAR Rationality Surveys · 13y · 4 comments
  • 2 · Meetup : Chicago games at Harold Washington Library (Sun 6/17) · 13y · 0 comments
  • 2 · Meetup : Weekly Chicago Meetups Resume 5/26 · 14y · 0 comments
Wikitag Contributions

  • Alief · 4 years ago · (+111/-18)
  • History of Less Wrong · 5 years ago · (+154/-92)
  • Virtues · 5 years ago · (+105/-99)
  • Time (value of) · 5 years ago · (+117)
  • Aversion · 14 years ago
  • Less Wrong/2009 Articles/Summaries · 15 years ago · (+20261)
  • Puzzle Game Index · 15 years ago · (+26)