Raemon

I've been a LessWrong organizer since 2011, with roughly equal focus on the cultural, practical and intellectual aspects of the community. My first project was creating the Secular Solstice and helping groups across the world run their own version of it. More recently I've been interested in improving my own epistemic standards and helping others to do so as well.

Sequences

Privacy Practices
The LessWrong Review
Keep your beliefs cruxy and your frames explicit
Kickstarter for Coordinated Action
Open Threads
LW Open Source Guide
Tensions in Truthseeking
Project Hufflepuff
Rational Ritual

Comments

Some AI research areas and their relevance to existential safety

Curated, for several reasons.

I think it's really hard to figure out how to help with beneficial AI. Career and research paths vary in how likely they are to help or harm, and in how they fit together. I think many prominent thinkers in AI have developed nuanced takes on how to think about the evolving landscape, but often haven't written up those thoughts.

I like this post both for laying out a lot of object-level thoughts about that, and also for demonstrating a possible framework for organizing those object-level thoughts, and for doing it very comprehensively.  

I haven't finished processing all of the object level points and am not sure which ones I endorse at this point. But I'm looking forward to debate on the various points here. I'd welcome other thinkers in the AI Existential Safety space writing up similarly comprehensive posts about how they think about all of this.

Embedded Interactive Predictions on LessWrong

Holy christ this sure is the highest karma-to-effort ratio I think I've ever gotten.

Sunday 22nd: The Coordination Frontier

Marcin's notes:

  • I use argument mapping software to help with coordination problems on policy issues between different levels of administrative government in Poland. This is an anonymous process, and it has some resemblance to Double Crux. Arguments are put forward anonymously and reviewed with the use of reusable meta-argument structures. It creates transparent chains of reasoning that can be reviewed by anyone.

Sunday 22nd: The Coordination Frontier

Gentzel's links:

Sunday 22nd: The Coordination Frontier

Abram's notes:

  • Kickstarters don’t need a high bar of understanding in the general population to work -- people who understand can join and spread. This seems like a nice property; what other sorts of coordination tech have this? (Counterexample: white flags require common understanding to work)
  • Paths Forward seems real good, particularly for coordination conflicts -- whatever your coordination norms, they should leave some path for improvement, and that path should be explicit (example: “I’m shutting down this conversation right now, because I think it violates important norms, but I will talk to you about it again under the following conditions”)
  • Power grabs were mentioned -- I recently finished The Dictator’s Handbook and have been kinda relating it to everything, so, The Dictator’s Handbook seems relevant
    • Raemon mentioned the importance of leaders being able to declare their own space with higher coordination standards. But he also mentioned that he is now less gung-ho on “Archipelago” and thinks there’s a big bottleneck around shared norms. Someone mentioned concern about this less-archipelago stance being power-grabby.
    • Claim: “elected dictators” are the best of both worlds. You generate common knowledge (amongst the electorate) that the dictator’s coordination power is accepted by the group. Because it’s a democracy the electorate can throw out the leader. But you still have a strong leader.
      • Pirates had this, except the captain’s dictatorship is locked in during battles.
      • In general locking in the dictatorship during “battles” seems important, but also very risky -- the old trick of declaring a state of emergency to lock rule in!
      • Who gets to declare a state of emergency? Need another person, not the captain, to do this!
    • Elections: the original kickstarter
  • I don’t personally buy that archipelago is bad in the way described; it seems like you need these insular communities in order to generate norms much above the sanity waterline

Sunday 22nd: The Coordination Frontier

John’s thoughts:

  • High-value open coordination problems today mainly seem to stem from hard mechanism design problems, scalability problems, modern communities being much larger than Dunbar’s number, high communication overhead on technically difficult problems, etc. It’s not a matter of deciding which of two people’s norm-suggestions to use, it’s a problem of good norms being technically difficult to locate in the first place.
  • On the scalability front, key norm-design constraints include:
    • Does the norm work well (or at least not fail disastrously) if only one person is following it? (RA +1)
    • Memetic fitness of norm
  • How do we tell which norm is better? I don’t mean how do we reach a consensus, I mean how do we ground that judgement? It often just isn’t obvious how well two norms would actually work without trying them both, or how well they would work after some time to adjust vs upfront, or …. It’s too expensive to try them all. Often it’s expensive even just to process an argument. So how does judgement-of-norm-quality get entangled with the actual quality of the norm? How does the map end up matching the territory?
  • Ex.: one of the basic problems of democracy as a government system is that most people are not experts in the vast majority of areas a government is involved in. The more government activity is in specific areas requiring lots of technical background to understand at all, the harder it is for voters to have any meaningful judgement (so they substitute really terrible heuristics, like ingroup/outgroup dynamics). This is a problem of scale, and a problem of modern social/economic systems being far more complicated than historical/ancestral-environment systems.
  • Often we can’t even tell in hindsight which of two norms worked better, without a full-blown RCT. E.g. two communities may have different norms, but did the different norms cause different outcomes, or did success bring about new norms, or were both caused by differences in community initial conditions? Again, government provides loads of examples of this sort of thing - does democracy cause economic success, or does economic success cause democracy, or does some other underlying factor cause both?

Sunday 22nd: The Coordination Frontier

Doe's notes:

  • For time-crunch conflicts you need to defer to the captain. A dictator that everyone sees as high-status/trusts as competent.
  • For two-person conflicts with trust, competence, and time, double crux is good.
  • For two-person conflicts without those, some kind of court system can work. Agree to an arbiter and system you both trust.
  • For larger societies, you need representatives who have the time and energy to deliberate. You can’t expect everybody to pay these costs at all times. There are better and worse ways to do this. A parliament elected by proportional representation or a citizens’ assembly selected by sortition can work well. Parliaments have procedures.
  • Maybe we should survey what social technologies are already working well before trying to invent more from scratch.
  • You can’t negotiate well until you figure out what the other side wants.

Sunday 22nd: The Coordination Frontier

Some random additional concepts that feel relevant to me:

Goodwill Kickstarter – often, my willingness to extend someone goodwill depends on them being willing to extend me goodwill. 

Humans have some default cognitive/emotional machinery for how to coordinate. At least sometimes, this machinery sort of assumes that other people behave similarly to you.

    Anger – anger credibly signals to someone that you might punish them, and that you might punish them again if they do the same action again.

    Frustration – frustration signals to yourself that there is something suboptimal about a situation, which you have the power to fix.

    Grieving – a process by which you come to terms with the fact there is something wrong that you *can't* change.

    Sadness (the kind you tell people about), and being scared (the kind you tell people about) – tools for getting other people to help you (sometimes, helping you with the specific thing you're sad about or afraid of. Other times, simply reassuring you that you are understood, and valued, which signals that you can get help with other problems in the future)

    Paralytic Sadness or Fear – tools for conserving energy or avoiding conflict.

Sunday 22nd: The Coordination Frontier

My own original notes as I prepared for the talk:

Thesis:    

It is possible to learn new coordination principles, with implications for how to optimally interact.

In an ideal world, you'd be able to state the principle and have people go "oh, that seems right", and then immediately coordinate using it. Unfortunately this is often not so easy.

If people don't understand your principles, you won't be able to use them to cooperate with people. (People might be able to make simplified guesses about your principles, but may either overfit or overgeneralize their response and not be able to respond in nuanced ways. They may also just decide you aren't worth interacting with as much)

It's harder to bring up new principles during conflict or high-stakes negotiations, because everyone knows it's possible to use clever arguments to persuade people of things falsely. I know I sometimes pursue "biased fairness", where there might be multiple fair-ish-sounding solutions to a conflict, but I'm incentivized to notice and argue for the one that benefits me. I worry that other people are doing the same. During a conflict, I neither trust myself nor trust other people to be as fair, clear-thinking or impartial as they would be Not-During-A-Conflict.

During _stressful_ conflict, where people are operating in scarcity mindset, I trust them (and myself) even less.

People also just sometimes impose norms on each other in a bullying way that doesn't respect each other at all. The internet is full of people doing this, so people have defense mechanisms against it. I claim this is correct of people.

Thus, if you want to get a frontier principle or norm into the general coordination-toolkit for your community, I recommend:

1. Try to write a public blogpost *before* a major conflict comes up, where people have the ability to think clearly, argue, and mull it over *before* any high stakes implications come up.

2. In some cases, it might still be important to unilaterally enforce a norm or defend a boundary that people don't understand, or disagree with. 

If Alice decides to unilaterally enforce a norm, my suggested "good sportsmanship" rules for doing so are...

  • state "I recognize that Bob doesn't agree with this norm/boundary/principle."
  • state "I'm aware that this increases the cost of Bob interacting with me, and that I have some limited ability to impose this cost on Bob before he [stops being my friend] / [stops working with me] / [becomes less pleasant to interact with] / [possibly decides to impose costs back on me]".
  • depending on Alice's relationship with Bob, she might say "Bob, I'd like it if you tried to understand why this is important to me. I'm willing to put in interpretive labor explaining this if you're also willing to put in interpretive labor listening." 

            (see: "Goodwill Kickstarter")

I'd summarize all this as "Alice can spend social capital on unilaterally enforcing a norm, or commanding people's attention to think more about the norm even if they don't think it makes sense on the face of it. This social capital is limited. Alice can also _gamble_ social capital, where if people end up thinking 'oh, Alice was right to enforce that norm', then Alice gains more capital than she loses, and gets to do it again later."

But, importantly, social capital isn't infinite. If you spend too much social capital, you might find people being less 

I think this all goes more smoothly if it's actually an agreed upon meta-norm than if people are doing it randomly, and if it's made explicit rather than implicit.

One key problem: "social capital" is a vague abstraction that isn't super clearly tracked. 

You also kinda have a different social bank account with different people who care about different things. I think humans are moderately good at tracking this implicitly, but not _that_ good. You might accidentally overspend your bank accounts.     

People might also disagree about how much social capital you have to spend on a thing. If Alice unilaterally imposes a norm on Bob and Charlie, Bob might think this was okay, but Charlie doesn't. And then that can cause conflict, imposing costs on all three people.

There's a fine line between "judiciously spending social capital on things that are important" and "just bullying people into getting your way a lot, and maybe being charismatic enough to get away with it."

Sunday 22nd: The Coordination Frontier

Here are the notes from the conversation.

https://docs.google.com/document/d/11IuD2k0R-KF5n-byg_O9WtIo1cak25lL3g-WBos417w/edit#

I'm copy-pasting them here as LW comments to make them a bit more visible
