You Get About Five Words


by Raemon

2 min read · 12th Mar 2019 · 77 comments

201

Common Knowledge, Public Discourse

Frontpage

Cross posted from the EA Forum.

Epistemic Status: all numbers are made up and/or sketchily sourced. Post errs on the side of simplistic poetry – take seriously but not literally.

If you want to coordinate with one person on something nuanced, you can spend as much time as you want talking to them – answering questions in realtime, addressing confusions as you notice them. You can trust them to go off and attempt complex tasks without as much oversight, and you can decide to change your collective plans quickly and nimbly.

You probably speak at around 100 words per minute. That's 6,000 words per hour. If you talk for 3 hours a day, every workday for a year, you can communicate 4.3 million words worth of nuance.
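As a rough check on that arithmetic, here is a minimal sketch in Python. The speaking rate, hours per day, and 4.3 million total come from the paragraph above; the 240-workday year is an assumption added here (roughly 48 five-day weeks), since the post does not say how many workdays it counts.

```python
# Figures stated in the post.
WORDS_PER_MINUTE = 100
HOURS_PER_DAY = 3
TOTAL_WORDS_CLAIMED = 4_300_000

# Assumption (not in the post): about 240 workdays in a year.
ASSUMED_WORKDAYS_PER_YEAR = 240

words_per_hour = WORDS_PER_MINUTE * 60          # 6,000
words_per_day = words_per_hour * HOURS_PER_DAY  # 18,000

# How many workdays the post's total implies at this rate.
implied_workdays = TOTAL_WORDS_CLAIMED / words_per_day

# Yearly total under the assumed workday count.
words_per_year = words_per_day * ASSUMED_WORKDAYS_PER_YEAR

print(f"words per hour: {words_per_hour:,}")
print(f"words per day: {words_per_day:,}")
print(f"workdays implied by 4.3 million words: {implied_workdays:.0f}")
print(f"words per year at {ASSUMED_WORKDAYS_PER_YEAR} workdays: {words_per_year:,}")
```

Under that assumption the yearly total lands within about one percent of the 4.3 million figure, which is all the precision the post's epistemic status asks for.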

You can have a real conversation with up to 4 people.

(Last year the small organization I work at considered hiring a 5th person. It turned out to be very costly and we decided to wait, and I think the reasons were related to this phenomenon)

If you want to coordinate on something nuanced with, say, 10 people, you realistically can ask them to read a couple books worth of words. A book is maybe 50,000 words, so you have maybe 200,000 words worth of nuance.
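To make the scale of the drop concrete, here is a small sketch comparing this budget with the one-on-one figure from earlier in the post. All three numbers are the post's own rough estimates; the variable names are just illustrative.

```python
# Rough figures from the post (explicitly order-of-magnitude only).
WORDS_PER_BOOK = 50_000
TEN_PERSON_BUDGET_WORDS = 200_000
ONE_ON_ONE_BUDGET_WORDS = 4_300_000

# How many 50,000-word books the 10-person budget corresponds to.
books_equivalent = TEN_PERSON_BUDGET_WORDS / WORDS_PER_BOOK

# How much smaller the 10-person budget is than the one-on-one budget.
shrink_factor = ONE_ON_ONE_BUDGET_WORDS / TEN_PERSON_BUDGET_WORDS

print(f"books' worth of shared words: {books_equivalent:.0f}")
print(f"one-on-one budget is about {shrink_factor:.1f}x larger")
```

Taken at face value, 200,000 words is about four books at the stated length, and roughly a twentieth of what a year of one-on-one conversation allows.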

Alternately, you can monologue at people, scaling a conversation past the point where people realistically can ask questions. Either way, you need to hope that your books or your monologues happen to address the particular confusions your 10 teammates have.

If you want to coordinate with 100 people, you can ask them to read a few books, but chances are they won't. They might all read a few books worth of stuff, but they won't all have read the same books. The information that they can be coordinated around is more like "several blogposts." If you're trying to coordinate nerds, maybe those blogposts add up to one book because nerds like to read.

If you want to coordinate 1,000 people... you realistically get one blogpost, or maybe one blogpost worth of jargon that's hopefully self-explanatory enough to be useful.

If you want to coordinate thousands of people...

You have about five words.

This has ramifications on how complicated a coordinated effort you can attempt.

What if you need all that nuance and to coordinate thousands of people? What would it look like if the world was filled with complicated problems that required lots of people to solve?

I guess it'd look like this one.


[-]Benquo5y580

You're massively underestimating the upper bound.

I've interacted a bunch recently with members of a group of about 2 million people who recite a 245-word creed twice daily, and assemble weekly to read from an 80,000 word text such that the whole text gets read annually. This is nowhere near a complete accounting of engagement with verbal canon within the group. Each of these practices is preceded and followed by an additional standardized text of substantial length, and many people study full-time a much larger canonical text claiming to interpret the core text.

They also engage in behavior patterns that, while they don't necessarily reflect detailed engagement by each person with the content of the core text, do reflect a lot of fine-grained responsiveness to the larger interpretive canon.

You might be closer for what can be done very quickly (within a single generation) under current conditions. But a political movement that plenty of people are newly worried about, and which likely has thousands of members, has a 14-word creed.

[-]Raemon5y170

Nod. Social pressure and/or organizational efforts to read a particular thing together (esp. in public where everyone can see that everyone else is reading) does seem like a thing that would work.

It comes with drawbacks such as "if it turns out you need to change the 80,000 word text because you picked the wrong text or need to amend it, I expect there to be a lot of political drama surrounding that, and the process by which people build momentum towards changing it probably would be subject to the bandwidth limits I'm pointing to [edit: unless the organization has specifically built in tools to alleviate that]."

(Reminder that I specifically said "all numbers are made up and/or sketchily sourced". I'm pointing to order of magnitude. I did consider naming this blogpost "you have about five words" or "you have less than seven words". I think it was a somewhat ironic failure of mine that I went with "you have four words" since it degrades less gracefully than "you have about five words.")

5Benquo5y

14 is still half an order of magnitude above 5, and I don't think neo-Nazis are particularly close to the most complex coordination thousands of people can achieve with a standardized set of words.

9Raemon5y

I suppose, but, again, "all numbers are made up" was the first sentence in this post, and half an order of magnitude feels within bounds of "the general point of the essay holds up." I also don't currently know of anyone writing on LessWrong or the EA Forum who should have reason to believe they are as coordinated as the neo-Nazis are here. (See elsethread comment on my take on the state of EA coordination, which was the motivation for this post). (In Romeo's terms, the neo-Nazis are also using a social tech with unfolding complexity, where their actual coordinated action is "recite the pledge every day", which lets them encode additional information. But to get this you need to spend your initial coordinated action on that unfolding action)

1CharlieHorse2y

Are you talking about Judaism?

[-]drethelin5y380

Walmart coordinates 2.2 million people directly and millions more indirectly.

Even the Boy Scouts coordinate 2.7 million.

Religions coordinate, to a greater or lesser extent, far more.

The key to coordination is to not consider yourself as an individual measuring out a ration of words you can force x number of people to read. Most people never read the bible.

[-]ryan_b5y320

These are good examples that drive the point home.

Most people never read the bible.

They don't coordinate based on the nuanced information in it, either. Mostly they coordinate on a few very short statements, like:

Say you are Christian.

Go to church.

A much smaller group of people coordinates on a few more:

Give money to the church.

Run a food drive OR help build houses OR staff a soup kitchen OR ...

The Walmart example seems a little different, because it isn't as though working at Walmart is that different from any other kind of hourly employment. Mostly all employers try to get people to coordinate on a few crucial things:

Show up on time.

Count the money correctly.

Stock the shelves.

Sweep the floor.

And it seems to me there is never a shortage of preachers or employers complaining about people's inability to do even these basic things.

It looks to me like successful coordination on the scale of millions largely amounts to iterating four-word actions.

[-]romeostevensit5y210

Agree, and I'd roll in the incentives more closely. It feels more like:

you have at most space for a few feedback loops

you can improve this by making one of the feedback loops a checklist that makes calls out to other feedback loops

the tighter and more directly incentivized the feedback loop, the more you can pack in

every employer/organization is trying to hire/recruit people who can hold more feedback loops at once and do some unsupervised load balancing between them

you can make some of people's feedback loops managing another person's feedback loops

Now jump to this post https://slatestarcodex.com/2017/11/09/ars-longa-vita-brevis/

another frame is that instead of thinking about how many bits you can successfully transmit, think about whether the behaviors implied by the bits you transmit can run in loops, whether the loops are supervised or unsupervised and what range of noise they remain stable under.

4Matt Goldenberg5y

I didn't make the leap from bits of information to feedback loops but it makes intuitive sense. Transmitting information that compresses by giving you the tools to figure out the information yourself seems useful.

[-]Raemon5y160

Heh, "read the sequences" clocks in at 3 words.

2Gunnar_Zarncke1y

Doesn't include which sequences.

4Raemon1y

In the original context this means Eliezer's posts from 2007 to 2009 (which got compiled as The Sequences and later recompiled into the half-as-long Rationality A-Z)

[-]Raemon5y160

The point is not "rationing out your words" is the correct way to coordinate people. The point is that you need to attend, as part of your coordination strategy, to the fact that most people won't read most of your words. Insofar as your coordination strategy relies on lots of people hearing an idea, the idea needs to degrade gracefully as it loses bandwidth.

Walmart I expect to do most of its coordination via oral tradition. (At the supermarket I worked at, I got one set of cultural onboarding from the store manager, who gave a big speech... which began and ended with a reminder that "the four virtues of the Great Atlantic and Pacific Tea company are integrity, respect, teamwork and responsibility." Then, I learned most of the minutiae of how to run a cash register, do janitorial duties or be a baker via on-the-job training, by someone who spent several weeks telling me what to do and giving me corrective feedback)

(Several years later, I have some leftover kinesthetic knowledge of how to run a cash register, and the dangling words "integrity, respect, teamwork, responsibility" in my head, although also I probably only have that because I thought the virtues were sort of funny and wrote a song about it)

5catherio5y

The recent EA meta fund announcement linked to this post (https://www.centreforeffectivealtruism.org/blog/the-fidelity-model-of-spreading-ideas), which highlights another parallel approach: in addition to picking idea expressions that fail gracefully, prefer transmission methods that preserve nuance.

[-]DanielFilan3y290Review for 2019 Review

I think this post, as promised in the epistemic status, errs on the side of simplistic poetry. I see its core contribution as saying that the more people you want to communicate to, the less you can communicate to them, because the marginal people aren't willing to put in work to understand you, and because it's harder to talk to marginal people who are far away and can't ask clarifying questions or see your facial expressions or hear your tone of voice. The numbers attached (e.g. 'five' and 'thousands of people') seem to not be super precise.

That being said: the numbers are the easiest thing to take away from this post. The title includes the words 'about five' but not the words 'simplified poetry'. And I'm just not sure about the numbers. The best part of the post is the initial part, which does a calculation and links to a paper to support an order-of-magnitude calculation on how many words you can communicate to people. But as the paragraphs go on, the justifications get less airtight, until it's basically an assertion. I think I understand stylistically why this was done, but at the end of the day that's the trade-off that was made.

So a reader of this post has to ask themselves... (read more)

[-]Raemon3y100

The aspiring-rigorous-next-post I hope to write someday is called "The Working Memory Hypothesis", laying out more concretely that at some maximum scale, your coordination-complexity is bottlenecked on a single working-memory-cluster, which (AFAICT based on experience and working memory research) amounts to 3-7 chunks of concepts that people already are familiar with.

So, I am fairly confident that in the limit it is actually about 5 words +/- 2, because Working Memory Science and some observations about what slogans propagate. (But, am much less sure about how fast the limit approaches and what happens along the way)

8DanielFilan3y

Aren't working memory chunks much bigger than one word each, at least potentially?

9Raemon3y

I think if you end up having a chunk that you use repeatedly and need to communicate about, it ends up turning into a word. (like, chunks are flexible, but so are words)

4DanielFilan3y

To me, this suggests a major change to the message of the post. Reading it, I'd think that I have five samples from the bank of existing words, but if the constraint is just that I have five concepts that can eventually be turned into words, that's a much looser constraint!

2Raemon3y

Not 100% sure I understand the point, but for concepts-you-can-communicate, I think you are bottlenecked on already-popular-words. Chunks and words don't map perfectly. But... word-space is probably mostly a subset of chunk-space? I think wordless chunks matter for intellectual progress, where an individual thinker might have juuuust reached the point where they've distilled a concept in their head down into a single chunk, so they can then reason about how that fits with other concepts. But, if they want to communicate about that concept, they'll need to somehow turn it into words.

2DanielFilan3y

Is the claim that before I learn some new thing, each of my working memory slots is just a single word that I already know? Because I'm pretty sure that's not true.

4Raemon3y

First: the epistemic status of this whole convo is "thing Ray is still thinking through and is not very sure about." Two, for your specific question: No, my claim is that wordspace is a (mostly) subset of chunkspace, not the other way round. My claim is something like "words are chunks that you've given a name", but you can think in chunks that have not been given names. Three: I'm not taking that claim literally, I'm just sorta trying it out to see if it fits, and where it fails. I'm guessing it'll fail somewhere but I'm not actually sure where yet. If you can point to a concrete way that it fails to make sense that'd be helpful. But, insofar as I'm running with this idea: An inventor who is coming up with a new thing might be working entirely with wordless chunks, that they invent, combine them into bigger ideas, compress into smaller chunks, without ever being verbalized or given word form.

3ryan_b3y

This part points pretty directly at research debt and inferential distance, where the debt is how many of these chunks need to be named and communicated as chunks, and the distance is how many re-chunking steps need to be done.

3Raemon3y

Thinking a little more: I think when I'm parsing a written sentence, words are closer to a one-word-to-one-chunk correspondence. When I'm thinking them, I think groups of words tend to be more like a chunk. "Politics is the mindkiller" might collapse into a single slot that I'm not looking at at super-high resolution, allowing me to reason something like "'Politics is the mindkiller' is an incomplete idea."

2DanielFilan3y

If wordspace is a subset of chunkspace and not the other way around, and you have about five chunks, do you agree that you do not have about five words, but rather more?

3Raemon3y

Yes, although I've heard mixed things about how many chunks you actually have, and that the number might be more like 4. Also, the ideas often get propagated in conjunction with other ideas. I.e. people don't just say "politics is the mindkiller", they say "politics is the mindkiller, therefore X" (where X is whatever point they're making in the conversation). And that sentence is bottlenecked on total comprehensibility. So, basically the more chunks you're using up with your core idea, the more you're at the mercy of other people truncating it when they need to fit other ideas in. I'd argue "politics is the mindkiller" is two chunks initially, because people parse "is" and "the" somewhat intuitively or fill them in. Whereas Avoid Unnecessary Political Arguments is more like 4 chunks. I think you typically need at least 2 chunks to say something meaningful, although maybe not always. Once something becomes popular it can eventually compress down to 1 chunk. But, also, I think "sentence complexity" is not only bottlenecked on chunks. "Politics is the mindkiller" can be conceptually one chunk, but it still takes a bunch of visual or verbal space up while parsing a sentence that makes it harder to read if it's only one clause in a multi-step argument. I'm not 100% sure if this is secretly still an application of working memory, or if it's a different issue.

2Raemon3y

Continuing to babble down this thought-trail: I'm wondering how Gendlin Focusing interacts with working memory. I think the first phase of focusing is pre-chunk, as well as pre-verbal. You're noticing a bunch of stuff going on in your body. It's more of a sensation than a thought. The process of focusing is trying to get those sensations into a form your brain can actually work with and think about. I... notice that focusing takes basically all my concentration. I think at some part of the process it's using working memory (and basically all of my working memory). But I'm not sure when that is. One of the things you do in focusing is try to give your felt-sense a bunch of names and see if they fit, and notice the dissonance. I think when this process starts, the felt-sense is not stored in chunk form. I think Gendlin Focusing might be a process where a) first I'm trying to feel out a bunch of felt-data that isn't even in chunk form yet, and b) I sort of feel it out, while trying different word-combos on it. Meanwhile it's getting more solid in my head. I think it's... slowly transitioning from wordless non-chunks into wordless chunks, and then when I finally find the right name that describes it I'm like "ah, that's it", and then it simultaneously solidifies into one-or-more chunks I can store properly in working memory, and also gets a name. (The name might be multiple words, and depending on context those words could correspond to one chunk or multiple)

5ryan_b3y

Not about Gendlin, but following the trail of relating chunks to other things: I wonder if propaganda or cult indoctrination can be described as a malicious chunking process. I've weighed in against taking the numbers literally elsewhere, but following this thread I suddenly wondered if the work that using few words was doing isn't delivering the chunk, but rather screening out any alternative chunk. If what we are interested in is common knowledge, it isn't getting people to develop a chunk per se that is the challenge; rather everyone has to agree on exactly which chunk everyone else is using. This sounds much more like the work of a filter than a generator. When I thought about it in those terms, it occurred to me that it is perfectly possible to drive this in any direction at all; we aren't even meaningfully constrained by reality. This feels obvious in retrospect - there've been lots of times when common knowledge was utterly wrong - but doing that on purpose never occurred to me. So now it feels like what cults do, and why they sound so weird to everyone outside of them, is deliberately create a different sequence of chunks for normal things for the purpose of having different chunks. Once that is done, the availability heuristic will sustain communication on that basis, and the artificially-induced inferential distance will tend to isolate them from anyone outside the group.

2DanielFilan3y

Do working memory chunks come in order? Like, I'd kind of expect that if you have 5 concepts in working memory, you can't additionally remember the order they should go in, because that's another working memory chunk. Or if you can remember the order they should go in, then introspectively I'd imagine they'd become one working memory chunk.

3Raemon3y

I don't really know, but my guess is that, well, it's a bit messy, and yes if your chunks need to fit in a particular combination that you don't have a good grasp on, that strains your working memory. But, I don't think there are literal chunks and ordering them literally costs a chunk. Chunks are patterns of thought that can bring associations of other patterns of thought, and those associations can be stronger or weaker. If the associations are sufficiently strong it makes sense to model the chunk-cluster as a single chunk. (I notice I'm somewhat confused about this, and somewhat going off "there's enough working memory research that I'm fairly confident 'chunks' is a useful abstraction, but I'm not sure why.") I'm kinda brain-dead right now and can't introspect well enough to figure out how it subjectively feels for me. I think this post of mine is... probably relevant, although it might require some additional inference to make the relevance obvious: https://www.lesswrong.com/posts/n7vPLsbTzpk8XXEAS/what-s-your-cognitive-algorithm

2DanielFilan3y

The thing I'm unsure about here is why does that not apply to one-on-one communication? And if one-on-one communication doesn't suffer from this limit, why does it not hold for getting a message to thousands by mathematical induction? Perhaps the problem is that you lose bits in the retelling when people forget things or word things badly - but surely you also pick up bits in more people actually thinking about the message and seeing flaws in it and ways it can be tweaked to be more true?

2Raemon3y

I think all communication is bottlenecked by the working memory limit, but the limit has different ramifications in different contexts. I agree with Romeo's take elsethread that part of what's going on here is "how many feedback loops you can have going on at once. Feedback loops can unpack into larger things, but you have to actually do the unpacking." (I have a bunch more thoughts on this that probably need to be a top-level post) Note that if people are seeing flaws and improving your idea, then they aren't coordinating on a single thing, and if it matters that lots of people are moving in lockstep it can be actively harmful if they're 'improving' your idea. But, more realistically: most people aren't necessarily improving things, they're adapting them to make them better/more-convenient/more-aligned for them. (Or, just forgetting or misremembering or whatever) Preserving a complex idea at high fidelity is very hard.

[-]Zvi3y220Nomination for 2019 Review

I use this concept often, including explicitly thinking about what (about) five words I want to be the takeaway or that would deliver the payload, or that I expect to be the takeaway from something. I also think I've linked to it quite a few times.

I've also used it to remind people that what they are doing won't work because they're trying to communicate too much content through a medium that does not allow it.

A central problem is how to create building blocks that have a lot more than five words, but where the five words in each block can do a reasonable substitute job when needed.

3Zvi3y

As an additional data point, a link to this post will appear in the 12/10 Covid weekly roundup.

2Matt Goldenberg3y

This is pretty cool. Can you give some examples of about-five-word takeaways you've created for different contexts?

[-]Zvi3y250

Here are some attempted takeaways for things I've written, some of which were explicit at the time, some of which were implicit:

Covid-19: "Outside, social distance, wear mask."

Simulacra (for different posts/models): "Truth, lies, signals, strategic moves" or "level manipulates/dominates level below" or "abstractions dominate, then system collapses"

Mazes: "Modern large organizations are toxic" or "middle management destroys your soul"

Asymmetric Justice: "Unintentional harms count, benefits don't" or "Counting only harms destroys action" or similar.

Or one can notice that we are abstracting out a conclusion from someone else's thing, or think about what we hope another will take away. Often but not always it's the title. Constantly look to improve. Pain not unit of effort. Interacting with system creates blameworthiness. Default AI destroys all value. Claim bailey, retreat to motte. Society stuck in bad equilibrium. Etc.

[-]Dagon5y150

Hierarchies (which provide information-cheap mechanisms for coordination) and associative processes (which get people with shared information closer, so less information exchange is necessary) both would seem to expand the numbers greatly from those you suggest.

There are examples of fairly complicated cooperation across many millions. For example, all the expectations behind credit card usage take many pages of contracts, which implicitly depend on many volumes of law, which implicitly depend on uncountable bits of history and social norms.

2Raemon5y

Yes, but it's important to note that if you haven't purposefully built that hierarchy, you can't rely on it existing. (And, it's still a fairly common problem within an org for communication to break down as it scales – I'd argue that most companies don't end up successfully solving this problem) The motivating example for this post at-the-time-of-writing was that in the EA sphere, there's a nuanced claim made about "EA being talent constrained", which large numbers of people misinterpreted to mean "we need people who are pretty talented" and not "we need highly specific talents, and the reason EA is talent constrained is that the median EA does not have these talents." There were nuanced blogposts discussing it, but in the EAsphere, the shared information is capped at roughly "1 book worth of content and jargon, which needs to cover a diverse array of concepts, so any given concept won't necessarily have much nuance", and in this case it appeared to hit the literal four word limit.

2Dagon5y

It might be worth a second post examining the reasons that the standard and well-known coordination mechanisms (force, social pressure, hierarchy, broadcast/mass media, etc.) aren't available for the kind of coordination you think is needed, and what you're considering as replacements (or just accepting that a loosely-committed voluntary group with no direct rewards or sanctions has a cap on effectiveness). (note: I'm not particularly EA-focused; this is a trap) Or perhaps a description of how "the EA community" can have needs that require such coordination, as opposed to actual projects that clearly need aggregated effort to have impact.

2Raemon5y

I do think that'd be a valuable post (and that sort of thing is going on on the EA forum right now, with people proposing various ways to solve a particular scaling problem). I don't know that I have particularly good ideas there, although I do have some. The point of this post was just "don't be surprised when your message loses nuance if you haven't made special efforts to prevent it from doing so" (or, if it gets out-competed by a less nuanced message that was designed to be scalable and/or viral). I wrote this post in part so that I could more easily reference it later at some point when I had either concrete ideas about what to do, or when I think someone is mistaken in their strategy because they're missing this insight.

3Dagon5y

Fair enough. Interestingly, if I replace "coordinate with" with "communicate a nuanced belief to", my reaction changes radically, in favor of numbers shaped like yours. I'll have to think more about why those concepts are so different.

4Raemon5y

Nod. The claim here is specifically about how much nuance can be relevant to your coordination, not how many people you can coordinate with. (If this failed to come across, that also says something about communicating nuance being hard)

4Dagon5y

I think I was taking "coordination" in the narrow sense of incenting people to do actions toward a relatively straightforward goal that they may or may not share. In that view, nuance is the enemy of coordination, and most of the work is simplifying the instructions so that it's OK that there's not much information transmitted. If the goal is communication, rather than near-term action, you can't avoid the necessity of detail.

4Raemon5y

The whole point is that coordination looks different at different scales. So, I think I was looking at this through a nonstandard frame (Maybe more nonstandard than I thought). There are two different sets of numbers in this post:

4.3 million words worth of nuance

200,000 words of nuance

50,000 words

1 blogpost (1-2k words)

4 words

And separately:

1-4 people

10 people

100 people

1000 people

10,000 people+

While I'm not very confident about any of the numbers, I am more confident in the first set of numbers than the second set. If I look out into the world, I see clear failures (and successes) of communication strategies that cluster around different strata of communication bandwidth. And in particular, there is clearly some point at which the bandwidth collapses to 3-6 words.

[-]Raemon5y130

So, I think I optimized this piece a bit too much as poetry at the expense of clarity. (I was trying to keep it brief overall, and have the sections sort of correspond in length to how much reading you could expect people to do at that scale).

Obviously people in the real world do successfully coordinate on things, and this piece doesn't address the various ways you might try to do so. The core claim here is just that if you haven't taken some kind of special effort to ensure your nuanced message will scale, it will probably not scale.

Hierarchies are a way to address the problem. Oral tradition that embeds itself in people's socializing process is a way to address the problem. Smaller groups are a way to address the problem. Social pressure to read a specific thing is a way to address the problem. But each of these addresses it only in particular ways and comes with particular tradeoffs.

[-]Raemon3y100Review for 2019 Review

Partial Self Review:

There's an obvious set of followup work to be done here, which is to ask "Okay, this post was vague poetry meant to roughly illustrate a point. But, how many words do you actually precisely have?" What are the in-depth models that let you predict precisely how much nuance you have to work with?

Less obvious to me is whether this post should become a longer, more rigorous post, or whether it should stay its short, poetic self, and have those questions get explored in a different post with different goals.

Also less obvious to me is how the LessWrong Review should relate to short, poetic posts. I think it's quite important that this post be clearly labeled as poetry, and also, that we consider the work "unfinished" until there is some kind of post that delves more deeply into these questions. But, for example, I think Babble last year was more like poetry than like a clear model, and it was nonetheless valuable and good to be part of the Best Of book.

So, I'm thinking about this post from two lenses.

What are simple net-improvements I can make to this post, without sacrificing its overall aim of being short/accessible/poetic?
Sketch out the research/the

... (read more)

[-]PhilGoetz5d80

Isn't LessWrong a disproof of this? Aren't we thousands of people? If you picked two active LWers at random, do you think the average overlap in their reading material would be 5 words? More like 100,000, I'd think.

[-]Benquo5y80

A productive thing to do here would be to try to reconcile the claim that a large number of people can't reasonably be expected to read more than a few words, and the claim that something like EA or Rationalism is possible at anything like the current scale. These are in obvious tension.

Another claim to reconcile with yours would be a claim that there's anything like law going on, or really anything other than gang warfare.

[-]Raemon5y200

My claim is "a large number of people can't reasonably be expected to read more than a few words in common", which I think is subtly different (in addition to the thing where this post wasn't about ways to address the problem, it was about the default state of the problem in the absence of an explicit coordination mechanism)

If your book-length-treatise reaches 1000 people, probably 10-50 of those people read the book and paid careful attention, 100 people read the book, a couple hundred people skimmed the book, and the rest just absorbed a few key points secondhand.
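As a rough illustration of that breakdown, here is a minimal sketch of the arithmetic. The tier sizes are assumptions layered on the guesses above: "10-50" is taken as 30, "a couple hundred" as 200, and the tiers are treated as non-overlapping, none of which is precise.

```python
# Hypothetical tiers for a treatise that reaches 1000 people.
# All numbers are illustrative assumptions, not measurements.
REACH = 1000

tiers = {
    "read carefully": 30,   # assumed midpoint of "10-50"
    "read": 100,
    "skimmed": 200,         # assumed value for "a couple hundred"
}

# Everyone else only absorbs a few key points secondhand.
tiers["secondhand"] = REACH - sum(tiers.values())


def share(tier: str) -> float:
    """Fraction of the total audience that falls in the given tier."""
    return tiers[tier] / REACH


for name, count in tiers.items():
    print(f"{name}: {count} people ({share(name):.0%})")
```

Under these assumptions, roughly two thirds of the audience only ever gets the secondhand version, which is the group the few-words limit applies to.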

I think it is in fact a failure of law that the law has grown to the point where a single person can't possibly know it all, and only specialists can know most of it (because this creates an environment where most people don't know what laws they're breaking, which enables certain kinds of abuse)

I think the way EA and LessWrong work is that there's a large body of work people are vaguely expected to read (In the case of LessWrong, I think the core sequences are around [edit: a million words, I initially was using my cached pageCount rather than wordCount] not sure how big the ... (read more)

2Vaniver5y

This feels like a 100x underestimate; The Sequences clocks in at over a million words, I believe, and it's not the case that only 1% of the words are core.

2Raemon5y

Whoops. I was confusing pages with words.

2Raemon5y

(The mental-action I was performing was "observing what seems to actually happen and then grab the numbers that I remembered coinciding with those actions", rather than working backwards from a model of numbers, which may or may not have been a good procedure, but in any case means that being off by a factor of 100 doesn't influence the surrounding text much)

[-]ryan_b3y60Review for 2019 Review

I think this post is excellent, and judging by the comments I diverge from other readers in what I liked about it.

In the first, I endorse the seriously-but-not-literally standard for posting concepts. The community - rightly in my view - is under continuous pressure to provide high quality posts, but when the standard gets too high we start to lose introduction of ideas and instead they just languish in the drafts folder, sometimes for years. In order to preserve the start of the intellectual pipeline, posts of this level must continue to be produced.

In th... (read more)

[-]Raemon5y60

I think the actual final limit is something like:

Coordinated actions can't take up more bandwidth than someone's working memory (which is something like 7 chunks, and if you're using all 7 chunks then you don't have any spare chunks to handle weird edge cases).

A lot of coordination (and communication) is about reducing the chunk-size of actions. This is why jargon is useful, habits and training are useful (as well as checklists and forms and bureaucracy), since that can condense an otherwise unworkably long instruction into something p... (read more)

3Yoav Ravid5y

What is meant by 7 chunks? Seems like that in itself was condensed jargon that I didn't understand :P

8Raemon5y

"Something that your mind thinks of as one unit, even if it's in fact a cluster of things." "Go to the store" is four words. But "go" actually means "stand up. Walk to the door. Open the door. Walk to your car. Open your car door. Get inside. Take the key out of your pocket. Put the key in the ignition slot..." etc. (Which are in turn actually broken into smaller steps like "lift your front leg up while adjusting your weight forward") But, you are capable of taking all of that and chunking it as the concept "go somewhere" (as well as the meta concept of "go to the place whichever way is most convenient, which might be walking or biking or taking a bus"), although if you have to use a form of transport you are less familiar with, remembering how to do it might take up a lot of working memory slots, leaving you liable to forget other parts of your plan.
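A toy sketch of the same idea, treating a chunk as a name that expands into smaller steps. The `EXPANSIONS` table and the `expand` helper are hypothetical, made up for illustration from the example above; real chunking is obviously not a lookup table.

```python
# Hypothetical mapping from a chunk to the smaller steps it stands for.
# Anything without an entry is treated as a primitive step.
EXPANSIONS = {
    "go": [
        "stand up",
        "walk to the door",
        "open the door",
        "walk to your car",
        "open your car door",
        "get inside",
        "take the key out of your pocket",
        "put the key in the ignition slot",
    ],
    "walk to the door": [
        "lift your front leg up while adjusting your weight forward",
    ],
}


def expand(chunk: str) -> list[str]:
    """Recursively unpack a chunk into the primitive steps it hides."""
    steps = EXPANSIONS.get(chunk)
    if steps is None:
        return [chunk]
    unpacked = []
    for step in steps:
        unpacked.extend(expand(step))
    return unpacked


# One working-memory slot ("go") stands in for many underlying steps.
print(len(expand("go")), "steps behind one chunk")
```

The point is just that the familiar version costs one slot, while an unfamiliar route forces you to hold the expanded list in your head.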

1Yoav Ravid5y

So "7 chunks" was used as almost a synonym for "7 words"? I thought that was some cool concept from neuroscience about working memory :)

5Raemon5y

I think the near-synonym nature is more about convergent evolution. (i.e. words aim to reflect a concept, working memory is about handling concepts). https://en.wikipedia.org/wiki/Working_memory

[-]DanielFilan2y50

Relevant twitter thread.

[-]TristanTrim5y50

I like this direction of thought, and I suspect it is true as a general rule, but it ignores the incentive people have for correctly receiving the information, and the structure through which the information is disseminated. Both factors (and probably others I haven't thought of) would increase or decrease how much information could be transferred.

5Mauricio_AG3y

This is a good point. We can explain why students in medical school carefully digest millions of words by discussing the near-term incentives of final exams and the long-term incentives of increased salary and social status.

[-]ryan_b5y50

This puts me in mind of the mandatory reading of a narrative memo they use at Amazon, which appears to conform to the 'several blog posts' level of coordination. It is hierarchically enforced, and the people who use it are the senior leadership which has, I assume, a capability distribution heavily weighted towards the top of the scale.

Also relevant is the Things I Learned From Working With a Marketing Advisor post.

[-]orthonormal3y40Review for 2019 Review

This is a retroactively obvious concept that I'd never seen so clearly stated before, which makes it a fantastic contribution to our repertoire of ideas. I've even used it to sanity-check my statements on social media. Well, I've tried.

Recommended, obviously.

[-]DirectedEvolution3y40Review for 2019 Review

I see where Raemon is going with this, and for a simplified model, where number of words is the only factor, this is at least plausible. Super-simplified models can be useful not only insofar as they make accurate predictions, but because they suggest what a slightly more complex model might look like.

In this case, what other factors play into the number of people you can coordinate with about X words?

Motivation (payment, commitment to a cause, social ties, status)
Repetition, word choice, presentation
Intelligence of the audience
Concreteness and familiar... (read more)

[-]Ben Pace3y20Review for 2019 Review

Okay, whenever I read this post, I don't get it.

There's some fermi-estimation happening, but the fermi is obviously wrong. As Benquo points out, certain religions have EVERYONE read their book, memorize it, chant it, discuss it every Sunday (or Saturday).

I feel like the post is saying "there are lots of bandwidth problems. the solution to all of them is '5'." and I don't get why 5.

So I read Ray's comment on Daniel Filan's review, where he says:

...at some maximum scale, your coordination-complexity is bottlenecked on a single working-memory-cluster, which (

... (read more)

2Raemon3y

I currently am neutral between "have" and "get", but prefer "have" just because changing a post title on a whim makes it harder to find. If most people preferred "get" I'd be happy to change.

2Ben Pace3y

If it were easy to make elicit things, I'd post one here for people to give a probability that "You get about five words" is better than "You have about five words". Would appreciate someone doing that.

3kjz3y

I prefer "get". It implies more strongly that if someone actually needs to convince others of their argument, they need to make sure their message is as concise and optimized as possible, before trying to convince anyone. As the original post says: You still only get five words.

2Ben Pace3y

I'm also interested in someone else (e.g. Kaj, Zvi, Orthonormal, etc) who managed to get this stuff from the post, trying to make me less confused about how people are getting things from this post.

[-]Kaj_Sotala3y20Nomination for 2019 Review

I've found this valuable to keep in mind.

[-]Jacob Falkovich5y20

This immediately got me thinking about politics.

How many voters could tell you what Obama's platform was in 2008? But 70,000,000 of them agreed on "Hope and Change". How many could do the same for Trump? But they agreed on "Make America Great Again". McCain, Romney, and Hillary didn't have a four-words-or-less memorable slogan, and so...

2Raemon5y

I'm actually two levels of surprised here. I'd have naively expected McCain, Romney and Hillary to have competent enough staffers to make sure they had a slogan, and sort of passively assumed they had one. It'd be surprising if they didn't have one, and if they did have one, surprising that I hadn't heard it. (I hung out in blue tribe spaces so it's not that weird that I'd have failed to hear McCain's or Romney's) Quick googling says that Hillary's team thought about 84 slogans before settling on "Stronger Together", which I don't remember hearing. (I think instead I heard a bunch of anti-Trump slogans like "Love Trumps Hate", which maybe just outcompeted it?)

2philh5y

I had been under the impression that Hillary's was "I'm with her"? But I think I mostly heard that in the context of people saying it was a bad slogan.

[-]Yoav Ravid5y20

So, an action coordination website should be able to phrase actions in four words?

This idea seems interesting, I'd love to see it somehow more formulated.

Do shorter kickstarter descriptions get funded more?

Do protest events on Facebook which have a shorter description get more attendees?

It probably also depends on personality - if you want to coordinate people who are high in contentiousness, you may need more words. For low contentiousness, fewer words. And if you want both, then you need to give a clear 4-word heading, and a bunch of nuance below.

2Raemon5y

I don't think this directly bears on how to build an action coordination website, more than in lieu of such a site you should expect action coordination to succeed at the 4-word level of complexity. I haven't thought as much about how to account for this when trying hard to build a coordination platform. But, I do think that kickstarters tend to succeed more if the 4-word version of them are intuitively appealing.

[-]Decaeneus1y10

Might LLMs help with this? You could have a 4.3 million word conversation with an LLM (with longer context windows than what's currently available) which could then, in parallel, have similarly long conversations with arbitrarily many members of the organization, adequately addressing specific confusions individually, and perhaps escalating novel confusions to you for clarification. In practice, until the LLMs become entertaining enough, members of the organization may not engage for long enough, but perhaps this lack of seductiveness is temporary.

[-]Mary Chernyshenko3y10

Seems like today the size of the phone screen defines how much of the text one is willing to read (an unselected someone). It's still unclear what people do with it later and how much they retain. But reading in itself seems not so tightly limited; five-words-at-most is what I expect from billboards. But I also expect them to be more like roadsigns/reminders, not original messages (and really I would be surprised if someone treated the words as something beyond advertisement.)

Also, repeated exposure is a thing, which is often the case when one coordinates many people. And the ability of factions to work together although their "core texts" are very different.
