[META] Building a rationalist communication system to avoid censorship

by Donald Hobson, 1 min read, 23rd Jun 2020, 33 comments


Politics, Information Hazards, Censorship
Personal Blog

The recent disappearance of Slate Star Codex made me realise that censorship is a real threat to the rationalist community. Not hard, government-mandated censorship, but censorship in the form of online mobs prepared to harass and threaten those seen to say the wrong thing.

The current choice for a rationalist with a controversial idea seems to be to publish it online, where the most angry mobs from around the world can access it easily, or not to publish at all.

My solution: digital infrastructure for a properly anonymous, hidden rationalist community.

Related to "Kolmogorov Complicity and the Parable of Lightning" (now also deleted, but here are a few people discussing it).



So we need to create the social norms and digital technologies to allow good rationalist content to be created without fear of mobs. My current suggestions include:


1) Properly anonymous publishing. Each post that is put into this system is anonymous. If a rationalist posts many posts, then subtle clues about their identity could add up, so make each post independently anonymous. Given a specific post, you can't find others by the same author. Record nothing more than the actual post. With many rationalists putting posts into this system, and none of the posts attributable to a specific person, mobs won't be able to find a target. And no one knows who is putting posts into the pool at all.

2) Delay all published posts by a random time of up to a week, so that publication times don't give away information about your timezone.

3) Only certain people can access the content. Maybe restrict viewing to people with enough LessWrong karma. Maybe rate limit it to 4 posts a day or something, to make it harder to scrape the whole anonymous site. (Unrestricted anonymous posting with restricted viewing is an unusual dynamic.)

4) Of course only some posts will warrant such levels of paranoia, so maybe these features could be something that can be turned on and off independently.
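
Suggestions 2 and 3 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the names `release_time` and `ViewLimiter` are hypothetical, the delay is drawn uniformly, and "4 posts a day" is read as a sliding 24-hour window per reader.

```python
import secrets
from collections import defaultdict, deque

MAX_DELAY_SECONDS = 7 * 24 * 3600  # "up to a week" (suggestion 2)
POSTS_PER_DAY = 4                  # "4 posts a day or something" (suggestion 3)
DAY_SECONDS = 24 * 3600


def release_time(submitted_at: float) -> float:
    """Return when a post becomes visible: submission time plus a random delay.

    The delay is uniform over 0..7 days and drawn from the OS CSPRNG.
    To "record nothing more than the actual post", a server would keep
    only this release time and discard submitted_at.
    """
    return submitted_at + secrets.randbelow(MAX_DELAY_SECONDS + 1)


class ViewLimiter:
    """Sliding-window limiter: a reader may open at most `limit` posts per window."""

    def __init__(self, limit: int = POSTS_PER_DAY, window: float = DAY_SECONDS):
        self.limit = limit
        self.window = window
        self._views = defaultdict(deque)  # reader id -> timestamps of recent views

    def allow(self, reader_id: str, now: float) -> bool:
        views = self._views[reader_id]
        # Forget views that have fallen out of the window.
        while views and now - views[0] >= self.window:
            views.popleft()
        if len(views) >= self.limit:
            return False
        views.append(now)
        return True
```

Note that the limiter has to identify readers, which is compatible with the proposal only because it restricts viewing, not posting.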

My current model of online mobs is that they are not particularly good at updating on subtle strands of evidence and digging around online. One person who wants to stir up a mob does the digging, and then posts the result somewhere obvious. This raises the possibility of misinformation. If we can't stop one person putting our real name and address on a social media post where mobs can pass it around, could we put out other false names and addresses?

Preventing Spam

1) Precondition GPT-X on a sample of rationalist writings. Precondition another on samples of spam. Anything that causes more surprise on the rationalist net than the spam net is probably spam. (In general, AI-based spam filtering.)

2) Reuse codes. When you input a post, you can also put in a random codeword. Posts are given preferential treatment for the spam filtering if they are associated with a code that was also given with known good posts. Codewords are hashed and salted before being stored on the server, along with a number representing reputation, and never shown. Posts are stored with their reputation + a small random offset.

3) Changing password. Every hour, come up with a new password. Whenever anyone with enough karma requests any page, put the hour's password in small text at the bottom of the page (or even in an HTML comment). When someone uses this password, you know that it was someone who visited some LessWrong page in the last hour, but you can't tell who. You could restrict viewing with the same password.
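
The two-model comparison in item 1 can be illustrated without GPT-X. The sketch below swaps in tiny word-frequency models as a stand-in, which is an assumption made purely so the example runs; `UnigramModel` and `looks_like_spam` are hypothetical names. Only the decision rule (more surprise under the rationalist model than under the spam model) comes from the proposal.

```python
import math
from collections import Counter
from typing import Iterable


class UnigramModel:
    """Stand-in for a language model: word frequencies with add-one smoothing."""

    def __init__(self, corpus: Iterable[str]):
        self.counts = Counter(w for doc in corpus for w in doc.lower().split())
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # one extra bucket for unseen words

    def surprisal(self, text: str) -> float:
        """Average negative log-probability per word (nats); higher = more surprised."""
        words = text.lower().split()
        if not words:
            return 0.0
        nll = 0.0
        for w in words:
            p = (self.counts[w] + 1) / (self.total + self.vocab)
            nll -= math.log(p)
        return nll / len(words)


def looks_like_spam(text: str, rationalist: UnigramModel, spam: UnigramModel) -> bool:
    # Flag the post if the rationalist model is more surprised than the spam model.
    return rationalist.surprisal(text) > spam.surprisal(text)


rationalist_model = UnigramModel([
    "bayesian priors update on evidence",
    "evidence should update your priors",
])
spam_model = UnigramModel([
    "buy cheap pills now",
    "cheap pills buy now click here",
])
```

With these toy corpora, `looks_like_spam("buy cheap pills", rationalist_model, spam_model)` is true and the same call on "update your priors" is false. A real filter would replace `UnigramModel` with per-token log-probabilities from two fine-tuned language models.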

I look forward to a discussion of which cryptographic protocols are most suitable for building this.
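
As one possible starting point for that discussion, here is a hedged sketch of the reuse codes and the hourly password using only Python's standard library. Everything not in the post is an assumption: the names, deriving the password from a server secret with HMAC (so nothing needs storing each hour), accepting the previous hour's password as a grace period, using one site-wide salt so a codeword can be looked up by its hash, and an offset range of ±0.5.

```python
import hashlib
import hmac
import secrets

SERVER_SECRET = secrets.token_bytes(32)  # hypothetical: never leaves the server
CODEWORD_SALT = secrets.token_bytes(16)  # hypothetical: one site-wide salt


def hourly_password(now: float, secret: bytes = SERVER_SECRET) -> str:
    """Password shown to high-karma readers during the hour containing `now`."""
    hour_index = int(now // 3600)
    mac = hmac.new(secret, str(hour_index).encode(), hashlib.sha256)
    return mac.hexdigest()[:20]


def password_is_valid(candidate: str, now: float, secret: bytes = SERVER_SECRET) -> bool:
    """Accept the current or the previous hour's password (grace hour is an assumption)."""
    valid = (hourly_password(now, secret), hourly_password(now - 3600, secret))
    return any(hmac.compare_digest(candidate.encode(), v.encode()) for v in valid)


def codeword_key(codeword: str, salt: bytes = CODEWORD_SALT) -> bytes:
    """Salted, deliberately slow hash of a codeword; only this is stored."""
    return hashlib.pbkdf2_hmac("sha256", codeword.encode(), salt, 100_000)


class ReputationStore:
    """Maps hashed codewords to a reputation number; codewords are never kept."""

    def __init__(self):
        self._rep = {}  # hashed codeword -> reputation

    def reward(self, codeword: str, amount: float = 1.0) -> None:
        key = codeword_key(codeword)
        self._rep[key] = self._rep.get(key, 0.0) + amount

    def score_for_post(self, codeword: str) -> float:
        """Reputation to store with a post: true value plus a small random offset."""
        base = self._rep.get(codeword_key(codeword), 0.0)
        offset = (secrets.randbelow(2001) - 1000) / 1000 * 0.5  # in [-0.5, 0.5]
        return base + offset
```

The random offset is what keeps two posts with identical stored reputation from being trivially linked to the same codeword. Whether ±0.5 is enough noise depends on how many distinct reputation values exist, which this sketch does not analyse.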


33 comments

This seems like seeking security through obscurity.

If a blogger wanted to be fully anonymous, they could do so by excluding personal details and using software to disguise their IP address.

The trouble with SSC is that Scott wants to be able to talk about his job and personal life, be known for his blog among friends, and cultivate IRL community around the blog. He just wants to avoid mass-media attention/fame. He wants a readership that finds their own way to the blog. He also wants to avoid creating a perception of shadiness.

It seems to me that The Whisper would both look shady (it already sounds that way), and not actually accomplish this goal. The only way to accomplish it is if mass-media outlets abstain from outing bloggers. In fact, The Whisper would probably attract additional media attention.

Security point 1: The site becomes the target. As has been proven many times, hosting and DNS are vulnerable. If they can't get you then they'll get all the roads to you. They don't care about collateral damage.

I like this idea, but I have a couple of thoughts.

Anything that approaches anonymity always seems to attract people who would otherwise be shunned, because they feel comfortable spewing nonsense behind a veil of anonymity. How would this be avoided?

And do we have an example of any network like this working? It would be nice to glean lessons from previous attempts.

I like the idea of unlimited posting, limited viewing; it's unusual. But it wouldn't be very user friendly. Scaling with karma does provide a good incentive; it would also mean people are really incentivized to keep holding on to their accounts.

One alternate possibility would be to actually let go of anonymity towards the network organizers. The posts could be anonymous, but one can only access the network if one has a personal connection to someone willing to vouch for them, and both their real names are coupled to their accounts.

This way spam and outside mob access is completely prevented. But the question is: are people comfortable with posting on an account that is coupled to their real name, even if that real name is only accessible to the network organizers?

I know I usually want to keep all my internet accounts as uncoupled as possible. I'll delete an account and start over with a fresh slate, so my past comments are not linked to my current accounts. I keep leaking personal information with each interaction (writing style, timezone, etc.), and at a certain point I feel I've exposed too much, and that exposing more with the same online persona would lead to too high a chance of getting deanonymized. So I start over.

This isn't a very rigorous system: it would be better to couple this to a calculation of leaked information instead of just a feeling. But it's something.

Users on this "network" are capable of being pseudonymous. Anonymity is probably also possible, tho (much?) harder. We don't seem to have attracted too many people "spewing nonsense", or that many at all.

Requiring a personal connection to existing users will shut out a lot of potential users. And it's probably better for plausible deniability that we continue to allow anyone to sign up.

I – and I'd guess most other users – are not doing enough to reliably avoid de-anonymization. It requires very strict opsec in general.

And I don't know how you could possibly calculate the probability of being de-anonymized, even with perfect information about everything you've leaked. Relying on your feelings is probably the only practical option, besides not sharing any info.

I might register if it required a real name, but I definitely wouldn't comment or post. The first defense against doxxing is to avoid using your real name on anything you don't want linked to you.

It's also an unenforceable rule. If Facebook can't do it, I doubt LW is going to be up to the technical challenge.

Simply requiring log-in to read some posts, and limiting the rate of new users (maybe even make it invite-only most of the time, like a private torrent tracker), should go a long way to prevent mob attacks.

If log-in is required to read posts, I am afraid there would be no new users. How would anyone find out that the interesting debate exists in the first place? But if you have no users, there is no one to talk to.

I think this is a problem, but not an insurmountable one (note that Facebook requires login to see most things).

I meant hiding just the CWish posts. There're enough non-CWish posts to attract people that value the way of thinking in general.

Also, it doesn't sound that bad to attract users through one-to-one recommendations only. Or allow logged-out people to read everything, but only little by little release the power for new users to interact with the content. Maybe release it all at once if a high-karma user vouches for you (they lose it if that person gets banned or something). Maybe instead of karma, there could be another value that better reflects how much you are likely to value proper manners and thinking (e.g., it could be obtained by summing karma from different topics i in a way that overvalues breadth of interest).

I'm just thinking out loud in real time. My main point is that one can go a long way just by limiting the rate at which new users can invade and screw with the content.

A quiz and a day's wait before adding a new user is another option. Make it something that a regular lurker who read the rules would be able to pass easily, but a rando couldn't. SCP wiki did something like this; it seemed to help with quality control.

Rotate through 3 different quizzes, or scramble the quiz order sometimes, if you want to make automated sign-ups annoying for mobs and spammers. Have the web people track the number of sign-up-quiz fails (it's a nice metric for "is there a mob at the doorstep").

(Edit: Ah, someone already proposed a more-elaborate variant using GPT-X. Simple quizzes with a few mild gotcha-questions should be enough of a screen for most cases, though.)

A proposal I think I haven't seen posed is giving new members a "trial period." If an average (or randomly-selected) post doesn't have a karma score of at least X by the end of the period (or if it dips below Y at any point), they're out and their stuff is deleted. Ban them from handing out karma until after the trial, or this quickly breaks. This probably still has weird incentive consequences that I'm not seeing, though...
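
One way to pin the rule down is to write it as a function. This is a sketch under assumptions, not a worked-out policy: `passes_trial` is a hypothetical name, the "average" variant is used for the X check, "dips below Y" is read as any single post's score ever falling under Y, and posting nothing during the trial counts as failing.

```python
from statistics import mean


def passes_trial(post_histories, x: float, y: float) -> bool:
    """Decide whether a new member survives the trial period.

    post_histories: one list of karma snapshots per post, oldest first.
    x: minimum average final score required at the end of the trial.
    y: floor that no post's score may drop below at any point.
    """
    histories = [h for h in post_histories if h]  # ignore posts with no snapshots
    if not histories:
        return False  # assumption: nothing posted means the trial is failed
    if any(score < y for history in histories for score in history):
        return False  # some post dipped below Y at some point
    return mean(history[-1] for history in histories) >= x
```

For example, `passes_trial([[0, 2, 5], [1, 3, 4]], x=4, y=-2)` passes with an average final score of 4.5, while `passes_trial([[0, -3, 10]], x=4, y=-2)` fails because of the dip to -3, despite the strong finish.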

...it does mean having a bit of an evaporative-filter for quality-ratings, and it means links to crappy posts turn into deadlinks in just a matter of time.

Make a captcha with GPT-X rationalist content against real rationalist content. If you can't tell the difference, you are out :P

Also, train GPT-X on content that triggers mobs, and then use it to keep them busy elsewhere :P

This might become difficult with values of , though…

I have seen some captchas like "What's the capital of [Some Country]" in some forums. We can add some basic captchas that need some level of high-school math and basic familiarity with the Sequences for verifying new user registrations, and then wall off certain posts from unverified users.

I'm not sure this will be that effective though. All it takes to defeat it is some screenshots.

I was 80% kidding. I do believe that the type of people that could attack this community are overwhelmingly people who can't tolerate trying to read and understand the kind of content in here, let alone Scott's 999999-word analytical yet clear essays. They didn't sign up for real thinking and nuance when they went into activism.

And unlike others, I don't think mobs are organized. They look like it, but it's some sort of emergent behaviour that can be managed by making it boring for the average mob member to attack.

They look like it, but it's some sort of emergent behaviour,

I agree with this assessment. It almost feels like a hive mind; I've dipped into the peripherals of online mobs before, and have felt "hey, this action is a good idea" thoughts enter my head unbidden. I'd probably participate in such things often, if I didn't have a set of heuristics that (coincidentally) cancels out most of this effect, and a desire not to associate with the sorts of people who form mobs.

If the barrier-to-entry is increased to "requires two minutes of unrewarded drudgery, where it's not intuitively obvious what needs to be done" in such a way that a short, well-worded "mob instruction" message can't bypass the effect, it's unlikely a mob will form around such actions.

Incidentally, I wonder whether programming for the mob is a field of social psychology.

Maybe restrict viewing to people with enough less wrong karma.

This is much better than nothing, but it would be much better still for a trusted person to hand-pick people who have strongly demonstrated both the ability to avoid posting pointlessly disreputable material and the unwillingness to use such material in reputational attacks.

I wonder what would happen if a forum had a GPT bot making half the posts, for plausible deniability. (It would probably make things worse. I'm not sure.)

I like the general idea, but I'd be wary of venturing so far in terms of privacy that the usability becomes terrible and no-one wants to use it.

Is a commitment to entertain controversial or unpopular or odious ideas (or to advocate for them) separate from or integral to rationalism? Is a mental health professional's preference to maintain enough anonymity so that their blog does not interfere with their practice or their safety separate from or integral to rationality? I phrase those as questions because I'm not sure. When it comes to the general idea that anonymity is needed to discuss certain or any topics, I'm more skeptical. People who use their real names on FB and Twitter spout off about anything and everything. Some of them will benefit from the reactions they get, some will suffer. Just like if they were sharing their views in person. A note of humility: I remember about 15 years ago noticing that anonymous comments on newspaper websites were a cesspool. I thought things would be much better if people had to put their names on their opinions. A lot of papers moved to a system where commenters used their FB IDs. I thought that would improve the discourse. I now think I was wrong, and that didn't make things much better. So if I suspect that anonymity is not the boon others may think it is, you can take that with a grain of salt.

Is a commitment to entertain controversial or unpopular or odious ideas (or to advocate for them) separate from or integral to rationalism?

Integral – for epistemological rationality anyways, and arguably for instrumental rationality as well.

Is a mental health professional's preference to maintain enough anonymity so that their blog does not interfere with their practice or their safety separate from or integral to rationality?

I don't think it's "separate from" as much as 'mostly orthogonal'. Scott is largely to blame for his relative lack of pseudonymity – he's published, publicly, a lot of evidence of his identity. What he's trying to avoid is losing enough of what remains so that his (full) legal name is directly linked to Scott Alexander – in the top results of, e.g., a Google search for his legal name.

When it comes to the general idea that anonymity is needed to discuss certain or any topics, I'm more skeptical.

You're right, it's not needed to discuss anything – at least not once. The entire issue is whether one can do so indefinitely. And, in that case, it sure seems like anonymity/pseudonymity is needed, in general.

I don't think there's a lot of anonymity here on LessWrong but it's certainly possible to be pseudonymous. I don't think most people bother to really try to maintain it particularly strictly. But I find the comments here to be much better than anonymous/pseudonymous comments in other places, or even – as you seem to agree – on or via Facebook or Twitter (or whatever). This place is special. And I think this place really is vulnerable to censorship, i.e. pressure NOT to discuss what's discussed here now. The people here – some of them anyways – really would refrain from discussing some things were they to suffer for it like they fear.

Anonymity is hard – the posts themselves, e.g. word choice, phrasing, etc., are all extremely vulnerable 'side channels' for breaking anonymity. Defending against those kinds of attacks is probably both impractical and extremely difficult, if not effectively impossible.

The most important step in defending against attacks is (at least roughly) determining a threat model. What kind of attacks do you expect? What attacks are worth defending against – and which aren't?

But as the OP notes, we need only defend against mobs and hard evidence. If, e.g., Scott writes pieces with throwaway accounts that are obviously in his style, he can always claim that's an imitator. The mob lives for easy outrage, so as long as we decrease that factor we have a partial solution.

The mob doesn't care if you have plausible deniability. 

I don't think that would work because a definite identity is needed for people to follow Scott. I don't think I could possibly track 'Scott', or anyone, and notice that there was a specific identity, if I couldn't track a named identity, i.e. a specific account.

Part of who Scott is to a lot of us is someone, i.e. a specific identity, that's worth following, tracking, or paying attention to over long periods. Using throwaway accounts makes that harder, even ignoring the kind of fuckery that someone could pull with something like GPT-3 at hand – maybe it's even (effectively) impossible now.

And to the degree that we all could continue to track Scott, even with him posting via throwaway accounts – and even assuming otherwise perfect opsec from Scott too – so could people who feel malice towards him track him similarly. We'd see 'likely Scott posts' lists. We wouldn't be able to prevent people from speculating as to which posts were his. Scott would only be protected to the degree that we couldn't tell it was him.

There's probably a math/compsci/physics theory that covers this, but there's no way for Scott to be able to maintain both his pseudonymity and his identity – to the degree he wants (or needs) – if his (full) legal name is linked to his pseudonym on the first page of web search results for the legal name.

The safer option would be for Scott to create a new pseudonym and try to maintain (very) plausible deniability to any connection to Scott Alexander. But that would mean starting over and would probably be very hard – and maybe impossible anyways. It might be impossible for anyone to write in a different enough style, consistently, and avoid detection, especially at the lengths he typically posts. It's probably even that much harder (or impossible) for people that have written as much as he has publicly.

I've seen some presentations about how to do style-matching off of GitHub repos to pretty-confidently ID anonymous coders. While set-up requires a sizable amount of compute and data, the results have gotten quite impressive. There are ways to work against this (stuff that deliberately obscures your coding style, usually by rewriting your code), but they're not that well known. And a similar thing can be done with writing style and writing samples.

Staying anonymous against high-effort attempts to discern your identity has gotten very hard, and is only likely to get harder.

At some point, all you can do is guard against the low-effort ones.

For #1, don't forget the power of reputation in transforming one-off interactions into iterated ones. I suspect that going with that would end up being a Pyrrhic victory.

For simpler ideas:

  • How about just a website where all posts are encrypted/using it requires basic knowledge of, say, PGP?
  • Or using some other existing platform that's supposed to prevent censorship?

There's no purely technological solution to censorship, especially indirect forms like what's arguably happening to Scott.

Bruce Schneier, a famous cryptography expert and advocate, eventually realized that cryptography, while good, was often beside the point – it almost always made sense to attack the other parts of systems instead of trying to break encryption algorithms. Here's a good essay by him about this and a matching XKCD:

To the degree that we want to avoid censorship, we need to either adopt sufficient opsec – much better than Scott and even better than Gwern, who wrote to someone blackmailing them (by threatening to dox them):

It would be annoying to have my name splashed all over, but I resigned myself to that back in ~2010 when I decided to set up my website; I see Gwern as a pen name now, and not a real pseudonym. I’m glad it managed to last to 2015.

– or we need to prevent censorship IRL. Obviously, we can (and should) do both to some extent.

But really, to avoid the kind of censorship that inspired this, one would need to remain strictly anonymous/pseudonymous, which is hard (and lonely – IRL meetups are a threat vector!).

I was suggesting crypto for a different reason: as a trivial inconvenience/barrier to entry.

I can't think of a way that could work that couldn't be automated away, e.g. to a barrier consisting solely of 'install this browser extension'. (Or not at least without being a relatively non-trivial annoyance to the 'trusted' users too.)

I don't think LessWrong is big enough to have a separate channel like that with enough users. It seems more effective if individual posts with higher need are published with throwaway accounts. 

The problem with that is that people who don't use throwaway accounts might be tarred by association, e.g. "I googled their name and it looks like they are a LWer; haven't you heard LW is a haven for white supremacists now? Here's a screenshot of a post on LW that got upvoted instead of removed..."

This is a question of how associated the publishing location happens to be. You can have your throwaway account on Medium or Steemit if you are worried about medium censorship.

You get the problem of discoverability. I don't think you can add the discoverability without also adding association that can be attacked.

Additionally, a network like The Whisper that's linked to LessWrong karma still allows LessWrong to be tarnished by association.