It's time to worry about online privacy again

Malmesbury

As we all know, if you have nothing to hide, you have nothing to fear. Nobody cares about your private life. You are not an important geopolitical target. Nobody's going to spy on you to know what weird pornography you watch.

And so, around 2015, people gave up on online privacy. Everyone stopped worrying about corporations and governments having full access to their data. In hindsight, I have to admit that things didn't go as badly as some feared. But I don't think this will last.

1. Radioactive decay

Based on real-life events: you're a biologist at the Bad Pathogen Research Institute. You receive an email from a graduate student whose name sounds vaguely familiar. She needs to measure radio-labeled samples with a scientific instrument but, unfortunately, you used it yesterday and you forgot to log out. Now it's locked with your password and she can't connect or even reboot. She's asking you to come as soon as you can to unlock it – as radioactivity decays, the signal is vanishing every minute. Sadly, you are attending a talk on the other side of the city, it's 45 minutes by bike and it's snowing.

Obviously, you would never send your credentials by e-mail, right? Right?

This could, in principle, be phishing. Technically, a cunning spy could have stalked you, figured out your schedule, and crafted a deceptive e-mail to steal your password. But you know it's probably not the case, because nobody cares about your passwords enough to do something so complicated. So you send your credentials to the grad student using a one-time secret sharing link and everything is fine.

I like to think that I can't be scammed because I know the ways of 1337 h4xx0rs well enough that they can't reach me. Of course, this is not true. I could totally be scammed; attackers simply have no interest in spending the amount of energy it would take to scam me.

That's why some people get phished and not others. It depends on two things:
🅰️. How much effort it takes to set up a scam so a given target falls for it
🅱️. How much effort an attacker is ready to dedicate to scamming that target

If 🅰️ is lower than 🅱️, the target gets scammed. If 🅱️ is lower than 🅰️, it's not worth it. On one end (high 🅰️, high 🅱️), you have hackers leaking e-mails from an important government official. On the other end (low 🅰️, low 🅱️), your grandfather receives an e-mail saying a hacker has caught him watching porn and he needs to send money otherwise the hacker will tell everyone. Your grandfather doesn't know much about Internet swindles, he's from a generation who's really ashamed to watch porn, and so he falls for it.

You, me, and most people are in between: too Internet-proof to fall for basic generic scams; not important enough to justify sophisticated personalized scams. Let me insist, you are safe not because hackers can't reach you, but because you are not important enough to justify the kind of attacks that would reach you.

2. The classic roast chicken scam

"Hi Alice, I hope you're having a good time at the concert. I just wanted to let you know that I'm at your apartment with a roast chicken that I bought at the farmer's market. My phone is out of battery, so I'm using my friend's phone to send you this message. Could you please send me your apartment door code so I can leave the chicken in front of your door?"

It took ChatGPT less time to write this than it took me to copy-paste it. Most of the personal context could be figured out based on location data. Obviously, you would never let a website access your location data unless strictly necessary, right? Right?

I don't know about you, but I'm scared. Artificial intelligence can totally automate the process of stalking someone. It can extract all the available information from all your accounts on the Internet, then a large language model can generate a perfectly realistic bait, tailored just for you. It's the Nigerian Prince all over again, except this time the Nigerian prince lives on the 5th floor and you had a beer with him last Friday. It could also be blackmail: I hope you never dared write anything politically incorrect on the Internet, because the AI will find it.

Remember 🅰️, the amount of effort required to scam normal people like you and me. What happens when AI makes it shrink to zero? If impersonation and blackmail become a simple button push, most people will suddenly face attacks much more sophisticated than what they're used to. If unprepared, they'll think "nobody would make up such a complicated scheme just for me", and they will walk straight into the trap.
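To make the 🅰️/🅱️ rule concrete, here is a toy sketch in Python. Everything in it is an assumption made up for illustration: the names Target, gets_scammed and automation_discount, and all the effort numbers. It only encodes the rule from section 1 (a target gets scammed when 🅰️ is lower than 🅱️) and shows what happens when automation multiplies 🅰️ by a small factor.

```python
from dataclasses import dataclass


@dataclass
class Target:
    name: str
    effort_to_scam: float   # 🅰️: effort needed to build a scam this target falls for
    attacker_budget: float  # 🅱️: effort an attacker is willing to spend on this target


def gets_scammed(target: Target, automation_discount: float = 1.0) -> bool:
    """Return True when the (possibly discounted) 🅰️ is lower than 🅱️."""
    return target.effort_to_scam * automation_discount < target.attacker_budget


# Hypothetical effort values in arbitrary units, chosen only to mirror the text.
targets = [
    Target("government official", effort_to_scam=1000, attacker_budget=5000),
    Target("grandfather", effort_to_scam=1, attacker_budget=2),
    Target("median internet user", effort_to_scam=100, attacker_budget=2),
]

for t in targets:
    today = gets_scammed(t)
    with_ai = gets_scammed(t, automation_discount=0.01)  # assumed 100x cheaper
    print(f"{t.name}: scammed today={today}, scammed with cheap automation={with_ai}")
```

With these made-up numbers, the only target whose outcome changes is the one in the middle, which is the whole point: 🅱️ did not move, 🅰️ did.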

Like a puzzle, stalking has very strong network effects. I don't actually believe it's possible to pinpoint a person based on speech patterns alone, as film detectives do. However, if the same obscure link gets shared on Twitter and Facebook within a few minutes, and the Twitter handle is partially similar to an Instagram account that posted a picture of a monument whose location matches GPS data extracted from some random vulnerable guitar-tuning app, and on top of that the speech patterns are the same, you can connect the dots and draw some conclusions.

This means that, if you can scam the Median Joe automatically, then you might as well attack tens of thousands of targets in parallel, in a coordinated manner. Each message is filled with bespoke in-jokes and details about the target's whereabouts. Everyone falls for it at the same time.

Whether it's possible to push the world into chaos using social engineering depends on how much information is available about the average person. As people liberally leave more and more identifying information on tens of different platforms, a phase transition occurs, making it possible to fill all the gaps and know everything about everyone.

3. An appointment at Times Square

Who is this attacker we're talking about? It could be something boring like lone-wolf terrorists, Russian cybersoldiers or the Vikings conspiracy. Or it could be an autonomous AI that was programmed to manufacture as many trombones as possible and is now trying to gain power so it can turn Earth into a trombone factory.

But, unlike a lot of evil-AI-takeover stories, this one doesn't require any super-human intelligence. The oldfags among you might remember the 2011 involuntary flashmob, when Internet trolls lured a bunch of people into going to the same place at the same time, all believing they were going on a date. It established a new precedent for how much you can do remotely with an Internet connection.

This didn't involve any science-fictionesque protein-based nanorobots. Instead, it took a lot of effort from many human participants to maintain conversations with the targets over several days. If AI can fully automate conversations, then something as big as the involuntary flashmob can be done with a simple Python script. It certainly opens a world of trolling possibilities.

4. When to get paranoid

That leaves us with two solutions: downstream and upstream paranoia. Downstream paranoia is when it's too late – you've already given up all the information it takes to scam you, so you need to be paranoid about every single online interaction, to make sure that every message you receive is not from an impersonator. This comes with a serious erosion of Internet trust, assuming there can be a working Internet at all in these conditions.

Upstream paranoia is what Richard Stallman has been telling us to do for forty years: make sure you never give away enough information for AI to run realistic scams in the first place. Privacy is like a plunger – you should get one before you need one.

As your prototypical nerd, I used to be really into FOSS, the EFF, blob-free GNU/Linux distros, XKeyScore, Echelon, INDECT, PRISM and other names most of us have forgotten what they were. Then, like most people, I gradually stopped caring, and now I'm leaving a trail of personal data wherever I go.

I guess it's time for me to go back to my pre-2015 technoparanoia. Don't get me wrong, at the society level, we are definitely past the phase transition point – even if you are secretive, most people are not and the information is out there. But I think it's still possible to protect yourself if you act early enough. There is no way the Internet remains the way it is now. It's hard to tell how much privacy will really be necessary on the "new" Internet, but given the pace of language models' progress, I'd rather err on the overkill side.

If you are going to engage in anonymity warfare, here are a few old-school tips for upstream paranoia (I can't promise that they are secure, let me know if you think they aren't):

A personal favourite: TrackMeNot. This doesn't prevent Google from spying on you, it just drowns Google in a flood of fake requests. So every time you search for something, TrackMeNot also searches for all kinds of random stuff, and now Google thinks you're a hunter from Siberia. I find this approach particularly promising.
I heard hosting your own Searx instance was pretty good. I haven't tried that yet.
Signal or Matrix.org instead of WhatsApp, Messenger, etc. The good thing is that you will no longer sound like a terrorist or a pedophile when you ask your friends to switch, so it might actually work (just point them to this post!)
In general, use free software (using Linux makes this much easier). Especially avoid anything that's funded by advertisement. "If it's free, you are the product."
Remove all public information on social media. Preferably use Mastodon. Exhume your old Internet pseudonym.
This is going to sound super extreme, but you might want to store music and films on a hard drive instead of streaming. It just sounds too easy for an AI to impersonate you when it knows what shows you watch.
For e-mail, I currently use Protonmail. It's claimed to be fairly confidential.
NewPipe and Invidious as front-ends for YouTube (I'm afraid PeerTube is not the best thing for anonymity, as it's peer-to-peer)
If you want to dive all the way in, try Richard Stallman's lifestyle. Also, see Gwern on maintaining anonymity.

Let's make sure AI attacks encounter at least some resistance. This resistance starts with you (*epic music*).

Cross-posted from my blog.

This post would benefit from being clearer about its threat model, and the recommendations seem hard to square with what seems to me to be the most likely threat. Who are you worried about using your information to trick you? The government, Google/Amazon/Facebook/Apple/etc, independent scammers? Most of your examples seem like the third category, but then your recommendations are mostly about avoiding information being available to the second category.

One of your recommendations in particular, though, seems especially wrong given a wide range of potential privacy threats: "Preferably use Mastodon". The Fediverse has almost no ability to protect against bulk collection and retention of data, and while people who say "I'm going to be scraping things and archiving them" will get blocked, someone who does it quietly won't. (I'm strongly in favor of people switching to Mastodon, but privacy is not its strong point.)

Separately, I don't see a consideration of costs and benefits here. You've described some ways in which having more information about you on the public internet could be used to attack you, and advocated some large changes to what technologies and approaches people use, but without acknowledging that those changes have costs or attempting to argue that the costs are worth it. I'd especially be interested in arguments around how much your proposed changes would reduce someone's exposure by, since the benefits of decreasing information available to scammers aren't linear (ex: a complete decrease is probably worth much much more than 4x as much as a 25% decrease).

I use a pretty different approach here, and one that I think is a lot more robust: I expect privacy to continue to decay, and that measures that we thought were sufficient to keep things private will regularly turn out to not have been enough. This will happen retroactively, where actions you took revealed a lot more about yourself than you expected at the time (ex: revealing HN alt accounts). So I make "public" my default and operate on the assumption that people already can learn a lot about me if they want to. This means I can use whatever tool is best for the job (which may still be Linux or Mastodon!) and get the benefit of sharing information publicly, and am in a much better position for when it turns out that some bit of privacy protection had actually stopped working years ago.

In general, use free software ... If it's free, you are the product.

Wait, what? 😲

Probably supposed to be something like "If it's free [and not open source], you are the product."

I know, it just sounded very funny in the same paragraph.

(And there is a possible overlap, for example Android.)

Android is (partially) open source but it's not "free as in freedom", which is a technically narrower thing: https://itsfoss.com/what-is-foss/

I've heard this comment as, "if you're not paying for the product, you are the product."

If your main threat model is AI-enabled scams (as opposed to e.g. companies being extremely good at advertising to you), then I think this should influence which privacy measures you take. For example:

A personal favourite: TrackMeNot. This doesn't prevent Google from spying on you, it just drowns Google in a flood of fake requests.

Google knowing my search requests is perhaps one of the more worrying things from a customized ads perspective, but one of the least worrying from a scam perspective (I think basically the only way this could become an issue is if there was a data leak that revealed people's search histories?)

"Remove all public information on social media." seems like by far the most important point on your list from a scam perspective. I'd add something like "avoid giving data to smaller websites/apps/..." (like the GPS example you mention earlier). On the other hand, most of the typical privacy measures people talk about are geared at preventing big companies from getting your data, but that's only a secondary concern for AI-enabled scams. So I'm not convinced that things like switching away from gmail or WhatsApp, etc., make sense on those grounds.

(AI could also be used to make an argument for privacy from big companies, e.g. based on worries about recommender systems becoming too good at maximizing engagement, but that seems quite distinct.)

I see your point, and you're right. Data leaks from big companies or governments are not impossible though, they happen regularly!

A lot of this sounds AGI-complete, so it won't be a problem before AGIs could hack you physically using diamondoid nanotech anyway, which they won't if alignment works out.

Most of it is available with good tool-level ai, plus motivated human scammers which we know already exist in quantity. Don't think of it as AI-led, but AI-enabled-scaling.

I think scaling with sub-AGI capabilities won't produce that phase transition the post references. There might be noticeably more social engineering of the normal mass-produced kind, with marginally better guesses and scripts, but not at all resembling something a talented professional backed by surveillance and hackers might be able to pull off at present. So if you are currently invincible to cheap scammers, you'll continue being so, perhaps adapting to become a bit more suspicious of unverified communications than now. It's not even clear if they would bother you more as they become more "productive", because their tools might figure out that you are probably invincible and so not worth bothering with.

I honestly don't know. Ability for scammers to get a 1% success rate vs 0.01%, or susceptibility of 5% rather than 1% (wild guesses; it's hard to find un-motivated stats) of the population could easily be a big enough change to break the world. I don't know what things look like if single-digit percentages of adults can't have a bank account or credit card without getting scammed.

If you want to really go ham, check out Michael Bazzell's book and podcast.

I understand the arguments, but I'm not sure the suggested solutions make sense from the perspective of security:

As your prototypical nerd, I used to be really into FOSS, the EFF, blob-free GNU/Linux distros, XKeyScore, Echelon, INDECT, PRISM and other names most of us have forgotten what they were. Then, like most people, I gradually stopped caring, and now I'm leaving a trail of personal data wherever I go.
I guess it's time for me to go back to my pre-2015 technoparanoia.

Very popular vetted open-source software can be more secure than closed software, but it by no means has to be. And software by volunteers can't necessarily react as quickly to newly discovered security vulnerabilities. Using a privacy-conscious FOSS browser, for example, might improve your privacy but leave you temporarily more vulnerable to newly published security vulnerabilities, and the overall sign of such a trade-off is unclear to me.

Or: If secure code is sufficiently hard to write, the AI task of crawling the web to profile individual users seems way harder and less useful than unleashing exploit-finding AIs on Github. Can secure open-source software even exist in such a world?

I agree on the point that open source software doesn't have to be more secure. My understanding is that it is less likely to send user data to third parties, as it's not trying to make ad money (or you could just remove that part from the source). For the exploit-finding AI, I can only hope that the white hats will outnumber the black hats.

I adopted Richard Stallman's lifestyle long ago, primarily by refusing to use a smartphone. I also don't have an account on the major social media, never buy anything online (to the point that someone once gave me an Amazon shopping voucher but I didn't start to use Amazon anyway), pay cash whenever possible, use Protonmail, don't store anything in the cloud, and obviously use Linux. Also, my browser is configured to throw away every cookie after every session. I am probably still far from Richard Stallman's level of attention, but so far I've been very happy with this lifestyle; the level of online annoyances is very low.

If you feel comfortable answering: what type of career are you able to sustain? When did you begin this career? If you did not already have your current set of occupational/professional contacts, how plausible would you guess it would be to come by them in the present day within your current lifestyle?

I am a post-doc in academia, which probably helps with the no-smartphone thing because people in academia are quite accustomed to communicating by email anyway (I've never worked outside academia). Absurd as it may seem, me not having a smartphone is surprisingly difficult to notice, because even if someone notices that I never take a smartphone out of my pocket, no one is going to directly ask me why I don't obsessively check the phone. I know many people (colleagues included) who are probably still assuming that I have a smartphone, even if they just know that I don't use their favourite message app. And I don't deliberately lie to them; I just avoid introducing myself as "weird person without smartphone" from day one.

As for the third question, most of the professional contacts in academia are from in-person networking at conferences, and of course I go to conferences (maybe I told a small lie, it's not literally true that I never buy anything online... I make a small exception for air travel tickets and reservations in foreign hotels, since they are very difficult to purchase offline and I can't avoid going to conferences, but these are expenses that I must report anyway to get the refunds).

I'm also a postdoc, and my institution more or less requires having a smartphone because you can't do anything without their proprietary two-factor authentication. The other proprietary thing that seems mandatory is Zoom. Have you found a way to escape from it?

Well, I actually don't have the first problem since my institution still uses the old username + password login procedure (for other things needing 2FA like Protonmail I use a free desktop program to generate the codes). They also prefer Google Meet or MS Teams to Zoom. Both are nonfree, but at least they work directly in the browser and I don't have to install nonfree software on my laptop.
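For anyone curious what such a desktop program does under the hood: assuming the service uses standard time-based one-time passwords (TOTP, RFC 6238, which builds on HOTP from RFC 4226), the whole thing fits in a few lines of standard-library Python. This is only a sketch; the function name totp and its parameters are my own, and a real service may use a different hash, step or digit count.

```python
import base64
import hashlib
import hmac
import struct
import time
from typing import Optional


def totp(secret_b32: str, at: Optional[float] = None, step: int = 30, digits: int = 6) -> str:
    """Compute a TOTP code (HMAC-SHA1 variant) for a base32-encoded shared secret."""
    # Secrets are usually shown in base32; restore any stripped '=' padding.
    padded = secret_b32.upper() + "=" * (-len(secret_b32) % 8)
    key = base64.b32decode(padded)
    # The moving factor is the number of `step`-second intervals since the Unix epoch.
    counter = int((time.time() if at is None else at) // step)
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = digest[-1] & 0x0F
    value = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# Demo secret only (the ASCII string used in the RFC test vectors), never a real one.
demo_secret = base64.b32encode(b"12345678901234567890").decode()
print(totp(demo_secret))
```

The only thing that has to stay private is the shared secret, which is why this works fine on a laptop with no smartphone involved.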

Thanks for the post, it's an important update on the state of information warfare.

Privacy can be thought of as a shield. If you build a wall against small-arm spam, then it's ok, but if you try to build an underground bunker, then it's weird because only Certified Good Guys have access to advanced weapons. Why are you trying to protect yourself against the Certified Good Guys?

What changed is that, thanks to AI advancements in the last few years, it has become possible to create homemade heat-seeking infomissiles. Suddenly, there are other arguments for building bunkers.

The underground bunker is only weird if I’m vocal about it. Am I posting this from behind tor? Do I pass my messages through an anti-stylometry filter before sharing them? I’m already operating behind a nonsense pseudonym, but if I used a real-sounding name that wasn’t my legal name, would you question it? It’s often the case that effective privacy techniques simply aren’t easily detectable by 3rd or even 2nd parties: the goal is frequently to blend in (to become a part of some larger “anonymity set”).

Because many of those Certified Good Guys happen to not be from my country.