This is a request for cybersecurity advice. I explain why LessWrong would benefit from a repository of such advice and why that is relevant for alignment projects (I'm working on one), and I propose a structure for answers.

In Six Dimensions of Operational Adequacy in AGI Projects, Yudkowsky asks for

Strong opsec:  Operational security adequate to prevent the proliferation of code (or other information sufficient to recreate code within e.g. 1 year) due to e.g. Russian intelligence agencies grabbing the code.

And defines the following levels:

Token:  Random people are not allowed to wander through the building.

Improving:  Your little brother cannot steal the IP.  Stuff is encrypted.  Siloed project members sign NDAs.

Adequate:  Major governments cannot silently and unnoticeably steal the IP without a nonroutine effort.  All project members undergo government-security-clearance-style screening.  AGI code is not running on AWS, but in an airgapped server room.  There are cleared security guards in the server room.

Excellent:  Military-grade or national-security-grade security.  (It's hard to see how attempts to get this could avoid being counterproductive, considering the difficulty of obtaining trustworthy command and common good commitment with respect to any entity that can deploy such force, and the effect that trying would have on general mindsets.)

Security is also one of the Asilomar AI Principles:

6) Safety: AI systems should be safe and secure throughout their operational lifetime, and verifiably so where applicable and feasible.

I am a member of the AI Alignment project aintelope and work to properly secure the project as we grow. Right now, we have security at the Token level (as defined above): a private GitHub project, a private Slack group, and some token vetting of people with access.

How can we do better? What resources are there? First, I looked at LessWrong. There are the tags Computer Security & Cryptography and Security Mindset, which have some relevant posts, including Security Mindset and Ordinary Paranoia and Secure homes for digital people. There is also Work on Security Instead of Friendliness?, which is about the relation between AI alignment and security. But there is little material that helps me secure our project. I think this community, with many people working on alignment projects, needs better resources for security.


Please post best-practice resources as answers in the format below. Select sources that you think are of high quality. Use regular comments for discussion. Please upvote if you agree with the assessment based on quality. If you disagree with the level, use the agreement vote (up/down).

Level: Token, Improving, Adequate, or Excellent

Linked Title (The title of the resource with a link to the source; for books, include the ISBN; consider linking to a review)

Description: A short description; may be copied from a review 

Alternative Material (optional, but bonus points if you provide these): Two alternative sources that you are familiar with that are not as good as your preferred source. This is in the spirit of the rules from The Best Textbooks on Every Subject


3 Answers



Start at the meta level - WHY do you care about these aspects of security, and how much are you willing to spend on it (in terms of money, effort, and interference with your actual mission)?  This will probably change over time - you likely don't have anything worth stealing or sabotaging early in a project, and only have a small number of people with reasonable trust levels.  As you grow, your value as a target increases, as does the number of people you don't personally know.  Figure out what the triggers or Schelling lines are where you will make security a serious focus, rather than a nice-to-have.

Then ignore Eliezer's levels - he's absolutely right that it's important, but his maturity-model approach is insufficient.  Security is adversarial, which means it's a pathological distribution of problems, not a smooth surface to approach generally.  Instead, focus on Threat Modeling ( is one resource), and the matching of hassle/expense of protection for yourself vs the risk/cost of loss and ease/likelihood of attack for the resources you're protecting.  
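
To make the cost/risk matching above concrete, here is a minimal sketch in Python. It is not a threat-modeling method from any particular resource: the `Threat` class, its fields, and the numbers in the usage example are all hypothetical placeholders, and a single expected-value estimate is a crude stand-in for an adversarial problem.

```python
from dataclasses import dataclass


@dataclass
class Threat:
    """One row of a toy threat model. Every field is a rough estimate (assumption)."""
    name: str
    annual_likelihood: float  # chance of a successful attack per year, 0..1
    loss_if_realized: float   # cost of the loss, in whatever unit you budget in
    mitigation_cost: float    # yearly cost of protection (money, effort, friction)
    risk_reduction: float     # fraction of the likelihood the mitigation removes, 0..1

    def expected_loss(self) -> float:
        # Expected yearly loss with no mitigation in place.
        return self.annual_likelihood * self.loss_if_realized

    def net_benefit(self) -> float:
        # Expected loss avoided by the mitigation, minus what the mitigation costs.
        return self.expected_loss() * self.risk_reduction - self.mitigation_cost


def prioritize(threats):
    """Return threats sorted by net benefit of mitigating them, highest first."""
    return sorted(threats, key=lambda t: t.net_benefit(), reverse=True)


if __name__ == "__main__":
    # Placeholder numbers for illustration only; replace with your own estimates.
    threats = [
        Threat("leaked repository credentials", 0.25, 400.0, 10.0, 0.5),
        Threat("covert state-level intrusion", 0.5, 100.0, 100.0, 0.5),
    ]
    for t in prioritize(threats):
        verdict = "mitigate" if t.net_benefit() > 0 else "accept or avoid"
        print(f"{t.name}: net benefit {t.net_benefit():.1f} -> {verdict}")
```

A negative net benefit suggests a threat where protection costs more than it saves, which is where accepting the risk or not holding the asset at all become the realistic options.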

For most secrets, technological protection (encryption, access control/logging, penetration tests, etc.) is sufficient for casual or competitive attacks.  Government "attacks" like subpoenas are really only preventable by simply not having what they want (delete your logs if this is your worry) or being very careful not to be in the jurisdictions with those risks.  Targeted covert state-level or very organized criminal attacks are probably impossible to prevent.  If an org can kidnap 51% of the leadership and torture them for their passwords, they're in, no matter what.  So it's probably easier not to try to secure against that, or not to have anything likely to be attacked that way.



(after writing this I noticed that you're asking for resources, not advice...)

How good are you at system admin stuff? A massive gaping potential issue is that you're using third-party services for both hosting your code (which could potentially be extracted via newer versions of Copilot, not to mention the thousands of GitHub employees) and your discussions (how much do you trust Discord to secure your data? How afraid are you of man-in-the-middle attacks?). Both are free services, which means you're the product.

I have no idea as to whether they are better, but here is a random list of alternatives to Discord. Or you could just go with Signal, which seems more trustworthy, at the cost of convenience.

Hosting your own git server gets rid of the issue of Microsoft accessing your code, at the cost of a lot more bother with setting things up and ensuring you have a secure server. Though I'm guessing a simple OpenBSD + git setup would totally suffice, assuming that you only use that machine for git. This would limit the git attack vector to SSH/OpenBSD - both of which would require a lot of work to break, at least at non-NSA levels.

The above would get you to the Improving level when it comes to transmission of the data. Storage is another issue. I see there are some encryption projects for git, but I don't know anything about them. You could assume that remote access to the git server is unlikely and leave it at that, but it's worth considering physical access, though a secure box with a lock would help a lot with that. Still not Adequate, though. Adequate at a minimum would require you to physically work from the same location and have discussions in person to minimize the risk of being overheard. Though it adds a nice juicy target for bugging, etc.

How do you run code? If it's on a remote server, then you're back to trusting whoever is hosting it. If you do it locally, then you have to trust yourself (or your distro) to set up the appropriate security. Are you more worried about something getting in, or something getting out?

This is asking more questions than giving specific advice on which specific practices are appropriate for (one of) the four levels. I will post an example of specific advice to get this started.



(this is intended as an example of the template to use and not as serious advice - I googled some results and made it up on the spot) 

Level: Token

How to Secure a Website from Hackers [13-Step Guide]


Describes specific steps to secure a website (e.g., blog or e-commerce webshop) that runs as an installed application on a server that you control. Requires an understanding of web applications, internet basics (URLs), possibly installing applications on Linux and running Linux commands.  

Alternatives: I googled "how to secure a webshop" and looked at the first hits. The above one was the best. Here are two others that are simpler: 

8 comments

In my view, the field of cybersecurity is currently very far from what "theoretically perfect security" would look like. I am not sure how far ahead private knowledge is on the topic, but publicly, cybersecurity seems to focus on defending against security holes already demonstrated to be exploitable, and on providing some probabilistic defense against some others. This seems to work well in practice, though I don't know why. (Maybe highly motivated threat actors with sufficient resources simply don't exist?)

Conventional approaches work well if your adversary is limited, but Eliezer gives good arguments for why alignment orgs should seriously beef up security.

If you take this to the extreme and allow an unrestricted attacker (like an AGI), you pretty much turn security into the builder-breaker game. Let me demonstrate what I mean by playing it against myself:

  • B: We have a private GitHub project.
  • A: You don't know anything about GitHub, I could easily just hack them or have an insider at Microsoft.
  • B: Okay, we will self host a GitLab instance.
  • A: Where are you hosting it, Azure? Remember, I have an insider at MS. (And at every other cloud provider as well.)
  • B: Okay, we will buy our own server, stick it in a colocated data center.
  • A: How are you buying the server? I have active surveillance on you and can tamper with any package you receive. I will install a hardware backdoor in the server.
  • B: Fine, I will buy a Raspberry Pi, carefully inspect it for any hardware modifications, and compare every trace on the PCB to the schematics.
  • A: I have an insider at Broadcom giving me access to their hardware backdoor built into all of their SoCs.
  • B: Hm... ok, we will use a RISC-V SoC instead that is completely open source.
  • A: Back to the software side: GitLab is a large piece of software, you don't seriously believe that it doesn't have any 0-days?
  • B: True. Well GitLab is far too large to properly audit, so I will just sandbox the whole thing. I will put it in its own container, and only allow it to be accessed over our VPN.
  • A: Hm... well I can backdoor GitLab itself as well. I will coerce the maintainer of some obscure Go package that GitLab depends on to add my backdoor.
  • B: Well, but I can just restrict internet access to the GitLab container, what good is your backdoor then?
  • A: Right, right. Well... have you heard that CPUs can be used as radio transmitters?
  • B: Easy, just wrap the whole server in aluminium foil. Anything else?
  • A: We are not done with software yet. Linux has remotely exploitable 0-days as well. And good luck auditing the whole Linux codebase.
  • B: Okay, we will switch to Xen. (Or somehow turn seL4 into a hypervisor.) One domain runs GitLab with Linux without network access, another our custom audited ring 0 WireGuard stack.
  • A: Fine, let's assume for now that your software is secure. So have you heard of this company called TSMC that produces most of the world's chips...

And so on. Just like for alignment, I don't really know if this game can be won by the builder.

I know. But I'm not aiming for the Excellent level.

Yeah, I know you are looking for more practical advice here, that's why I posted this as a comment instead of an answer.

Eventually someone will have to aim for the "Excellent" level though (even if not against humans, surely against an AGI), and I just wanted to highlight that this is very much an unsolved problem.

Eliezer commented how someone in charge of security must have a natural security mindset, something most people, including technology experts, do not (which matches my observations). Your LinkedIn profile shows a wealth of technical expertise, but not much in terms of opsec. I wonder if this is an issue.

Valid point. I guess I have ordinary paranoia only. At my current company, where we do well in the pentests, I have a specialist for security, and using specialists will be my approach when we scale beyond the Improving level. 

Security experts are great at details and evaluation, but you can never delegate the question of priority.  Beyond the basics (improving level), there's not a lot of low-hanging fruit - mitigations aren't free, and threats aren't so dire as to be obviously necessary.  The question of what a mitigation costs and what shape of protection it brings becomes the main decision point.  

Once you've done the fairly standard basics (solid access control and logging, encryption in flight and at rest, regular and traceable processes for deployments and operational changes, etc.), there are only tradeoffs that remain - generally things that take time/energy away from your actual mission, or worse, things that get in your way and slow you down as you pursue your mission.  Many of these will be worthwhile for at least some of your data/operations.  But many won't be, or at least won't be this year.

How to choose a secure messenger for your project.

Here is an overview of the security features of all major messengers: 

From my reading of the table, I would classify the messengers as follows:

Token: Google, Apple, Amazon, Element/Riot, Viber

(those with encryption enabled by default and no active spying)

Improving: Signal, Threema, Wire

(those recommended in the article)

Adequate: Session

(recommended in the article, open source, and allows anonymous sign-up)

Excellent: Maybe a self-hosted Matrix installation that is configured by a paranoid security professional.
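
The classification rule I used can be sketched as a small Python function. This is only my reading of the criteria in parentheses above, using the level names from the question's four-level scale (Token, Improving, Adequate, Excellent). The `MessengerFeatures` fields are hypothetical names I made up, not columns from the overview, and treating "self-hosted and hardened" as a single flag is a loose simplification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MessengerFeatures:
    """Hypothetical feature flags; fill them in from whatever comparison you trust."""
    encrypted_by_default: bool
    no_active_spying: bool
    recommended_in_overview: bool
    open_source: bool
    anonymous_signup: bool
    self_hosted_and_hardened: bool = False  # e.g. configured by a security professional


def classify(f: MessengerFeatures) -> str:
    """Map feature flags to a level, checking the strictest level first."""
    if f.self_hosted_and_hardened:
        return "Excellent"  # assumption: self-hosting plus expert hardening is decisive
    if f.recommended_in_overview and f.open_source and f.anonymous_signup:
        return "Adequate"
    if f.recommended_in_overview:
        return "Improving"
    if f.encrypted_by_default and f.no_active_spying:
        return "Token"
    return "Below Token"


if __name__ == "__main__":
    # A made-up messenger, not a real product.
    example = MessengerFeatures(
        encrypted_by_default=True,
        no_active_spying=True,
        recommended_in_overview=True,
        open_source=False,
        anonymous_signup=False,
    )
    print(classify(example))
```

Writing the rule down like this makes it easy to disagree with: change one condition and see which messengers move between levels.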