I noticed that some AI-safety-focused people are very active users of coding agents, often letting them run completely unrestricted. I believe this is a bad standard to have.
To be clear, I do not think that an unsupervised Claude Code run will take over the world or do anything similar. But risk profiles are not binary, and I believe that this new interface should be used more cautiously.
I will explain my thoughts by going over the standard reasons someone might give for using --dangerously-skip-permissions:
#1 It's not dangerous yet
Many people are using Claude Code, Codex, Cursor, etc., and there haven't been any catastrophic accidents yet, so this is a reasonable claim.
Even if today's models are safe, I expect they will get more dangerous down the line. The important thing is knowing where your red lines are. If the models get slightly better every two months, it is easy to get frog-boiled into riskier behavior by not changing your existing habits.
If you are fine with the small chance of all your files being deleted, that's okay; just define this explicitly as an acceptable risk. A year from now the consequences of a misbehaving agent could be worse and you might not notice.
#2 The benefits outweigh the risks
Modern-day agents are very, very useful, so you should consider their advantages for your productivity, especially if your work is important.
I think this is true, and I also think that agents are only going to get more useful from this point on. If your current workflow is autonomously running simulations and building new features, your future workflow might be autonomously writing complete research papers or building dozens of products. This will feel amazing, and be very effective, and could also help the world a lot in many important ways.
The reason people are worried about superintelligence is not that it won't contribute to personal productivity.
The risks and benefits of these powerful, general-purpose tools arise from the same set of capabilities, so a simple utilitarian calculus is hard to apply here. The benefits are concrete and quantifiable, while the risks are invisible - and they always will be. Instead of trying to make sure the tool is more beneficial than harmful, I believe that when working with agents it is better to set up hard limits on the blast radius, and then maximize your productivity within those limits.
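To make "hard limits on the blast radius" slightly more concrete, here is a minimal sketch of one possible containment pattern: launching the agent inside a disposable container so that the only host data it can touch is the project directory. The wrapper script, the image name my-agent-sandbox:latest, and the resource caps are illustrative assumptions rather than a description of any particular product's sandboxing; treat it as one example of bounding the damage in advance, not a recommended setup.

```python
# Minimal sketch (an illustration, not a recommended setup): run the agent in a
# throwaway Docker container so the only host data it can reach is the project
# directory. "my-agent-sandbox:latest" is a hypothetical image with the agent
# CLI preinstalled.
import subprocess
from pathlib import Path

PROJECT_DIR = Path.cwd()             # the only host directory exposed to the agent
IMAGE = "my-agent-sandbox:latest"    # hypothetical image, an assumption of this sketch

cmd = [
    "docker", "run", "--rm", "-it",
    "--memory", "4g", "--cpus", "2",        # cap resource usage
    "-v", f"{PROJECT_DIR}:/workspace",      # mount only the project directory
    "-w", "/workspace",
    IMAGE,
    "claude", "--dangerously-skip-permissions",  # unrestricted *inside* the box
]

# Network access is left on because the agent needs to reach its model API;
# a stricter setup would add an egress allowlist to limit exfiltration risk.
# Filesystem-wise, a misbehaving agent can at worst trash the project
# directory and the disposable container, not the rest of the machine.
subprocess.run(cmd, check=True)
```

The specific mechanism matters much less than the property it buys you: the worst case is bounded before the agent starts, and you can then maximize productivity within that bound.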
#3 If anyone builds it
There is the MIRI viewpoint of a big, discrete leap in capabilities that makes everything much more dangerous. Here is a simplified view:
Current models are not catastrophically dangerous as they are not capable of completely autonomous operations and recursive self-improvement.
At some point (1 year or 20 years from now), someone will build an AGI system that is capable of those things.
If that happens, everyone dies.
If we are in such a world, then how you use your AI models doesn't really matter:
If the model is not capable enough for self-preservation and self-improvement, then it won't cause catastrophic harms, even if we let it freely roam the internet.
If the model is capable enough, then we lost the moment it was activated. You can cut off its network access and put it in a box on the moon. As long as it has any communication with the outside world, it will figure out some way to beat us.
Given this view of the world, how your local user permissions are configured no longer seems relevant.
This is somewhat of a strawman: MIRI didn't say that current models are not dangerous, and, as far as I am aware, they are not in favor of running today's coding agents unsupervised. But this mindset does seem common in the field - the only important thing is preparing for AGI, a singular moment with existential consequences, so lower risk profiles don't matter.
The endpoint probably does look like this (extremely capable minds that can radically alter our society according to their values), but the real issues begin much earlier. Unrestricted "non-AGI" agents may still run malicious code on your machine, steal sensitive information, or acquire funds. Even if they don't destroy the world, we will have a lot of problems to deal with.
Similarly, people should still make sure their roofs are free of asbestos even if their country is threatened by a nuclear-weapon state.
(Also, cybersecurity failures can contribute to existential risks. In the case of a global pandemic, for example, we would need good worldwide coordination to respond well, which would require resilient digital infrastructure.)
Conclusion
If you use Claude to autonomously run and test code on your computer, you'll probably be fine (as of January 2026). You should still be wary of doing that. Some risk is acceptable, but it should come with an explicit model of which risks are allowed.
Agent security is an emerging field (I have heard of several startups working on this, and a dozen more are probably setting up their LinkedIn pages as we speak). I am not well-informed about this field, which is why I haven't offered any concrete suggestions for how to solve it. My point is that this is a relevant and important topic that you should consider even if your focus is on the longer-term consequences.
In worldviews of gradual takeoff or near-term risk from distributed LLM agents, it makes sense to have a worldwide norm of handling current agents responsibly.
People reading this are in a good position to understand the risks and push consumer demand for agent-security solutions or develop hacks and tools.
That push might just be too small of an effect to matter, though.
There are also minor gains from repeatedly applying the security mindset.
People holding different worldviews should perhaps still take this seriously, since people with different worldviews should work together and err on the risk-averse side.