Sandboxing AI Agents in Linux

blog.senko.net

117 points by speckx 2 days ago

I use Leash [1] [2] for sandboxing my agents (to great effect!). I've been very happy with it: it provides strict policy-level control over all process-level and network-level activity, as well as full visibility and dynamic runtime controls via a web UI. Way better than bubblewrap imo.

I originally saw it here on HN and have been hooked ever since.

[1] Screenshot: https://camo.githubusercontent.com/99b9e199ffb820c27c4e977f2...

[2] https://github.com/strongdm/leash

Fun fact: Do you know what container / sandboxing system is in most widespread use? Not docker containers, certainly not bubblewrap, and not even full VMs or firecracker. It's Chrome tabs.

observationist 2 days ago

Using Chrome for anything seems like a security failure in itself. It's got great features, but damn do they come at a cost.
necovek 2 days ago

That's interesting, how does Chrome implement "sandboxing" in Windows and MacOS? For Linux, does it use the same underlying technology as Docker, Podman, LXD, LXC (cgroups, namespaces...)?
Or is it a custom "sandboxing" implementation that doesn't rely on system-level functions (e.g. a VM with restricted functions)?
If the latter, I wonder if something like JRE or .NET CLR is still out there in larger numbers, but obviously, Chrome does have billions of users.
- spijdar 2 days ago
  
  Yes, Chromium has "native" sandboxing on all those platforms: Windows [0], Linux [1] and MacOS [2].
  On Linux, Chromium uses both seccomp filtering and user namespaces (the technology that Docker/Podman use).
  The Windows and MacOS sandboxing strategies are more "interesting" because I've seen very few (open source) programs that use those APIs as extensively as Chromium. On Windows, it makes use of AppContainer [3] (among other things), while on MacOS it uses the sparsely documented sandbox API [4], which I think was based on code from TrustedBSD?
  [0] https://chromium.googlesource.com/chromium/src/+/HEAD/docs/d...
  [1] https://chromium.googlesource.com/chromium/src/+/HEAD/sandbo...
  [2] https://www.chromium.org/developers/design-documents/sandbox...
  [3] https://learn.microsoft.com/en-us/windows/win32/secauthz/app...
  [4] https://manp.gs/mac/7/sandbox
JCattheATM 2 days ago

> certainly not bubblewrap,
Eh, it might be bubblewrap given it's what flatpak uses.

sylvinus 2 days ago

This is the way to go! On my side I've built a very small `claude-vm` wrapper to run each instance in a VM with Lima: https://github.com/sylvinus/agent-vm

JeremyNT a day ago

I did similar with incus!
I'm convinced that VMs are the right primitive here, for now. Being able to give an agent full root and passing it in just the stuff you want it to have is super easy and it's extremely foolproof. I have my assistants free to install software, run docker, build their own nested VMs, etc. knowing that the boundary is sound and that no capabilities will ever be sacrificed.
I might switch to LXC to reduce the weight somewhat (easy with incus) but this requires providing a more limited set of tools (i.e. podman instead of docker).
bwrap is great, but you're stuck with the limitations of the environment, which depending on what you're doing may neuter the agent.

kernc 2 days ago

As a heads up and affirmation that the approach is correct, here's a small shell bubblewrap wrapper that boils the command line down to `sandbox-run claude --dangerously-skip-permissions`.

https://github.com/sandbox-utils/sandbox-run

KurSix a day ago

Unless you use --unshare-net, bwrap leaves the network wide open by default. The agent can not only accidentally delete a file, but also exfiltrate keys or download a malicious package

As a next step I'd add a network namespace (--unshare-net) and spin up a local HTTP proxy (mitmproxy) inside the sandbox to allow access only to Anthropic APIs and maybe PyPI/NPM, while blocking everything else
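
A minimal sketch of the first half of that, assuming bwrap is installed and unprivileged user namespaces are available. The `run_offline` name, the `DRY_RUN` switch and the example project path are made up for illustration, and binding `/` read-only is much looser than the per-file binds in the article. Note that `--unshare-net` leaves the sandbox with only a loopback interface, so as written the agent cannot reach its API either; the allowlisting proxy is not shown here.

```shell
#!/bin/sh
# Run a command under bwrap with the project directory writable,
# the rest of the filesystem read-only, and no network access.
# Set DRY_RUN=1 to print the bwrap command instead of executing it.
# (run_offline and DRY_RUN are hypothetical names for this sketch.)
run_offline() {
    project_dir=$1
    shift
    ${DRY_RUN:+echo} bwrap \
        --ro-bind / / \
        --dev /dev \
        --proc /proc \
        --tmpfs /tmp \
        --bind "$project_dir" "$project_dir" \
        --chdir "$project_dir" \
        --unshare-net \
        --die-with-parent \
        "$@"
}

# Dry run: show what would be executed for a placeholder project path.
DRY_RUN=1 run_offline /tmp/example-project true
```

Dropping `DRY_RUN=1` and replacing `true` with the agent command runs it for real; anything that needs the network will then fail until some proxy or allowlist is wired in.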

virtualritz 2 days ago

This one was posted here recently; works quite well for me:

https://github.com/lukehinds/nono

ATechGuy 2 days ago

I will ask what I've asked before: how to know what resources to make available to agents and what policies to enforce? The agent behavior is not predefined; it may need access to a number of files & web domains.

For example, you said:
> I don't expose entire /etc, just the bare minimum
How is "bare minimum" defined?

> Inspecting the log you can spot which files are needed and bind them as needed.
This requires manual inspection.

senko 2 days ago

Article author here. I used trial and error - manual inspection it is.
This took me a few minutes but I feel more in control of what's being exposed and how. The AI recommended just exposing the entire /etc for example. It's probably okay in my case, but I wanted to go more precise.
On the network access part, I let it fully loose (no restrictions, it can access anything). I might want to tighten that in the future (or at least disallow 192.168/16 and 10/8), for now I'm not very concerned.
So there's levels of how tight you want to set it.
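One way to express the "disallow 192.168/16 and 10/8" idea is a small check that a proxy or wrapper script could apply to a destination address before allowing a connection. A sketch in plain sh: the function name is made up, 172.16/12 is added because it is the third RFC 1918 private range, and where such a check would be hooked in is left open.

```shell
#!/bin/sh
# Return 0 if the address is in 10.0.0.0/8, 172.16.0.0/12 or
# 192.168.0.0/16, and non-zero otherwise.
# Assumes the argument is already a valid dotted-quad IPv4 address;
# no input validation is done. (is_private_ipv4 is a hypothetical name.)
is_private_ipv4() {
    case $1 in
        10.*|192.168.*)
            return 0
            ;;
        172.*)
            # Extract the second octet and check that it is in 16..31.
            second=${1#172.}
            second=${second%%.*}
            [ "$second" -ge 16 ] 2>/dev/null && [ "$second" -le 31 ]
            ;;
        *)
            return 1
            ;;
    esac
}

for ip in 10.1.2.3 192.168.0.7 8.8.8.8; do
    if is_private_ipv4 "$ip"; then
        echo "$ip: block"
    else
        echo "$ip: allow"
    fi
done
```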
- ATechGuy 2 days ago
  
  > I feel more in control of what's being exposed and how
  Makes complete sense. Thanks for your insights!
aflag 2 days ago

Ask the agent to bubblewrap itself

athrowaway3z 2 days ago

I'm launching a SaaS to create yet another solution to the AI Sandboxing problem in linux.

My friends and I have spent a lot of time quietly injecting support down into the kernel without anybody raising a flag, and we finally have the infrastructure in place to solve this problem.

We have also poisoned all the LLMs' training data with our approach, so our marketing is primed and we won't even need to teach Claude to use our tool.

We’re planning a soft launch this month, or maybe next month. Depending on how "in the vibe" (our new word for flow :) our team gets.

We’re calling it `useradd`.

Yes, the man page is intimidating, and the documentation is terrible. But once you're over the learning curve, it puts your machine into a kind of 'main frame' mode where multiple 'virtual teletypes' and users can operate on the same machine.

DM me if you want a beta key.

---

Sorry for the snark, but I cringe at the monuments to complexity I see people building. At least this solution is relatively simple and free. Still, I don't really see what it buys me.

tasuki 2 days ago

Well done. It took me all the way up to `useradd`...
Edit: too bad about your edit. The comment was just fine without it.
- athrowaway3z 2 days ago
  
  I wrote my comment to vent my disdain for all the circus projects filled with marketing blurbs and features lists for their overengineered vibeslop.
  OP is just sharing the cool utility he found, and how it solved a problem for him.
  It felt bad to leave them with the message that they shouldn't have shared it, or that he's a big part of the problem.
  
  senko 2 days ago
  
  OP here, no worries, loved the comment and appreciate the feeling :)
CuriouslyC 2 days ago

I get where this is coming from, and it's not a terrible solution, but VMs are still better in terms of security and isolation. Typical workstation systems are not designed to be secure from their own users, and frontier models are going to get scary good at cracking systems soon.
- carsoon 2 days ago
  
  Fully sandboxed VMs are more secure, but not everyone is looking for the most secure option. They are looking for the option that works best for them. I want to be able to share my development environment with the agent. I have a project with thirty 1 GB SQLite databases and one 30 GB one. I back them up daily and they can all be reconstructed from the code, but it takes a long time. When things change I don't want to have to copy them into a separate VM, bloating my storage and using excess resources, and then have to reconcile them. I want to share the same environment with my agent so we can work side by side.
  I would rather just have the agent not accidentally delete files outside of its working environment but I am not worried about malicious prompt injection or someone stealing my code.
  For me I see the LLM as a dumb but positive actor that is trying to do its best but sometimes makes mistakes, so I want to put training wheels on it while still allowing it to share my working space.
mystifyingpoi 2 days ago

`useradd` doesn't restrict network access.
- kaffekaka 2 days ago
  
  I have used a separate user, but lately I have been using rootless podman containers instead for this reason. But I know too little about container escapes. So I am thinking about a combination.
  Would a podman container run by a separate user provide any benefit over the two by themselves?
- eikenberry 2 days ago
  
  Without any credentials does network access matter?
senko 2 days ago

I love using different users for separating services I run on the same box!
For development, I want to be able to access/run/modify/delete the files alongside the AI agent. This can be done if groups and group permissions are set correctly (and the agent correctly chmods everything...), but that feels more fiddly than just isolating it with bubblewrap, systemd, or whatever, and preserving the uid/gid.
Just my 2c - it's great that we have options!
- necovek 2 days ago
  
  Hey Senko, did you consider using the ZFS or Btrfs snapshotting feature to simplify some of the things you need?
  For GH auth tokens, you could also pull them outside the sandbox: have the agent push to a local clone exposed to the host, and have the host (with no agent) automatically push on inotify events inside the repo. E.g. the agent has access to your /agents/scratchpad/my-git-repo, and the sync to the actual git hosting service like GH (or Launchpad ;) happens with a simple script outside it.

aflag 2 days ago

I don't know if I want to create an ad-hoc list of permissions. What I would like would be something like take a snapshot of my current workspace in a VM. Run claude there and let it go wild. After the end of the session, kill the box. The only downside is potentially syncing the claude sessions/projects. But I don't think that'd be too difficult.

secure 2 days ago

I recently blogged about how I do this using MicroVMs on NixOS: https://michael.stapelberg.ch/posts/2026-02-01-coding-agent-...
senko 2 days ago

> take a snapshot of my current workspace in a VM. Run claude there
Sounds like docker + overlayfs might fit the bill, as long as there's a base image that is close enough to what you need.
I don't think there should be One True Way to run these; everyone can set it up in the way that best fits their workflow.
- ushakov 2 days ago
  
  both Docker and bubblewrap are not secure sandboxes. the only way to have actually isolated sandboxes is by using VMs
  disclaimer: i work on secure sandboxes at E2B
  
  senko 2 days ago
  
  No disagreement from me. From the article:
  > Bubblewrap and Docker are not hardened security isolation mechanisms, but that's okay with me.
  Edit to add: my understanding is the major flaw in this approach is potential bugs in Linux kernel that would allow sandbox escape. Would appreciate your insight if there are some easier/more probable attack vectors.
  
  gf000 2 days ago
  
  What about cgroups? I know they are not exactly analogous, but to me that seems like a pretty decent solution.
  
  its-summertime 2 days ago
  
  Do you have more information on how to set up such VMs?
  
  ushakov 2 days ago
  
  for personal use, many ways: Vagrant, Docker Sandbox, NixOS VMs, Lima, OrbStack.
  if you want multi-tenant: E2B (open-source, self-hosted)
  
  eikenberry 2 days ago
  
  Hashicorp has mostly abandoned Vagrant, so I'd avoid it.
fsflover 2 days ago

> What I would like would be something like take a snapshot of my current workspace in a VM.
Sounds like you may be interested in Qubes OS, which runs everything in VMs.

schmuhblaster 2 days ago

My attempt at a portable solution: Linux VM inside WASM for sandboxed execution: http://agentvm.deepclause.ai

Minimal dependencies, but not as fast as containers or bubblewrap.

ashishb 2 days ago

I ended up writing my own sandbox so that it works on macOS as well and can be used for other tools (not just AI agents) too.

https://github.com/ashishb/amazing-sandbox

ATechGuy 2 days ago

Curious to know what made you DIY this?
- ashishb 2 days ago
  
  Tell me a better alternative that allows me to run, say, 'markdown lint', an npm package, on the current directory without giving access to the full system on Mac OS?
  
  ATechGuy 2 days ago
  
  sandbox-exec -f curr_dir_access_profile.sb markdownlint
  
  ashishb 2 days ago
  
  So you have to install the npm package markdownlint on your machine and let it run its potentially dangerous postinstall step?
  
  ATechGuy 2 days ago
  
  You can customize curr_dir_access_profile.sb to block access to network/fs/etc. Why is this not enough?
  
  ashishb 2 days ago
  
  Some tools do require Internet access.
  Further, I don't even want to take the risk of running 'npm install markdownlint' anymore on my machine.
  
  ATechGuy 2 days ago
  
  I understand the concern. However, you can customize the profile (e.g., allowlist) to only allow network access to required domains. Also, looks like your sandboxing solution is Docker based, which uses VMs on a Mac machine, but will not use VMs on a Linux machine (weak security).
  
  ashishb 2 days ago
  
  That's why I wrote my own sandbox. Everyone hand-waves these concerns.
  Further, I don't know why docker is weak security on Linux. Are you telling me that one can exploit docker?
  
  KurSix a day ago
  
  dockerd is a massive root-privileged daemon just sitting there, waiting for its moment. For local dev it’s often just unnecessary attack surface - one subtle kernel bug or namespace flaw, and it’s "hello, container escape". bwrap is much more honest in that regard: it’s just a syscall with no background processes and zero required privileges. If an agent tries to break out, it has to hit the kernel head-on instead of hunting for holes in a bloated docker API

jauntywundrkind 2 days ago

Really well targeted!

I'd been thinking of using toolbox or devcontainers going forward, but having to craft containers with all my stuff sounds so painful, feels like it would become another full-time job to make containers

Bubblewrap & passing in a bunch of the current system sounds like a great compromise!

I do wonder what isolation something like systemd-run can offer, if that is enough.

Part #2 to me, I also want observability as to what the agent changed. That was one place where containers are such a clear & huge advantage! Having an overlay that contains the changes to the filesystem is so explicit. There are also projects like agentfs, which offers a FUSE filesystem backed by Turso DB (SQLite-compatible).

dgl 2 days ago

> Part #2 to me, I also want observability as to what the agent changed.
You could potentially combine https://github.com/binpash/try with bubblewrap (I'm not sure how well they compose and as the docs say it isn't a full sandbox).
The good (and bad because it's confusing and can lead to surprises if misconfigured) thing about Linux containers is all the pieces of containers can be used independently. The "try" tool lets you use the overlay part of containers on your host system, just like Bubblewrap lets you combine the namespacing parts of containers with your host system.
eikenberry 2 days ago

Bubblewrap supports overlayfs mounts [1]. Seems like you should be able to replicate that flow with it.
[1] https://github.com/containers/bubblewrap/issues/412
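A rough sketch of what that could look like, assuming a bubblewrap release that includes the overlay options (`--overlay-src` and `--overlay`) and a kernel that allows overlayfs inside user namespaces. The `run_with_overlay` name, the `DRY_RUN` switch and the directory names are made up for illustration. The project directory is the read-only lower layer, and everything the agent writes lands in a separate "upper" directory that can be inspected or discarded afterwards.

```shell
#!/bin/sh
# Run a command with the project directory mounted as an overlay:
# reads come from the real project, writes go to CHANGES_DIR/upper.
# overlayfs requires the work directory to be on the same filesystem
# as the upper directory. Set DRY_RUN=1 to print instead of executing.
# (run_with_overlay and DRY_RUN are hypothetical names for this sketch.)
run_with_overlay() {
    project_dir=$1
    changes_dir=$2
    shift 2
    ${DRY_RUN:+echo} mkdir -p "$changes_dir/upper" "$changes_dir/work"
    ${DRY_RUN:+echo} bwrap \
        --ro-bind / / \
        --dev /dev \
        --proc /proc \
        --tmpfs /tmp \
        --overlay-src "$project_dir" \
        --overlay "$changes_dir/upper" "$changes_dir/work" "$project_dir" \
        --chdir "$project_dir" \
        --die-with-parent \
        "$@"
}

# Dry run with placeholder paths: prints the mkdir and bwrap commands.
DRY_RUN=1 run_with_overlay /src/example-project /var/tmp/agent-changes true
```

After a real run, the contents of the upper directory are the files the agent created or modified, which covers the "what did it change" question without touching the original tree.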

enum 2 days ago

I just have an unprivileged secondary local account and do ssh dummy@localhost.

Is this wrong?

waerhert 2 days ago

Nice approach! On Ubuntu 24.04 I had to loosen some AppArmor protections by creating a file:

  > cat /etc/apparmor.d/bwrap
  #include <tunables/global>

  /usr/bin/bwrap flags=(unconfined) {
    userns,
  }

amluto 2 days ago
I despise AppArmor and SELinux, especially in cases where they actively get in the way of security like this.
But you shouldn't need to make a global change. Do this:
```
    if [[ -f /proc/$$/attr/exec ]]; then
        # AppArmor is active.  Request "unconfined" for our next exec.
        echo 'exec unconfined' 2>/dev/null >/proc/$$/attr/exec
    fi
    exec ...
```
Or I think you can do this:
```
    $ setpriv --apparmor-profile=unconfined [command]
```
(You'd think I'd be more sure of the exact circumstances under which the latter works given that I literally wrote setpriv... At the very least, it will error out if apparmor is not running, which is mildly obnoxious.)

kwar13 2 days ago

I went exactly the same route: https://kaveh.page/blog/claude-code-sandbox

HorizonXP 2 days ago

Is this BSD jails' time to shine?

jhancock 2 days ago

I've started using a container (podman) which is just for the AI tools. I start it up for Codex etc. and give it access to the appropriate code directory outside the container.

Anyone else using this approach? Ideas on improvements?
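
For reference, a setup like the one described can be sketched roughly as follows, assuming rootless podman. The `run_agent_container` name, the `DRY_RUN` switch and the `localhost/agent-tools:latest` image are placeholders, not anything from the comment above. `--userns=keep-id` keeps the host uid/gid inside the container so files the agent writes stay owned by the normal user.

```shell
#!/bin/sh
# Start a throwaway container that only sees one project directory,
# mounted at /work. Set DRY_RUN=1 to print instead of executing.
# (run_agent_container, DRY_RUN and the image name are hypothetical.)
run_agent_container() {
    project_dir=$1
    image=$2
    shift 2
    ${DRY_RUN:+echo} podman run --rm -it \
        --userns=keep-id \
        -v "$project_dir:/work" \
        -w /work \
        "$image" "$@"
}

# Dry run with a placeholder path and image: prints the podman command.
DRY_RUN=1 run_agent_container /tmp/example-project localhost/agent-tools:latest sh
```

On SELinux systems the volume may additionally need a `:z` or `:Z` label suffix, which is left out here.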

muggesmuds 2 days ago

Would love this for MacOS

davidcann 2 days ago

My app does this on macOS! https://multitui.com
senko 2 days ago

There's https://code.claude.com/docs/en/sandboxing that uses something called Seatbelt on Mac and bubblewrap (the same thing I used here) on Linux.
No idea how customizable that is.
ashishb 2 days ago

Try https://github.com/ashishb/amazing-sandbox

Jayakumark 2 days ago

Saw something on HN last week that uses bubblewrap as well: github.com/Use-Tusk/fence

charcircuit 2 days ago

If you have ssh installed, with network access it can ssh localhost to escape the sandbox.

qwertox 2 days ago

You can consider these agents criminals, or treat them like babies. Both can do harm for a while, but one offers a future.
senko 2 days ago

Don't give it access to your ssh keys!
- charcircuit 2 days ago
  
  Yes, it should have its own dedicated key instead of sharing one of your own.
dist-epoch 2 days ago

`ssh localhost` doesn't work for me. maybe because I have enabled only key-based ssh and my user key is not in authorized_keys? am I missing something?
- charcircuit 2 days ago
  
  You are right in that it would still need to authenticate.

aktuel 2 days ago

I like this approach for Nix: https://dev.to/andersonjoseph/how-i-run-llm-agents-in-a-secu... It also makes it easy to give the agent access to only the tools it actually needs.