Show HN: GPU-accelerated sandboxes for running AI coding agents in parallel [video]

youtube.com

1 point by lewq 9 hours ago

Hey HN! I've been building a system to run multiple AI coding agents in parallel, each in its own GPU-accelerated sandbox. The problem I kept running into: when you're working with AI assistants on real projects, you want them doing multiple things at once (refactoring module A, implementing feature B, writing tests for C), but coordinating all of that is a mess. You lose track of what's happening, agents step on each other, and you end up babysitting instead of building.

So I built HADES - think of it as a desktop environment for each agent, running on GPU-accelerated VDI backends. You write high-level specs, spin up multiple agents in isolated environments, and each one works on its own branch with its own MCP tools and RAG sources.

The demo shows me creating two Snake games simultaneously (Python and Rust versions), then jumping into one with Moonlight streaming to pair-program with the agent in real time. The latency is wild - sub-20ms, feels completely local. Each agent writes out structured spec files as it works, so you get a 30,000-foot view of everything happening in parallel. When they're done, they open PRs that humans review and merge.

Built this on top of our HPC infrastructure (we normally deploy ML workloads on Slurm clusters for supercomputing customers). Turns out the same tech that schedules thousands of GPU jobs is pretty good at orchestrating AI agents.

Currently running a private beta. Also shout-out to the folks in our Discord who've already got this running on macOS - that's not something I've even tested yet!

Tech stack: Slurm, NVIDIA Container Toolkit, Wayland screencopy, Moonlight protocol, Docker/Harbor.

Would love feedback, especially from folks doing multi-agent development or running dev environments at scale.
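To make the fan-out pattern concrete, here's a minimal Python sketch of the shape of the orchestration loop - not the actual HADES code, which runs agents in GPU sandboxes via Slurm. The task list, field names, and `run_agent` placeholder are all hypothetical; the point is just the pattern of launching agents concurrently, each tied to its own branch, and collecting the structured spec records they emit for the dashboard view:

```python
import json
from concurrent.futures import ThreadPoolExecutor

# Hypothetical task specs; in the real system each would map to an
# isolated GPU sandbox with its own MCP tools and RAG sources.
TASKS = [
    {"branch": "agent/snake-python", "goal": "Snake game in Python"},
    {"branch": "agent/snake-rust", "goal": "Snake game in Rust"},
]

def run_agent(task):
    # Placeholder for: provision sandbox, check out the agent's branch,
    # run the agent loop against the high-level spec. Here we just
    # return the kind of structured status record each agent writes out.
    return {
        "branch": task["branch"],
        "goal": task["goal"],
        "status": "pr-ready",  # agents finish by opening a PR for human review
    }

def fan_out(tasks):
    # Run every agent concurrently and collect their spec records,
    # giving the "30,000-foot view" of all parallel work.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        return list(pool.map(run_agent, tasks))

if __name__ == "__main__":
    for spec in fan_out(TASKS):
        print(json.dumps(spec))
```

In the real system the concurrency comes from the scheduler (Slurm jobs, one per sandbox) rather than a thread pool, but the contract is the same: one branch per agent in, one structured spec record out.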