Show HN: Mix – Open-source multimodal agents SDK
Why we built it:

• Claude Code: great for coding, but no video/audio support, and localhost only

• OpenAI SDK: single-model, no native multimedia tools

• Both: no integrated DevTools for debugging agent reasoning
So we built Mix as an alternative for multimodal applications:

• Native video/audio/PDF analysis tools (Gemini for vision, Claude for reasoning)

• Multi-model routing instead of single-provider lock-in

• One-command Supabase setup for cloud deployment (vs. localhost only)

• HTTP architecture that enables visual DevTools alongside agent workflows (rough client sketch below)

• Go backend: 50-80% lower memory footprint than Node.js, efficient for concurrent agent sessions. Python and TypeScript clients are available
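To make the HTTP architecture concrete, here is a minimal sketch of what driving the server from Python might look like. The endpoint paths, JSON fields, and default port are illustrative assumptions, not Mix's documented API:

    # Hypothetical sketch only: the endpoints, field names, and
    # default port below are assumptions, not Mix's documented API.
    import requests

    BASE = "http://localhost:8080"  # assumed local server address

    # Start an agent session (hypothetical endpoint)
    session = requests.post(f"{BASE}/sessions", json={"model": "claude"}).json()

    # Send a multimodal prompt that attaches a local video file;
    # per the post, vision inputs route to Gemini
    reply = requests.post(
        f"{BASE}/sessions/{session['id']}/messages",
        json={
            "text": "Summarize the key scenes in this clip",
            "attachments": ["clip.mp4"],
        },
    ).json()
    print(reply["text"])

The shape would be the same from the TypeScript client; the point is that everything goes over plain HTTP, which is what lets the visual DevTools sit alongside an agent session.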
Example use cases in the demo video: a portfolio analyzer that reads Excel files and generates charts, and a YouTube search agent that finds and edits video clips.
GitHub: https://github.com/recreate-run/mix

Demo video: https://youtu.be/IwgKt68wQSc
Would appreciate feedback, especially from folks building multimodal agents.
Do you support streaming responses with the HTTP architecture? And does the DevTools client connect via WebSocket or polling?