Ask HN: If I cancel Codex today whats the next best local inference agent?

bigyabai 10 hours ago

For local inference? It entirely depends on what your hardware is.

Check llmfit

verdverm 10 hours ago

OpenCode + vllm, model will depend on your hardware, but OpenCode also has a killer $10/m plan with quotas for some top tier open weight models.

I'm using qwen3.6 on a DGX spark, llama-cpp has prompt cache bugs for qwen/gemma models (among more being reported). Using my OpenCode-go sub when I want a bigger / more capable model