Show HN: Spec-Driven Development Workflow for Claude Code

20 points by sermakarevich 4 days ago

Spec-driven development lets you squeeze more out of coding agents thanks to a few strong concepts:

- Decomposition across two dimensions. First you generate specs in multiple steps (requirements, code analysis, design), then you split the task into multiple subtasks and implement them one by one.

- You clear the context between every step: after spec generation and after each subtask implementation. This keeps cost low and the context clear and focused, which boosts performance.

- Specs written to disk help with information persistence.

- Delivering specs layer by layer helps you catch early when the agent has misunderstood you.
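The concepts above can be sketched roughly as follows. This is only an illustration and not the sddw implementation: `run_step` is a hypothetical stand-in for an agent call, and the file names and the `implement` helper are assumptions. The point it shows is that each step starts from an empty context and reads only the specs that earlier steps persisted to disk.

```python
from pathlib import Path

# Spec steps named in the post; the numbered file names are an assumption.
SPEC_STEPS = ["requirements", "code_analysis", "design"]


def run_step(step, context):
    """Hypothetical stand-in for one agent call; returns the text it would write."""
    return f"# {step}\nbuilt from {len(context)} earlier spec(s)\n"


def load_specs(spec_dir):
    """Fresh context: nothing but the spec files already on disk."""
    return [p.read_text() for p in sorted(spec_dir.glob("*.md"))]


def generate_specs(spec_dir):
    """First dimension: produce the specs layer by layer."""
    for index, step in enumerate(SPEC_STEPS):
        context = load_specs(spec_dir)  # context is rebuilt for every step
        (spec_dir / f"{index}_{step}.md").write_text(run_step(step, context))
    return sorted(p.name for p in spec_dir.glob("*.md"))


def implement(spec_dir, subtasks):
    """Second dimension: implement subtasks one by one, each from a clean context."""
    return [run_step(f"implement {task}", load_specs(spec_dir)) for task in subtasks]
```

Because every step reloads the specs from disk, a human can review or correct a spec file between steps before the next layer is built on top of it.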

Repo with a Claude plugin for spec-driven development: https://github.com/sermakarevich/sddw

siliconc0w 4 days ago

I've been using an agent flywheel workflow, which is similar. Still not completely sold - it feels a bit like using power tools to shape wood, but the final product needs a lot of sanding and polishing.

I thought initially this meant that the spec wasn't detailed enough but the problem is more agent adherence and laziness.

  • dwb 4 days ago

    Exactly. A detailed-enough spec is just code that you can’t run. If models and agents got to a point where doing a good job in Claude Code plan mode meant that I didn’t have to keep an eye on them in implementation, then I would be interested in some bigger spec-driven thing like this. That is still far from the case today for me.

  • sermakarevich 4 days ago

    Agentic coding works especially well for me when the application is platform-like. You have a core and you extend it with standardized plugins. Once a few plugins are already there, it's hard to tell whether the next plugin was written by an agent or by a human.

    Also, sddw works nicely with a fleet of agents: https://news.ycombinator.com/item?id=48226033. I just insert the sequence of sddw steps into the queue and take a nap.

  • sermakarevich 3 days ago

    In this case I am trying to simplify and decompose the task, keep my context clean and focused, and validate my instructions.

    I look at this as if there is a boundary of complexity beyond which agents start behaving funky - we just need to find it. It's obvious that with simple tasks and clear instructions, agents don't have issues with adherence. The issues start at some point when complexity is too high. We need to find this boundary and try to push it with the approaches available on our side.

sermakarevich 2 days ago

I started using sddw with fleet - another app I wrote to orchestrate a fleet of coding agents: https://news.ycombinator.com/item?id=48256389. I just submit a sequence of tasks chained into a sequential execution:

- fleet config set model=sonnet coder=claude

- fleet bd create --title "/sddw:requirements <task-name> task description is in TASK.md --auto"

- fleet bd create --title "/sddw:code_analysis <task-name> --auto" --deps <prev task id>

- fleet bd create --title "/sddw:design <task-name> --auto" --deps <prev task id>

- fleet bd create --title "/sddw:implement <task-name> --task 1 --auto" --deps <prev task id>
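A small sketch of how the chain above could be generated for any task name. It only builds the argument lists and prints them; nothing is executed. The thread does not show how fleet reports the id of a created task, so `--deps` keeps the `<prev task id>` placeholder, and `build_chain` is a hypothetical helper, not part of fleet or sddw. The `fleet config set` line is left out since it is a one-time setting.

```python
import shlex
from typing import List

# Step names copied from the commands in the comment above.
SDDW_STEPS = ["requirements", "code_analysis", "design", "implement"]


def build_chain(task_name: str, prev_id: str = "<prev task id>") -> List[List[str]]:
    """Return one argv list per sddw step, each depending on the previous one."""
    commands = []
    for index, step in enumerate(SDDW_STEPS):
        if step == "requirements":
            title = f"/sddw:{step} {task_name} task description is in TASK.md --auto"
        elif step == "implement":
            title = f"/sddw:{step} {task_name} --task 1 --auto"
        else:
            title = f"/sddw:{step} {task_name} --auto"
        argv = ["fleet", "bd", "create", "--title", title]
        if index > 0:
            # Placeholder only: the real id has to come from the previous create.
            argv += ["--deps", prev_id]
        commands.append(argv)
    return commands


if __name__ == "__main__":
    for argv in build_chain("my-task"):
        print(shlex.join(argv))
```

Keeping the commands as argument lists avoids quoting mistakes in the long `--title` strings if they are later passed to `subprocess.run`.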

aaronbrethorst 4 days ago

I'd love to see a comparison with other spec-driven development tools for Claude, like OpenSpec and Superpowers. How does this compare and contrast with them?

  • sermakarevich 4 days ago

    I think those tools would be good as well. The point of sddw for me was to be able to adjust sdd to the typical size of your projects. GSD was great, but probably for gigantic projects only. For mid-size ones it's an overkill of tokens.

NBenkovich 4 days ago

Decomposition is definitely needed as tasks become more complicated. I’d prefer to define the desired state, get a decomposed breakdown of the gap between the current and desired states, and let agents figure out how to close it themselves, rather than manually operating each intermediate task. As a project owner, I’d love to work at the desired-state level.

zihotki 4 days ago

Are there any benchmarks/evals to see if this particular one is doing any good compared to, let's say, plan mode? How do you measure that it actually works and that you aren't wasting tokens and your personal time?

I fail to see any backing for the claims of 'boosting performance' and 'keeping costs low'.

  • sermakarevich 4 days ago

    fair

    Here are slides explaining it in more detail: https://docs.google.com/presentation/d/1SjKXF7hkoqyiN9-3tBGY...

    When plan + code mode works, there is no need to change it. When it does not, because the feature is complicated, then we need something else. That's when sdd is applicable. I use it for mid+ size projects only.

    Measuring is a bit of a subjective thing here. But when plan mode + code does not work and sdd works (because of double decomposition), you get what you need.

    Token consumption is lower because you can wipe your context after every step or implemented subtask. The scope needed to deliver the specs is bigger, however. But confusion is much lower, as your context is focused on a single step or subtask.
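The token argument above can be made concrete with a toy calculation. This is not a measurement: the numbers are made up for illustration, both functions are hypothetical, and the model is deliberately crude (one request per step, no prompt caching). It compares re-sending an ever-growing context against starting each step from a base prompt plus the spec files.

```python
def tokens_single_context(step_outputs, base_prompt):
    """Input tokens when every step re-reads everything produced so far."""
    total = 0
    context = base_prompt
    for produced in step_outputs:
        total += context      # the whole accumulated context is sent again
        context += produced   # this step's output stays in the context
    return total


def tokens_with_reset(step_outputs, base_prompt, spec_size):
    """Input tokens when the context is wiped and each step reads only the specs."""
    return sum(base_prompt + spec_size for _ in step_outputs)


if __name__ == "__main__":
    # Illustrative, made-up sizes: 5 steps that each add 20k tokens of output.
    outputs = [20_000] * 5
    print(tokens_single_context(outputs, base_prompt=5_000))
    print(tokens_with_reset(outputs, base_prompt=5_000, spec_size=8_000))
```

In this toy model the single-context total grows roughly quadratically with the number of steps, while the reset variant grows linearly and pays a fixed spec-reading overhead on every step, which matches the trade-off described above.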