Show HN: I made a drop-in Voice Mode for AI startups

3 points by andupotorac 4 months ago

Text inputs are too slow for complex prompting if you're vibe coding or generating media. I built a full-stack Voice Mode component (UI + logic + transcription) for React/Next.js. It handles the awkward browser audio stuff so you don't have to.

I also used Gemini 3 to generate that entire page in one prompt. :-)

andupotorac 4 months ago

Added support for Playwright, so that the LLM can get a screenshot of the page the user wants to copy. This seems like a small thing, but both the work required to build it and the difference in the output are quite large.

Here are some examples comparing it against other tools: Same.dev, which also captures screenshots of pages, and all the others.

https://x.com/andupoto/status/1992928743925690382