Show HN: Libretto – Making AI browser automations deterministic

github.com

92 points by muchael 13 hours ago

Libretto (https://libretto.sh) is a Skill+CLI that makes it easy for your coding agent to generate deterministic browser automations and debug existing ones. The key shift is going from “give an agent a prompt at runtime and hope it figures things out” to “use coding agents to generate real scripts you can inspect, run, and debug”.

Here’s a demo: https://www.youtube.com/watch?v=0cDpIntmHAM. Docs start at https://libretto.sh/docs/get-started/introduction.

We spent a year building and maintaining browser automations for EHR and payer portal integrations at our healthcare startup. Building these automations and debugging failed ones was incredibly time-consuming.

There are lots of tools that use runtime AI, like Browseruse and Stagehand, which we tried, but: (1) They rely on custom DOM parsing that's unreliable on older and complicated websites (including all of healthcare). Using a website’s internal network calls is faster and more reliable when possible. (2) They can be expensive since they rely on lots of AI calls, and for workflows with complicated logic you can’t always rely on caching actions to make sure it will work. (3) They decide what to do at runtime, so you can’t inspect ahead of time what the agent is going to do. You kind of hope you prompted it correctly to do the right thing, but legacy workflows are often unintuitive and inconsistent across sites, so you can’t trust an agent to just figure it out at runtime. (4) They don’t really help you generate new automations or debug automation failures.

We wanted a way to reliably generate and maintain browser automations in messy, high-stakes environments, without relying on fragile runtime agents.

Libretto is different because instead of runtime agents it uses “development-time AI”: scripts are generated ahead of time as actual code you can read and control, not opaque agent behavior at runtime. Instead of a black box, you own the code and can inspect, modify, version, and debug everything.

Rather than relying on runtime DOM parsing, Libretto takes a hybrid approach combining Playwright UI automation with direct network/API requests within the browser session for better reliability and bot detection evasion.
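
As a rough illustration of the hybrid idea (a sketch under stated assumptions, not Libretto's actual code), a generated script could try a direct request first and fall back to UI automation if that fails. `networkFirst` and `StepResult` are hypothetical names, and the endpoint and test id in the wiring comment are made up:

```typescript
// Hypothetical result wrapper; records which path produced the value.
type StepResult<T> = { value: T; path: "network" | "ui" };

// Try the direct network call first; if it throws, fall back to UI automation.
async function networkFirst<T>(
  viaNetwork: () => Promise<T>,
  viaUi: () => Promise<T>,
): Promise<StepResult<T>> {
  try {
    return { value: await viaNetwork(), path: "network" };
  } catch {
    return { value: await viaUi(), path: "ui" };
  }
}

// Possible wiring with Playwright (assumes a `page` object; the endpoint
// "/api/claims/123" and the test id "claim-status" are invented for illustration):
//
//   const result = await networkFirst(
//     () => page.evaluate(async () => {
//       // Runs inside the page, so it reuses the session's cookies.
//       const res = await fetch("/api/claims/123", { credentials: "include" });
//       if (!res.ok) throw new Error(`HTTP ${res.status}`);
//       return res.json();
//     }),
//     () => page.getByTestId("claim-status").innerText(),
//   );
```

Returning the path alongside the value makes it easy to log how often a script falls back to the UI, which is a hint that the network call needs regenerating.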

It records manual user actions to help agents generate and update scripts, supports step-through debugging, has an optional read-only mode to prevent agents from accidentally submitting or modifying data, and generates code that follows the abstractions and conventions you already have in your repo.
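
To make the read-only idea concrete, here is one way such a guard could be approximated in a plain Playwright script: block any request whose HTTP method can change server state. This is an assumption about how a guard might look, not how Libretto implements its read-only mode, and `isReadOnlySafe` is a hypothetical helper. Filtering by method is only a heuristic, since some sites mutate data on GET and some read-only searches use POST.

```typescript
// Hypothetical helper (not part of Libretto or Playwright).
// GET, HEAD, and OPTIONS are among the methods RFC 9110 defines as "safe".
const SAFE_METHODS = new Set(["GET", "HEAD", "OPTIONS"]);

// Returns true when a request with this method is allowed in read-only mode.
function isReadOnlySafe(method: string): boolean {
  return SAFE_METHODS.has(method.toUpperCase());
}

// Possible wiring with Playwright's request interception (assumes a `page` object):
//
//   await page.route("**/*", (route) =>
//     isReadOnlySafe(route.request().method()) ? route.continue() : route.abort(),
//   );
```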

Would love to hear how others are building and maintaining browser automations in practice, and any feedback on the approach we’ve taken here.

anthuswilliams 8 hours ago

I literally _just_ put up an announcement on our internal Slack of a tool I had spent a few weeks trying to get right. Strange to post the announcement and, literally the same day, see a better, publicly available toolkit to enable that very workflow!

I'm also using Playwright, to automate a platform that has a maze of iframes, referer links, etc. Hopefully I can replace the internals with a script I get from this project.

  • muchael 8 hours ago

    Haha that's wild, let me know if you run into any issues with it!

z3ugma 10 hours ago

Love it! Do you have a BAA with Claude though? Otherwise, your demo is likely exposing PHI to 3rd parties and exposing you to risk related to HIPAA

  • muchael 10 hours ago

    It's a good callout. We have a BAA + ZDR with Anthropic and OpenAI, and if you want to use libretto for healthcare use cases having a BAA is essential. Was using Codex in the demo, and we've seen that both Claude and Codex work pretty well

  • tanishqkanc 2 hours ago

    just adding to michael's reply - we took care to make sure no PHI was exposed in our demo video as well.

boriskurikhin 8 hours ago

I like the pre-gen approach! Curious how it responds to JS that changes how components are rendered at run-time.

  • muchael 7 hours ago

    There are a couple ways to handle JS components rendered at runtime:

    - Libretto prefers network requests over DOM interaction when possible, so this will circumvent a lot of complex JS rendering issues

    - When you do need the DOM, playwright can handle a lot of the complexity out of the box: playwright will re-query the live DOM at action time and automatically wait for elements to populate. Libretto is also set up to pick selectors like data-testid, aria-label, role, id over class names or positional stuff that's likely to be dynamic.

    - At the end of the day the files still live as code so you could always just throw a browser agent at it to handle a part of a workflow if nothing else works
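
    The selector preference in the second bullet can be sketched as a small ranking function. This is an illustration only, not Libretto's code: `SelectorCandidates` and `pickSelector` are hypothetical names, and the sketch assumes attribute values contain no quotes and that `className` is a single class name.

```typescript
// Hypothetical description of the attributes found on one element;
// not a Libretto or Playwright type.
interface SelectorCandidates {
  testId?: string;
  ariaLabel?: string;
  role?: string;
  id?: string;
  className?: string;
}

// Returns a CSS selector built from the most stable attribute available,
// in the order: data-testid, aria-label, role, id, and only then class.
function pickSelector(c: SelectorCandidates): string | undefined {
  if (c.testId) return `[data-testid="${c.testId}"]`;
  if (c.ariaLabel) return `[aria-label="${c.ariaLabel}"]`;
  if (c.role) return `[role="${c.role}"]`;
  if (c.id) return `[id="${c.id}"]`;
  if (c.className) return `.${c.className}`;
  return undefined; // nothing usable; the caller needs another strategy
}
```

    For example, `pickSelector({ id: "submit", className: "btn-primary" })` returns `[id="submit"]`, so the class name is only used when nothing more stable exists.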

heyitsaamir 9 hours ago

I built something very similar for my company internally. The idea was that the maintenance of the code is on the agent and the code is purely an optimization. If it breaks, the agent runs it iteratively and fixes the code for next time. Happy to replace my tool with this and see how it does!

  • muchael 9 hours ago

    Super cool! Please let me know how it goes. Since agents are so good at writing code, we think letting the agent rewrite/test the code on failure is better than just using a prompt at runtime

yehia2amer 5 hours ago
  • tanishqkanc 2 hours ago

    we started using stagehand initially! But it doesn't follow the same model of pre-generating deterministic code. Your code is meant to look like this:

    // Let AI click
    await stagehand.act("click on the comments link for the top story");

    the issue with this is that there's now runtime non-determinism. We move the AI work to dev-time: AI explores and crawls the website first, and generates a deterministic, legible script.

    Tangentially, Stagehand's model may have worked 2 years ago when humans still wrote the code, but it's no longer the case. We want to empower agents to do the heavy lifting of building a browser automation for us but reap the benefits of running deterministic, fast, cheap, straightforward code.

etwigg 11 hours ago

Thanks for this! We have clear answers for things that are 100% and 0% automated, but it’s always that 80%-99% automated slice where the frontier is, great idea.

  • canarias_mate 10 hours ago

    script maintenance is exactly where that middle slice bites - the app keeps evolving and the scripts lag behind. we took the angle of having the agent re-explore from scratch each run with autonoma (https://github.com/autonoma-ai/autonoma) for e2e qa, no maintained scripts, adapts naturally - different goal than libretto but same core intuition

messh 12 hours ago

how does it differ from playwright-cli?

  • muchael 11 hours ago

    At its core, libretto generates, validates, and helps with debugging RPA scripts. As far as I understand tools like playwright CLI are more focused on letting your agent use playwright to perform one-off automations.

    The implementation is also pretty different:

    - libretto gives your agent a single exec tool (instead of different tools for each action) so it can write arbitrary playwright/javascript and is more context efficient

    - Also we gave libretto instructions on bot detection avoidance so that it will prefer using network requests for automation (something that other tools don’t support), but will fall back to playwright if it identifies network requests as too risky

  • tanishqkanc 2 hours ago

    playwright-cli is very simple and meant for humans - it basically generates a first draft of a script, and was originally meant for writing e2e tests. You need to do a lot of post-processing on it to get it to be a reliable automation.

    libretto gives a similar ability for agents for building scripts but:

    - agents automatically run, debug, and test the integrations they write

    - they have a much better understanding of the semantics of the actions you take (vs. playwright auto-assuming based on where you clicked)

    - they can parse network requests and use those to make direct API calls instead

    there's fundamentally a mismatch where playwright-cli is for building e2e test scripts for your own app but libretto is for building robust web automations

seagull 12 hours ago

I've wanted something like this for ages, excited to try this out!

  • tanishqkanc 2 hours ago

    glad to hear! Please reach out on Discord or Github issues if you run into issues!

daveguy 10 hours ago

What is the license?

Edit: nevermind. I see from the website it is MIT. Probably should add a COPYING.md or LICENSE.md to the repository itself.

  • tanishqkanc 2 hours ago

    Sorry! Yes, MIT. Forgot to lift it up when I converted to a monorepo, but it's in packages/libretto

gbibas 11 hours ago

Cool. Thank you for sharing. While AI tools are extremely powerful, packages like this help create some good standards and stepping stones for connectivity that the models haven’t gotten around to yet. Thanks again.

  • tanishqkanc 2 hours ago

    Ofc! Please try it out. Stop by in the Discord or Github Issues if you have any questions!

arpadav 11 hours ago

this looks awesome

  • tanishqkanc 2 hours ago

    Thanks! Please try it out. Stop by in the Discord or Github Issues if you have any questions!

devstatic 12 hours ago

this is interesting

  • tanishqkanc 2 hours ago

    Thanks! Please try it out. Stop by in the Discord or Github Issues if you have any questions!

surgical_fire 11 hours ago

[flagged]

  • muchael 10 hours ago

    Lol sorry for the misleading click. We named it libretto after the term in theater, inspired by Playwright. No retro gaming here, just browser automation!

  • dang 8 hours ago

    Ok, but please don't post unsubstantive comments to HN.

alexbike 10 hours ago

[flagged]

  • muchael 9 hours ago

    Right now libretto only captures HTTP requests, which the coding agent can use to determine how to perform the automation.

    For more complex cases where libretto can't validate that the network approach would produce the right data (like sites that rely on WebSockets or heavy client-side logic) it falls back to using the DOM with playwright