Show HN: Morph – Videos of AI testing your PR, embedded in GitHub

34 points by bhaktatejas922 a day ago

I review PRs all day and I've basically stopped reading them. Someone opens a 2000-line PR, I scroll, see it's mostly AI-generated React components, leave a comment, merge. I felt bad about it until I realized everyone on my team does the same thing.

The problem is diffs are the wrong format. A PR might change how three buttons behave. Staring at green and red lines to understand that is crazy.

The core reason we built this is that we feel products today are built with assumptions from the past. 100x the code with the same review systems needs 100x the human attention. Human attention cannot scale to fit that need, so we built something different. Humans are provably more engaged with video content than text.

So we built and RL-trained an agent that watches your preview deployment when you open a PR, clicks around the stuff that changed, and posts a video in the PR itself.

Hardest part was figuring out where changed code actually lives in the running app. A diff could say Button.tsx line 47 changed, but that doesn't tell you how to find that button. We walk React's Fiber tree, where each node maps back to source files, so we can trace changes to bounding boxes for the DOM elements. We then reward the model for showing and interacting within those boxes.
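A rough sketch of that tracing step, for illustration only. `FiberLike`, `ChangedRange`, and `changedRects` are hypothetical names, and the node shape is a simplified stand-in: real Fiber internals are undocumented and differ between React versions, so this is not the actual implementation.

```typescript
// Hypothetical, simplified stand-in for a React Fiber node.
interface Rect { x: number; y: number; width: number; height: number }

interface FiberLike {
  // Where the element was authored (assumed to be available, e.g. in a dev build).
  source?: { fileName: string; lineNumber: number };
  // Bounding box of the DOM element this node rendered, if it rendered one.
  rect?: Rect;
  child?: FiberLike;   // first child
  sibling?: FiberLike; // next sibling
}

// One changed region from the diff (assumed shape).
interface ChangedRange { fileName: string; startLine: number; endLine: number }

function isChanged(fiber: FiberLike, changes: ChangedRange[]): boolean {
  const src = fiber.source;
  if (!src) return false;
  return changes.some(
    (c) =>
      c.fileName === src.fileName &&
      src.lineNumber >= c.startLine &&
      src.lineNumber <= c.endLine,
  );
}

// Pre-order walk over child/sibling links, collecting the bounding boxes
// of nodes whose source location falls inside a changed range.
function changedRects(root: FiberLike, changes: ChangedRange[]): Rect[] {
  const rects: Rect[] = [];
  const stack: FiberLike[] = [root];
  while (stack.length > 0) {
    const fiber = stack.pop()!;
    if (fiber.rect && isChanged(fiber, changes)) rects.push(fiber.rect);
    // Push the sibling first so the child is visited first.
    if (fiber.sibling) stack.push(fiber.sibling);
    if (fiber.child) stack.push(fiber.child);
  }
  return rects;
}
```

Called with the tree root and the diff's changed ranges (e.g. `Button.tsx`, lines 47 to 47), it returns the boxes the agent should try to show and interact with.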

This obviously only works with React, so we'll have to get more clever when generalizing to all languages.

We trained an RL agent to interact with those components. Simple reward: points for getting modified stuff into the viewport, double for clicking/typing. About 30% of what it does is weird (partial form submits, hitting escape mid-modal) because real users do that stuff and polite AI models won't test it on their own.
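To make the reward shape concrete, here is a minimal sketch. The constants, the `Action` type, and `stepReward` are assumptions for illustration: the post only says "points" for visibility and "double" for interaction, so the base value of 1 is made up, and the double is treated here as replacing the base reward rather than stacking on it.

```typescript
// All rectangles and points are in the same viewport coordinate space.
interface Rect { x: number; y: number; width: number; height: number }

type Action =
  | { kind: "scroll" }
  | { kind: "click" | "type"; x: number; y: number };

const VIEW_REWARD = 1;                   // assumed base value
const INTERACT_REWARD = 2 * VIEW_REWARD; // "double for clicking/typing"

function intersects(a: Rect, b: Rect): boolean {
  return (
    a.x < b.x + b.width && b.x < a.x + a.width &&
    a.y < b.y + b.height && b.y < a.y + a.height
  );
}

function containsPoint(r: Rect, x: number, y: number): boolean {
  return x >= r.x && x < r.x + r.width && y >= r.y && y < r.y + r.height;
}

// Reward for a single step, given the boxes of the modified elements.
function stepReward(action: Action, viewport: Rect, changed: Rect[]): number {
  if (action.kind !== "scroll") {
    const { x, y } = action;
    // Clicking or typing inside a modified element earns the doubled reward.
    if (changed.some((r) => containsPoint(r, x, y))) return INTERACT_REWARD;
  }
  // Otherwise, having any modified element on screen earns the base reward.
  if (changed.some((r) => intersects(r, viewport))) return VIEW_REWARD;
  return 0;
}
```

A click that lands outside every modified box still earns the base reward as long as a modified element is visible, which is one possible reading of the rule described above.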

This catches things unit tests miss completely: z-index bugs where something renders but you can't click it, scroll containers that trap you, handlers that fail silently.
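As an example of the "renders but you can't click it" case, a hit test at the element's center is one common way to detect it. This is a sketch, not how Morph does it: `NodeLike`, `HitTest`, and `isCovered` are hypothetical names, and the hit test is passed in so the logic runs outside a browser. In a page you could pass `(x, y) => document.elementFromPoint(x, y)`.

```typescript
interface Box { left: number; top: number; width: number; height: number }

// Minimal subset of the DOM Element interface this check relies on.
interface NodeLike {
  getBoundingClientRect(): Box;
  contains(other: NodeLike | null): boolean;
}

// Returns the topmost node at a viewport point, or null if there is none.
type HitTest = (x: number, y: number) => NodeLike | null;

// True when whatever sits on top at the target's center is neither the target
// nor one of its descendants, i.e. something else would receive the click.
function isCovered(target: NodeLike, hitTest: HitTest): boolean {
  const box = target.getBoundingClientRect();
  // Zero-sized elements are not rendered; that is a different failure.
  if (box.width === 0 || box.height === 0) return false;
  const hit = hitTest(box.left + box.width / 2, box.top + box.height / 2);
  // A null hit means the point is outside the viewport; nothing to conclude.
  if (hit === null) return false;
  return !target.contains(hit);
}
```

A unit test that only asserts the button is in the rendered output passes either way, which is why this class of bug slips through.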

What's janky right now: feature flags, storing different user states, and anything that requires context not provided.

Free to try: https://morphllm.com/dashboard/integrations/github

Demo: https://www.youtube.com/watch?v=Tc66RMA0nCY

cmeacham98 20 hours ago

While the product sounds mildly interesting, I see it as a major red flag that you think it's ok for either a submitter or reviewer to not even read the code they are working with and ship thousand-line diffs of LLM-generated code.

That's the lack of professionalism I give my random PoC personal projects where the only user I can break is myself - at work I am reading every line of every PR I submit or review, even if I used an LLM to assist writing the code.

dandigangi 18 hours ago

I understand what you mean here but I'd consider this something that is configurable. I'd 100% have this type of product require manual intervention/actions by default. Forcibly require someone to turn it off w/ some very explicit conversations. Maybe even provide some educational content about best practices.

DhruvBhatia0 19 hours ago

Most of my previous companies required attaching a loom/screen recording of visual features because the code really only communicates the logic. I've found that even for the PRs where you want to be super thorough and read every single line of code, watching the PR get tested brings you up to speed a lot faster.

dandigangi 18 hours ago

Couple jobs had this too and I really liked it. For the QA team we had it as a hard requirement. Optional depending on context for the devs.

bhaktatejas922 19 hours ago

unfortunately, this is the reality we're in today. I see it at every scale. this is a swing at helping.

toastal 2 hours ago

Only works on a proprietary forge. Wouldn’t it be better to build tooling to get OFF that godforsaken platform?

StrangeSound 21 hours ago

Forgive me if this is a stupid question, but why does the introduction of AI mean that you now allow 2000 line PRs?

bhaktatejas922 19 hours ago

it's more of a comment on human nature. when we did monitoring at scale, reviewers averaged around 150 lines of code actually reviewed. that's the normal, day-to-day human limit. human attention is woefully unable to scale to meet the demands of how much AI slop is getting pushed, and this is a swing at helping solve this by valuing human attention spans. This isn't going to solve it all at once but we're working on it!

purplecats 20 hours ago

seems trivial for antigravity to do

blemasle 20 hours ago

How a Show HN that leads to a sign-up, without any information on what's behind it, manages to get to the front page is beyond understanding...

bhaktatejas922 19 hours ago

whoops, this was a mistake I just made. fixed it, it shouldn't require any auth
