I am excited to see another project that works as a stand-alone tool without any kind of server and without being an IDE plugin. I am highly curious about how you actually approach making modifications, so I am mostly interested in the transcript you posted. https://pastebin.com/FcGKdPbU
1. I notice that it seems to look at all files in examples, then opts to scan examples/react/start-basic-auth/src/main.tsx, then says "I apologize for the confusion" and seems to move on to something else (lines 13-22). What happened here? This happens again on line 77.
2. It looks (lines 25 to 37) like it has difficulty actually working with source files, due to file size. This seems to happen several more times in the h3 section (line 85). It might be worth building something like Aider's repomap to handle larger files.
It does seem to get to a useful conclusion without a lot of looping around, so overall it looks promising!
Thanks for the feedback!
> I notice that it seems to look at all files in examples, then opts to scan examples/react/start-basic-auth/src/main.tsx, then says "I apologize for the confusion" and seems to move on to something else (lines 13-22). What happened here? This happens again on line 77.
You mean this response? "Looking at "examples/react/start-basic-auth/src/main.tsx", Checking main file for useSession usage"
I think there was a miscommunication between the agent and the sub-agent; the proper path should have had "app", not "src". I am working on better sub-agent prompting, as there is sometimes some information loss, which then leads to wrong assumptions.
On line 77 it made the same wrong assumption about "src" instead of "app."
> It looks (lines 25 to 37) like it has difficulty actually working with source files, due to file size. This seems to happen several more times in the h3 section (line 85).
I did add a 10K limit on the number of characters of a file that will be read at a time, and I tried to tell it that it was only a partial read. The issue is that a single file can sometimes be huge, and I didn't want to blow the context window.
https://github.com/drivecore/mycoder/blob/main/src/tools/io/...
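The cap described above might look roughly like this sketch (illustrative only; `capContent` and the notice wording are assumed names, not MyCoder's actual code, which is at the link):

```typescript
// Sketch of a capped read: return at most maxChars characters and
// label the result so the model knows it saw only part of the file.
function capContent(
  full: string,
  maxChars = 10_000
): { content: string; truncated: boolean } {
  if (full.length <= maxChars) {
    return { content: full, truncated: false };
  }
  const notice = `\n[PARTIAL READ: first ${maxChars} of ${full.length} characters]`;
  return { content: full.slice(0, maxChars) + notice, truncated: true };
}
```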
> It might be worth building something like Aider's repomap to handle larger files.
I will have a look at that idea.
While I appreciate the emergence of new AI coding tools, I've observed that many of them fail to offer significant improvements over Aider's existing capabilities (but I want to be proven wrong).
Aider has been in development for almost 2 years now. Any new tool that "offers significant improvements" will have to catch up first.
Having options is a good thing, and approaching the dev agent problem space from a different perspective will help with pushing ideas in other products as well.
There is always room for more tools. How many databases exist? Front-end frameworks? Languages? Back-end frameworks? Analytics packages?
To think that in this space there is only one solution and all others are outright failures or not worth doing is strange thinking, as that isn't normally how it works. There are usually multiple niches and success/revenue strategies.
I strongly think this is the future of software development. And thus there will be many winners here.
I will investigate Aider. I wrote this tool from idea to its current state in just four weeks, without reference to existing tools, so now I need to do that.
Yeah, you're going to find out you should have just used Aider, I'm afraid...
Aider is Python. That is annoying for me, as I like to modify things. This is TypeScript.
Wait, are you saying that, because you know TypeScript but not Python, you can't make modifications to a piece of software that is intended to develop software for you using AI?
Auto-coders, which is what I call this tech, are great but they screw up complex tasks, so you need to be able to step in when they are screwing one up. I view it as a team of junior devs.
This will probably change at some point, but at this point they require supervision and corrections.
If you do not actually know what you are doing, these things can create a mess. But that is just the next challenge to overcome and I suspect we'll get there relatively soon.
Except that the future of LLM-assisted programming means I can also make my own implementation of Aider pretty easily. So there's going to be an explosion of software that does basically the same thing but is private or just not widely shared. Not because I don't want to share, but because starting and supporting an open source project is a PITA, and I just want to build this one little cool thing and be done with it.
I will be launching a version of this on GitHub as an app to help open source developers. So open source is also going to get a boost.
I've been using aider for a while now. It can get pretty expensive with sonnet but I guess it's no different from claude-code
Please also take a look at my omnipotent Claudine. It is good at self-modifying to develop new tools:
https://github.com/xemantic/claudine/
Pretty neat how compact it is. I'm trying to poke around to see what the capabilities are, but, more importantly, I'm interested in the restrictions. In `ExecuteShellCommand` it seems that it's basically unrestricted. I think I want at least a naive safeguard like a whitelist of directories that it can act on.
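A naive directory whitelist along those lines could be sketched like this (assumed names and behavior, not part of the project's actual API):

```typescript
import { resolve, sep } from "node:path";

// Reject any path that does not resolve to a location inside one of the
// whitelisted directories. Resolving first defeats "../" escapes; the
// trailing-separator check stops prefix tricks like /home/project-evil.
function isPathAllowed(target: string, whitelist: string[]): boolean {
  const resolved = resolve(target);
  return whitelist.some((dir) => {
    const root = resolve(dir);
    return resolved === root || resolved.startsWith(root + sep);
  });
}
```

Note that this only gates which paths the tool touches directly; once an arbitrary shell command runs, it can still reach anywhere, so a whitelist like this is a first-line safeguard rather than a sandbox.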
I will have a look! Thx!
What's the difference between Claude Code and Aider?
Aider is Python. Claude Code is closed source; this is open source and TypeScript.
Are you leveraging caching? (It didn't seem like it from initial investigation, so figured I'd ask)
Recent sessions with Claude Code come out to a few $ each and like 90% of tokens are cached reads. Which would be hundreds of $ without.
I do not use caching yet. I find an average run costs less than a dollar. I think the most expensive so far may have been 2 dollars or so.
I had similar costs. With caching it went down to 20-40 cents.
Well worth it, if you are using this regularly.
Wait, running this once costs "less than a dollar" but up to 2 dollars?
So if I used this all day it could cost me 100s of dollars? How is that a good deal when Claude costs 20 dollars per month?
If using Anthropic professionally with, e.g., Cline, it's easy to spend $1500 a week.
When I use it heavily for a work day it costs around $25 a day.
It will write whole features and debug things and write tests and docs. It is that valuable.
I’ve started to run two sessions at once to be more productive. I will be moving it to the cloud soon so I can run dozens of sessions at the same time.
Tech like this replaces human developers for the most part so yes it is worth it. $25/day is cheaper than hiring another dev or two.
Development teams are coming to an end right now.
Claude already does all those things.
Also, Claude is already so good at coding, that anyone could clone your "MyCoder" in less than a day. Which makes sense, given you yourself made it using Claude.
Basically, any dev with a small amount of experience can make their own coding software now, without much issue, because the AI tools are so good.
So why should anyone use "MyCoder"?
Can you say more about the type of work? How big are those repositories? Thanks
I had it fix a bug recently in the medium-sized TanStack monorepo. Transcript here: https://news.ycombinator.com/item?id=43178222
I saw that; I was more interested in other examples you might have, since you said that you have started to run two sessions at once to be more productive. And this bug doesn't look like that kind of workload.
The issue is that I am working on non-public projects, so it is harder to share. But I have a monorepo that has 8 packages in the packages/* folder, all managed by pnpm. I had two sessions going locally in two separate terminal windows in Cursor. One was modifying 2 packages to add a feature, and the other session was modifying another package that was orthogonal to the first two.
This isn't really professional - it felt wrong - one shouldn't be adding two separate features at once, but because they were isolated I could check them into Git separately as two separate commits.
It was as if two coders were working on the same checked out code.
I need to move this into a cloud service and that is coming soon.
Ping me at ben@benhouston3d.com and I can show you some live demos. It works incredibly well.
> This isn't really professional - it felt wrong
I don't see anything wrong with that tbh. If it works that's great since this means that you're bound by the compute power you have and not the amount of devs available at that moment. I guess this is what the wet dream is about.
Working on two features at once was not weird pre-LLMs either. Though it's a scheduling problem rather than parallelism: CI tests are running for feature A while you work on feature B. If feature A fails, at some point you context-switch back and work on it more, etc.
Of course, the business incentive is to have people complete as much as they can per unit of time. I don't think I've ever had the luck to work in an environment where anyone on the team wasn't working on multiple things at the same time. Truth be told, these weren't traditional scrum or any other bs-driven environments; more like high-performance teams where pretty much everyone was quite exquisite.
I really should have just checked out the code twice locally, i.e., once per auto-coder.
A few dollars??? What are those sessions doing? How do you define a "session"?
How can it be worth it in any way except for FAANG engineers in the US?
I spent about five hours using Claude Code heavily yesterday to upgrade and enhance a four year old React web app. This app is widely used to reference anatomical nomenclature.
I was able to internationalize it for 45 major languages across the world (still subject to human testing). That allows it to be accessible for 85-90% of the world's population.
It cost me about $50. It saved me months of work on a "labor of love" project and allowed me to add lots of quality of life features in a single day that I just never would have gotten to otherwise.
That's an enormous value for me.
I'm building a startup. Saving hours for singular dollars is incredibly valuable.
A session in the parent comment means building a set of changes that would usually take a few hours, but instead takes about 30 minutes of using Claude Code (reviewing, prompting, etc.) and 30 minutes of cleaning up.
A few hours also being what it would have taken using some combination of building it entirely myself and/or copy-pasting with Claude, where it would still save time.
On average I've been spending $25 a day on Claude credits once this was up and fully running. That is cheaper than hiring another developer in just about any country and it greatly boosts my productivity.
Isn't it much cheaper to just use CoPilot with GPT-o?
If you use threads / chains of messages in any form, I strongly encourage you to checkout caching. The cost savings are crazy. ($0.05 / cache read 1M tokens instead of $3 / 1M input tokens)
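To make the savings concrete, here is a back-of-envelope calculation using the rates quoted above ($3 per 1M input tokens vs $0.05 per 1M cached-read tokens); the token counts and cache-hit fraction are made-up illustrations:

```typescript
// Rough session-cost model: a fraction of the input tokens are served
// from the prompt cache at the much cheaper cached-read rate.
const INPUT_RATE = 3.0 / 1_000_000;       // $ per fresh input token
const CACHE_READ_RATE = 0.05 / 1_000_000; // $ per cached-read token

function sessionCost(totalInputTokens: number, cachedFraction: number): number {
  const cached = totalInputTokens * cachedFraction;
  const fresh = totalInputTokens - cached;
  return fresh * INPUT_RATE + cached * CACHE_READ_RATE;
}

// A session that re-sends 10M tokens of context, 90% of them cache hits:
console.log(sessionCost(10_000_000, 0.9)); // ≈ $3.45 instead of $30 uncached
```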
Can you put a screenshot of what it is going to look like into the repo's readme?
I've pushed a quick image here: https://github.com/drivecore/mycoder/blob/main/docs/Screensh...
Here is a run that debugged an issue I ran into with TanStack Start: https://pastebin.com/FcGKdPbU
It solved this issue that I reported autonomously: https://github.com/TanStack/router/issues/3492
Tangent: some of Expensify's GitHub issues have a price tied to them, e.g. "[$250] issue name"... it would be funny for bots to start completing them.
Oh crap, it wants npm. No way in hell.
It actually works with bun, pnpm, yarn, etc - any standard Node package manager.
I use pnpm personally, and that is evident in the repo setup itself, but npm is sort of the standard, so I put that in the docs rather than mentioning a long list of alternatives.
GP is likely referring to this concern, "The Great npm Garbage Patch":
https://news.ycombinator.com/item?id=41178258