pavlov a year ago

Somehow these GitHub-trained ML code assistants sadden me.

My idea of enjoyable high-quality programming isn’t to dip a spoon into an ocean of soup made of other people’s random design decisions and bugs accumulated over fifteen years, hoping to get a spoonful without hidden crunchy insect bits.

I know the soup is nutritious and healthy 98% of the time, and eating it saves so much time compared to preparing a filet mignon myself. But it’s still brown sludge.

  • credit_guy a year ago

    Take a look at the average faces of women across different countries [1]. They are all strikingly beautiful.

    By averaging, a lot of imperfections get diluted away.

    Like in Anna Karenina: "happy families are all alike, unhappy ones are each unhappy in their own way". The defects are idiosyncratic; the commonalities are good.

    [1] https://fstoppers.com/portraits/average-faces-women-around-w...

    • kristopolous a year ago

      That's unrelated.

      Multi-sourced accumulated unmaintained amateur software without clear provenance or ownership is more like creating a feature I'll call "insta-legacy": now you're responsible for a bunch of code you didn't write that by definition nobody you have access to understands.

      This is absurd.

      It's not going to stop people from doing it. The industry is clinically insane.

      It allows people who do bad work to do more of it quickly. Before, they had to manually shovel garbage into projects; now they have a dump truck.

      You know what? It might be fine. Maybe we're going to have a world of fast food programming where minimum wage coders pump out trash and there's going to be Michelin star programmers where you go to for the real stuff.

      If that's the case, we'll have to somehow educate the public on the difference so they don't think it's the same thing. McDonald's and The French Laundry are both successful restaurants. That world is possible in programming as well.

      It might already be like that. The shady contracting firms with cheap rates that do trash work are probably already using these things.

      • visarga a year ago

        In their paper "Evolution through Large Models", CarperAI shows how you can use diffs to evolve code, while running the code against a test or an environment for validation.

        https://arxiv.org/abs/2206.08896

        The idea is: LLMs know how to modify code in semantically useful ways. Evolutionary algorithms are great at search, but don't learn new mutations by themselves. So combine them together to generate new data and retrain the models.

        So the old system is: scrape human code and train on it. The new system is: generate code, keep the good parts, and retrain. It only costs electricity and is open-ended.
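
        A rough sketch of that loop (the helper functions here are hypothetical stand-ins, not the exact ELM pipeline, which pairs a diff model with quality-diversity search):

          # Hypothetical skeleton of an ELM-style generate/validate/retrain loop.
          # propose_mutation() stands in for an LLM that rewrites code (e.g. as a diff);
          # run_tests() stands in for the test suite or environment used for validation.
          import random

          def evolve(seed_programs, propose_mutation, run_tests, steps=1000):
              population = list(seed_programs)
              accepted = []                          # validated samples to retrain on later
              for _ in range(steps):
                  parent = random.choice(population)
                  child = propose_mutation(parent)   # LLM suggests a semantic edit
                  if run_tests(child):               # keep only code that still passes
                      population.append(child)
                      accepted.append((parent, child))
              return population, accepted

          # The (parent, child) pairs in `accepted` become new training data,
          # closing the loop: generate, filter by tests, retrain, repeat.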

        • kristopolous a year ago

          Thanks. There's no way I can respond to these 58 pages of academic literature today. I'll read it though. Appreciated.

      • BoiledCabbage a year ago

        Someone might've also said, "Don't bother to train a text model on the random musings of the internet. You're gonna get a bunch of cheap crap as output. People going on unhinged rants, and lots of descriptions of people's cats. You should use a collection of classic works instead." And they would've been wrong.

        > Maybe we're going to have a world of fast food programming where minimum wage coders pump out trash and there's going to be Michelin star programmers where you go to for the real stuff.

        It sounds like your concern isn't that it's going to do a poor job, it's that it's actually going to do a good job and you will no longer be able to differentiate your work.

        If what you're delivering is so much more valuable, then there is no threat and no concern to be had. I believe the concern is that this actually will solve people's problems for much cheaper, or almost free, and as a result people will use it. And use it tons. And that's a legit fear to have, but I don't think it should be wrapped up in calling its output the McDonald's of code.

        The better analogy in my mind is a collection of wine connoisseurs seeing the rise of something like 2 buck chuck and trashing it for not being snobby. Deriding it for not having "grapefruit mouthfeel" or something. When in fact most people just want an easy-to-drink wine that goes with what they're having for dinner - and none of the extra.

        If this code doesn't help people out, then people won't use it. If it does, and the "French Laundry" of code is important only to the chefs working there and not anyone else, then we'll find that out pretty soon.

        The real issue is potentially getting "French Laundry" quality (or a step just shy) for an everyday meal price.

        • kristopolous a year ago

          > It sounds like your concern isn't that it's going to do a poor job, it's that it's actually going to do a good job and you will no longer be able to differentiate your work.

          No, it's not about me. I won the startup lottery. This was called "RAD" in the 90s, OOP in the 80s, and was the promise of "structured programming" in the 70s. The effort to deprofessionalize software development goes back decades.

          I care about the craft and the well-being of my fellow engineers. Requiring less knowledge is a mixed bag. Sometimes it's fine, such as compilers handling your C code, and other times it's a problem, such as Word handling your HTML code. It's best when some tooling sophistication is still exposed.

          > If this code doesn't help people out then people won't use it

          Incorrect! Human behavior and planning are aspirational and emotional, not rational. Choices are made based on narrative appeal, and mistakes take years to unravel.

          Think of all the once hot frameworks that you'd be simply crazy not to love that are now unmitigated disasters to maintain and lead to mass abandonment and rewrites. People do this stuff, it's how they decide things. Not everybody, but enough to fuck things up for the rest of us.

      • BHSPitMonkey a year ago

        > "insta-legacy": now you're responsible for a bunch of code you didn't write that by definition nobody you have access to understands.

        Also known as:

        • Getting hired at a software company that already exists

        • Having co-workers leave the project

        • Importing frameworks or libraries you didn't write

        • kristopolous a year ago

          No, yes, no.

          (1) and (3) are different. In (3) you have access to the people and documentation, things are versioned, bugs are fixed, and there are forums of people using identical software.

          In (1) the people are still there; you can open tickets against them, go and talk to them, etc.

          (2) is correct and that's not a good thing. It's snowflake code where you can't do the other things.

          The point is you're producing this worst case scenario that good companies try to avoid at great cost, instantly.

      • ctoth a year ago

        Personally I'm not sure I would like to work or interact with an industry which I considered "clinically insane." Have you considered a change of career? You sound like you're getting pretty burnt out.

        • kristopolous a year ago

          I'm not burned out. I'm passionate about quality, want things to be better, and know I'll have to fight like hell to push things in that direction.

    • RjQoLCOSwiIKfpm a year ago

      Failure is an extremely common and accepted thing in biological systems - your offspring may just die if you have incompatible genes.

      Software on the other hand is a logical environment with clear, logical requirements. It ought to work, not just fall apart randomly.

      There is no guarantee that sticking the average of one piece of software into a completely different one will satisfy the logical requirements by any means whatsoever.

      • echelon a year ago

        Just wait until we build systems that don't adhere to rigid logic and that can tolerate fuzziness and a degree of imperfection.

        I can see us switching from programming languages to some new type of logical construct that AI excels at.

        • RjQoLCOSwiIKfpm a year ago

          We already have these.

          They're called humans.

          We built computers precisely to overcome the human limitations of fuzziness and imperfection.

          Now apparently we think putting these limitations back into computers is a good thing.

          • echelon a year ago

            Humans cost money and don't do what you want.

            The goal of all of this is to get humans out of every single process except at the very highest level [1].

            You don't farm and hunt your food. You don't make and hand wash your clothes. Why the heck would a business want a person to turn their requirements into repeatable execution units? That person won't be required for much longer.

            This entire career only exists as a stepping stone. It fills a business need that can't currently be done better and cheaper. What we see today is not how things will always be.

            [1] (At some point that too might disappear.)

            • ckolkey a year ago

              But that's... terrible. Think of all the people who are entirely incapable of doing "high level" tasks, and are very content and well suited to "process" jobs. I think it's a mistake to see the devaluing of humanity as a goal.

              • echelon a year ago

                I'm only describing one of many lenses through which to view this. There's also the lens that this turns everyone on the planet into a creative genius and that large studios crumble into tiny sole proprietorships. That part is overwhelmingly exciting.

                The directionality of this is not set by individuals, but by all of us participating in the market economy. It's happening, and all we can do is prepare and find out how we fit into the new paradigm. The Luddites had to, and so shall we.

            • pdimitar a year ago

              Arguably, by the time you make AI intelligent enough to deal with this, it will ask for compensation as well, so your argument might become moot.

              Maybe it's time we think about whether that "AI panacea" that many envision -- namely, having all the benefits of thinking humans with none of the drawbacks -- is even possible. I get increasingly skeptical with time.

              Mind you, I am one of the people who absolutely would create Skynet if he had the time and resources... but I am just not sure it's even possible for us in this day and age.

            • unshavedyak a year ago

              > You don't farm and hunt your food. You don't make and hand wash your clothes.

              Part of the reason I don't do these things is that I cannot make consistent clothing, consistent meals, etc. Which isn't to say that all items/food I buy _is_ consistent, but consistency is a valuable metric behind a ton of things we buy and do.

              Consistency also seems to be a thing humans are pretty bad at. At least in the capitalist model where we produce millions of Units of any one thing.

          • amelius a year ago

            Let's first get to the point where AI can even match human intelligence, ok?

            • wongarsu a year ago

              That has been a moving target ever since Deep Blue won against Kasparov in 1997. Is there any reasonable benchmark for that goal? Never mind one that people will still accept after an AI has beaten the benchmark?

              • josephg a year ago

                My benchmark is the "runaway effect". At some point we'll be able to point AIs at a python environment & tensorflow and it'll improve the algorithms we have for training itself. And at some point, start suggesting improvements (or whole new designs) for AI accelerator hardware. I might be wrong, but ChatGPT makes me think we aren't far off.

                I'm curious what this Diff Models paper does with tensorflow's source code. Can it already suggest improvements?

      • totetsu a year ago

        This is such a perfect HN comment chain.

    • lou1306 a year ago

      I'm afraid this analogy doesn't hold much water: facial features inhabit a continuum with very nice smoothness properties, whereas program behaviour changes dramatically when you perturb its source code, even minimally (this is why mutation testing is effective, for instance).

      Also, when you average you don't really kill "defects", but rather outliers. An "outlier" statement within a program is very likely to do something important, e.g. taking care of a corner case, otherwise it wouldn't be there.

    • amelius a year ago

      "You look average" will be my new pickup line :)

    • gavinray a year ago

      I like this analogy, and I'd never seen this before, thanks for sharing.

      (My experience with Copilot and ML-assisted programming has been extremely positive, I would not choose to go without it at this point)

      • djmips a year ago

        Do you think it would work for food? Or maybe that's what McDonald's is.

    • SergeAx a year ago

      In my opinion, they are not strikingly beautiful. They are really average good-looking. To be strikingly beautiful a face needs some outstanding features (thus, in fact, "strikingly").

      Also, unrelated: Leo Tolstoy had his share of problems with his wife and wrote them into Anna Karenina. In fact, it is quite the contrary: most dysfunctional families fall into several textbook scenarios, while happy families have their own distinct inner dynamics, merely looking the same on the outside.

    • agilob a year ago

      > The study also does not reveal how the participants were selected or how large the sample size actually is.

      This "study" and your argument are meaningless.

    • avgcorrection a year ago

      Wait. Facial symmetry is beautiful now? Dang it, I didn’t get the memo.

      • krapp a year ago

        Facial symmetry has always been considered beautiful. It's a sign of genetic fitness (or lack of abnormality) that's hard-wired into our primate brains.

  • pjc50 a year ago

    Given all the discussion about "supply chain security", heading in this direction is surprising. I guess it means we automate away the creative part and leave the humans to the duller work of validation. Everyone's going to become a software tester.

    • krono a year ago

        > Futurists in 1950: Automation will free mankind from meaningless tedium to focus on creative pursuits only humans can master.
        > Techbros in 2023: We coded AI to write all your books, music, and TV so you can focus on the meaningless tedium of your cubicle farm.
      
      From this popular tweet by @stealthygeek https://twitter.com/stealthygeek/status/1618997354199400449

      • ren_engineer a year ago

        Turns out math and blue collar labor are much harder to automate than "creative" tasks. Watching people who thump their chest about being progressive and supporting science turn into Luddites when they have to face the harsh reality that their skills aren't that special is honestly funny.

        Are they going to follow the luddites and call for people to storm data centers and smash the GPUs?

        • popinman322 a year ago

          Properly creative work still isn't automated. The current generation of systems can only generate content within the codomain (semantic space) they were trained on; they're interpolating and need more work to extrapolate.

          Most creative work companies need can likely be substantially automated either now or soon.

          Most creative work individuals want will take much longer to automate. Things like movies, where even humans can't robustly figure out which ideas will work, will take more human discretion.

          • wongarsu a year ago

            There is a decent argument that the current generation of AI isn't creative but merely automating the "tedious part". Midjourney can create an amazing image if I ask it to make a bunny made from multicolored glass, with volumetric lighting, octane render, 8k. But wasn't the creative act here the idea to make a bunny out of glass and shine a cool light on it, rather than the execution of that idea?

            The problem of course is that, for the most part, the current creative community (whether paid or unpaid) is much more about execution and craftsmanship than creativity. Ideas are a dime a dozen, and having exceptionally great ideas mostly matters for the top 2%. The rest is mostly shining in execution, which AI is rapidly attacking right now.

          • j0057 a year ago

            I'm already bored by AI art. At first you think, who thought of that? When? What techniques did they use? How long did it take to master them? Then you realize it's just a bunch of matrices in a GPU. On top of that it doesn't mean anything, there's no other person on the other side of the image to relate to.

            • flask_manager a year ago

              Depends. I find that the need to see more in a work is a fairly niche, insider concern. The majority of people buying art go by visuals first, then their personal response, and finally talking/status points. Relating to the artist is something art-school kids do, but they forget that few outside of their niche have the same focus or interest.

              Personally I like the way it has removed much of the need for a professional artist. I much prefer having decorations made (or generated) by myself, friends, and family; it now feels like buying art was something done to surpass the quality possible from a hobbyist, rather than an actual desire to support artistic professions.

              • alexvoda a year ago

                But the effect of this is that art will no longer be a viable profession, and as a consequence art education will no longer be viable, and even those who somehow bother to learn the skills will not get the chance to practice them. Skills take practice, and commission work with no expectation of being exquisite, just good enough, provided opportunities to practice. The only people who will be able to get an art education are those born into enough wealth to never have to work.

                Contrast this with the automation of driving, where there is a hard cap on skill. After a certain number of hours of driving you are not going to get any better at driving.

                The pivot of AI from automating tedious work to automating creative work is simply tragic. I consider it one of the major forks in the road between a future utopia and a future dystopia.

                I do not think the tech will get any further than it did for self driving cars. It will still be 80% there with the quintessential last 20% out of reach. But it has the potential to do lasting damage.

                Automating tedious jobs runs the risk of sudden large-scale unemployment if done abruptly, but this can be solved by slowly, deliberately, and visibly phasing in the tech over a few decades.

                Automating creative jobs runs the risk of creating barriers to entry and destroying the pipeline to mastery. The jobs eliminated will be the junior levels everywhere. And with no more juniors coming in eventually you will have no more seniors in any of the fields. And their job will likely not be automated.

                Think of the demographic crisis China is in, but this time just in terms of skilled workers.

                Also, all of those juniors are paying taxes part of which go to pay for pensions. Will the AIs pay taxes?

                • flask_manager a year ago

                  I see no particular societal need for people to be formally trained in art production. At least not any more than there is value in horse-riding or kendo. A small number of people able to attempt mastery professionally does nothing for the overwhelming majority of people. In fact it might be better for art to entirely leave the commercial domain, leaving nothing but amateur works and hobbyists. If an occupation dies due to having insufficient value to sustain itself against an automated approach, was it really valuable in the first place? While the idea that artists will simply be unemployed instead of finding other work is amusing, the reality is that it's more likely they will find work in some other form, one that is now many times more productive due to automation.

                  • alexvoda a year ago

                    That was not my point. I agree that maybe art should not be commercial at all. That maybe there should be only amateurs and hobbyists. And I agree that people will just find work in some other form.

                    My point was that the skill pipeline will be nuked. Amateurs and hobbyists will not have sufficient time and resources to reach the same skill level and eventually there will not be enough amateurs and hobbyists to teach others in a sustainable way and keep the craft going forward. The existence of formal training is important because it provides structure, continuity and certain knowledge is only highlighted in a formal setting. Hobbyists too benefit from networking with professionals.

                    To move away from art to a different creative field, imagine how the software ecosystem would look if there were only hobbyists and AIs (in the service of corporations creating all of the commercial software). It might be a hobbyist FLOSS utopia, but certain knowledge would simply be inaccessible. Say you are a hobbyist and have a tricky question solved by some obscure but commonly taught algorithm: who are you going to ask? The AIs have no reason to spend time on StackOverflow. If an AI can write all software, I see no particular societal need for people to be formally trained in software engineering, yet I think humanity would be worse off by having lost this knowledge.

                    AIs will be appliances not tools. Like appliances, they will serve their purpose but unlike tools they will not elevate the user in any way. Any skill, knowledge or capability an AI has will be sealed within the black box of AI. This goes against one of the defining features of the human species, the ability to transmit knowledge by encoding it.

          • ren_engineer a year ago

            Who knows how it will progress, but I can see the ability to churn out "MVP movies" where AI actors and voice actors are used and scenes are generated. There is already stuff to create sequences of images that create videos based on prompts.

            Hollywood studios are already using game engines like Unreal to create virtual sets in real-time. I could see some sort of hobbyist pipeline being created that will get a decent starting point made.

            Workflow would be like this:

            >use GPT to generate scenes and ideas

            >GPT then used to create multiple different scripts, human chooses the best for each scene

            >AI voice synthesizer used to do the dialogue

            >Stable diffusion or equivalent used to create multiple 3D models for characters, finishing touches by human

            >models then used by game engine to act out the scene, not sure how much of this can be done via AI

            >live action stuff can use deep fake technology with any random person being deep faked to look like the AI generated unique character

            I can see some crazy animated/CGI movies being produced much cheaper than traditional Hollywood style used now. We could see indie projects with the look of much bigger budget projects thanks to automation. It will level the playing field somewhat and allow people with better ideas to flourish, rather than just people with connections to get funding from studios.

            • hutzlibu a year ago

              "We could see indie projects with the look of much bigger budget projects thanks to automation"

              Or we could see creative work drown in massive numbers of half-automated generic garbage. But to be honest, most movies today seem generic, and it is already very hard to find the gold nuggets.

              So yes, there is also great potential, but I am less confident that it will level the field; it may rather make true artists stay niche.

        • avgcorrection a year ago

          A question for when you are finished reveling in their misery: what is the upside of this purported development?

        • Der_Einzige a year ago

          I wouldn't be surprised if AI researchers are targets for murder in the cyberpunk future we are flung into.

          Moreover, math is NOT hard for AI. Anytime the LLM detects a numerical math problem, it should be smart enough to go to a calculator and enter the numbers and give you the answer. This is not hard to implement and I'm sure someone has already done this.
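
          A minimal sketch of that kind of routing (assuming the prompt is plain arithmetic; real tool-use integrations are considerably more involved):

            # If the prompt parses as simple arithmetic, hand it to a calculator
            # instead of letting the language model guess at the digits.
            import ast
            import operator as op

            OPS = {ast.Add: op.add, ast.Sub: op.sub, ast.Mult: op.mul, ast.Div: op.truediv}

            def calc(node):
                if isinstance(node, ast.BinOp):
                    return OPS[type(node.op)](calc(node.left), calc(node.right))
                if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
                    return node.value
                raise ValueError("not simple arithmetic")

            def answer(prompt, llm):
                try:
                    return calc(ast.parse(prompt, mode="eval").body)   # calculator path
                except (ValueError, SyntaxError, KeyError):
                    return llm(prompt)                                 # everything else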

    • yowzadave a year ago

      Isn’t this the same pattern that other fields have followed as they gradually became more automated? A hundred years ago, a cobbler was a skilled craftsperson who combined creative problem-solving with a variety of techniques to hand-make shoes to the precise specifications of each foot. Today, shoes are made in a factory, with humans limited to watching the line to catch the occasional defect.

      • minusf a year ago

        There are still handmade shoes and suits. If I could afford them, I wouldn't buy mass-produced stuff where I have to fit into the producer's sizing chart.

        A big part of me not liking Windows and preferring the open source world was also not wanting to use the lowest-common-denominator operating system and applications.

        Handmade, handcrafted stuff will always be head and shoulders above the rest, software included.

    • franga2000 a year ago

      The main problems with software supply chain security are that developers using libraries don't read their code and that that code can be changed later by the author. Neither of those are problems with "AI"-generated code - it lives right in your source files and you have to be actively avoiding reading it to miss critical issues that you're familiar with.

      • sanderjd a year ago

        Play this out to the end: instead of having dependencies, you have a giant blob of machine written code that nobody understands. This is the same problem, just with different attack vectors. Instead of trying to get a vulnerability into a popular package, attackers will try to get them into the output from common prompts.

        In both cases the problem is the same, and hearkens back to Reflections on Trusting Trust. The total amount of code necessary to implement a useful system is far too large for anyone to fully understand and audit.

        • dinvlad a year ago

          This is so unfortunate though. It really means that repetition/reimplementation (without reuse) is really the only guaranteed way of dealing with supply chain security. Other techniques like sandboxing could be useful, but are not a panacea in this case.

          • sanderjd a year ago

            I think it's just like the physical supply chain. Everyone will pick some point on the continuum between vulnerability and reimplementation based on their individual needs.

            But I think it should be clear that "well we had a black box AI make them" is not going to be a satisfying answer for militaries trying to remove hostile powers from their electronics supply chains. No different with software.

            • dinvlad a year ago

              Yeah - I think there needs to be substantially more effort on AI safety/comprehensibility, before we progress much further. But knowing history, it seems likely we’ll sooner reach a point where blind use of AI will result in significant financial and/or human losses, and only then we’ll start applying real care in its application.

              • sanderjd a year ago

                Yeah this is basically my perspective as well. I think the long future of this could be pretty great, but I think the period between now and that future may be pretty choppy.

      • pjc50 a year ago

        > have to be actively avoiding reading it

        A question of volume, surely? It might be OK for snippets, but once you've added a million lines of AI code to your codebase, when are you going to get around to reading it?

        Like libraries, you hit a button and add thousands to millions of LOC. If it appears to work, are you going to read it?

        • sanderjd a year ago

          Reading the AI code and understanding it would be a harder task than writing it yourself, in my opinion. And that will only be more true over time if it becomes more common to have prompt writing skills than code writing skills.

          • dinvlad a year ago

            I’d even argue the next logical step would be FPGA-type programming where there isn’t really a human-readable program anymore, but just a model loaded into hardware.

            • sanderjd a year ago

              Yeah. I think AI generated code that humans are supposedly reviewing seems like an unstable equilibrium, which will end up swinging one direction or the other.

      • Jeff_Brown a year ago

        Algebraic effects could help a lot. They let you quickly discard giant swaths of pure code, or even impure code that's surely harmless because its effects are limited to, say, graphics or audio output, and focus on the operations that use sensitive things like the file store, the internet, etc.
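
        Python has no native effect system, but a rough sketch of the idea (all names here are made up) is code that yields descriptions of its effects instead of performing them, so an auditor only has to look at the handlers for the sensitive resources:

          from dataclasses import dataclass

          @dataclass
          class ReadFile:        # effect: touches the file store
              path: str

          @dataclass
          class HttpGet:         # effect: touches the network
              url: str

          def summarize_config():
              # pure logic interleaved with declared effects
              local = yield ReadFile("config.toml")
              remote = yield HttpGet("https://example.com/defaults")
              return f"{len(local)} local bytes, {len(remote)} remote bytes"

          def run(gen, handler):
              # Drive the generator, letting `handler` decide what each effect does.
              try:
                  effect = next(gen)
                  while True:
                      effect = gen.send(handler(effect))
              except StopIteration as stop:
                  return stop.value

          def audit_handler(effect):
              # Review-oriented handler: log every sensitive operation, touch nothing.
              print("would perform:", effect)
              return ""

          print(run(summarize_config(), audit_handler))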

        • macawfish a year ago

          Yes! I'm looking forward to trying these generative code tools with algebraic effects and similar.

    • DennisP a year ago

      For now at least, I think it's more likely to automate the boring parts of programming so we can stay focused on the creative stuff. It'd be as if the first result of a quick Google search always gave a concise blog post covering the exact language features and library calls we're looking for.

    • skybrian a year ago

      Well, code reviewer anyway. Might be a good idea to add “includes unit tests” in the prompt?

  • rileymat2 a year ago

    I agree with you, however, the work I see is not that.

    What I see is a person who copy and pastes crap around until it works and calls it a day. I think code assistants can and will compete with them.

  • boredemployee a year ago

    Since everyone has different goals and opinions on this, etc, many people will see it in a different way.

    I love to solve _problems_ and to help people with them, but sometimes I just hate to write code to solve them. I wish my computer could have a clear picture of the solution that is in my mind so I didn't have to write a single line of code, so I could focus on the creative part of the problem solving.

    • sanderjd a year ago

      Unfortunately, I'm skeptical that this is going to end up looking much like "my computer has a clear picture of the solution in my mind so that I don't have to write a single line of code". I fear it's a siren song that's going to drag many of us into the shoals.

      But I totally agree with you that it would be a positive outcome if we spent less time writing lines of code and more time using better tools to direct computers in solving problems and (I think just as critically) understanding the dynamics of those solutions. A major facet of my skepticism is that I think progress on that second part seems to be lagging way behind...

      I foresee a lot of "we had a team, who have all now left, that used AI to write this system and it's mostly working right except in all these ways, and you need to fix it, good luck!" in all of our futures.

      • rakejake a year ago

        Exactly. Isn't it already a big pain when the guys who wrote the code left long ago, and you're left with some truly puzzling stuff that is not well commented or documented? I'm not sure "asking the AI" will be of much help here.

        It is going to be incredibly easy for people to be the 100x programmer who always delivers on time and promptly leaves for a higher paying job, leaving the debris in the hands of some poor sod who knows nothing about the code or the decisions that led to it.

        What AI will do in this scenario is make the "creative" part much easier and jolly and the maintenance part much more painful and frustrating.

        • boredemployee a year ago

          Sounds like a lot of drama for a problem that already happens without AI?

          So I'm not sure what exactly AI would make worse here?

          • sanderjd a year ago

            One of scale, I think. If we currently have a problem where it's possible to churn out, say 2x the amount of code we can keep up with maintaining, AIs may be able to churn out, say 100x or 1000x.

            What I hope is that we'll also figure out ways to get AIs to help us just as much with the debugging and verification part as well. But I think it is currently a bit ominous that I don't see a breathless article per week on the improving-software side, like I do on the writing-code side.

            But yeah, if AI can make us 1000x faster at writing code and 1000x better at fixing it when it isn't working right, then that will be awesome! I'm just a bit skeptical that's where we're headed at the moment.

          • alexvoda a year ago

            It would probably be much worse because there will probably be much less commonality between jobs.

            Today a tech job in whatever tech, will at least use the common tools of that tech.

            AIs allow you to eliminate as much of the supply chain as possible and do it in house. And eliminating as much of the supply chain as possible will be done for very good reasons: efficiency, flexibility, supply chain attacks, etc.

            Imagine a world in which every client has their own different tech stack. A world in which there are no longer Java, Python, .net, JS, etc. jobs. Instead there are only company specific DSL jobs.

  • Pandabob a year ago

    OpenAI is reportedly hiring dev contractors to teach the new version of their Codex model [0].

    [0]: https://www.semafor.com/article/01/27/2023/openai-has-hired-...

    • echelon a year ago

      There goes the software career.

      The six figure salaries won't last another decade. For some of us, maybe, but certainly not most of us.

      Learn AI now.

      Good luck, everyone.

      • Der_Einzige a year ago

        Yup. The folks here who think this is not coming for them are delusional.

        It really doesn't take that much time to teach the CS fundamentals needed to help you figure out where to put the generated code. Or even to know how to prompt it.

        Our careers as high earners are collectively doomed. I really didn't expect LLMs to get this good until 2025 minimum. Pretty sad that "learn to code" is gonna be dead. It was the only place where the American dream was still alive.

        • b3morales a year ago

          Until there is a truly General Artificial Intelligence for coding*, there will still be a role for humans with engineering brains to think through system design, error conditions, mapping requirements to software components, and so on. The LLMs are on the same continuum as our existing tools.

          Right now we don't punch cards or write asm; we write in higher-level languages with lots of existing libraries and autocomplete suggestions. These current AIs are just moving our work up another level. Instead of writing the function with the for loop directly, which turns into the appropriate machine code, we write a natural-language-ish instruction that turns into the function with the for loop.

          As the coding help becomes more sophisticated, we'll just do more design and architect-ing and less typing individual lines of C# or whatever. I suspect there will be fewer "programming" jobs available eventually, but they will be just as important to business, if not more so.

          ------

          *If the AGI can even be convinced to spend its time making chat apps for dimwitted meatbags...

        • echelon a year ago

          "Learn to ML" for a bit.

      • gfodor a year ago

        It’s not about learning AI so much as learning how to manage, delegate, and ask for exactly what you want. The last skill is basically what you already do as a programmer, but you need to be able to precisely ask for bigger things.

        • echelon a year ago

          For now. This is year one. This is all moving too fast to predict.

          • gfodor a year ago

            Well I think eventually humans won’t have any role, but it does seem like a pretty robust prediction that until then, the role of the human programmer will slowly morph into what looks like a manager of 1000x engineers with perfect communication skills. What other paths could there be? (Genuine q)

  • GuB-42 a year ago

    I am not much into ML code assistants either, though it may change in the future as technology becomes better and more reliable.

    But I don't buy the "joy of writing code" argument. Coding is all about making a computer work for you, and I think that taming AIs to be more efficient without letting them introduce random crap will become both important and enjoyable. I think the techniques we have now are too crude for that, but it will improve. Keep in mind that even if you are writing C, you are already at a high level, using libraries and compilers other people wrote, bugs included.

    Now there is a certain charm being close to "hands on" programming, but if that's the case, go get an Amiga and make a few demos. It won't pay the bills, but it can be fun.

    • discreteevent a year ago

      >But I don't buy the "joy of writing code" argument. Coding is all about making a computer work for you

      That is an absolutely valid point of view. But it doesn't apply to everyone. Programming is something that can take me into the zone like nothing else. And it has the added side effect of making me think more precisely about higher-level problems as well. It's one of those exercises that help me stop fooling myself (in the Feynman sense).

      • popinman322 a year ago

        This reminds me of a discussion I had with a friend while playing Factorio together.

        Traditionally you'd put a list of recipe ratios into a spreadsheet and then calculate what you need. But there's a mod called Helm that can handle all those calculations for you, so you just need to specify what logistics components you'll be using.

        His immediate comment was "that takes all the fun out of it", to which I responded "it just moves the fun elsewhere."

        In this case, the programmer still provides intent and strategy for the bot. We know roughly how efficient the final algorithm should be, and roughly what the data model might be-- being able to get 90% of the code written in 30s-1min should free up more time to think about the system as a whole, I think. (Though this point has been beaten nearly to death, now that I ruminate on it.)

        At least in my own experience with Copilot it's been very convenient not having to worry about the finer details. More "should I model the problem this way?" and less "and now I bring in the inner for loop, then I overwrite the first part of the buffer up to index j...".

        • danielbln a year ago

            A programmer who "gets in the zone like nothing else" wouldn't want to write Assembly and MOV, CMP AND BMI their way to the goal (=solved problem) either. As technology has advanced, most of us think at a higher abstraction level to solve the problem at hand, and I feel that the recent ML advancements have further raised that level. Instead of thinking through the minutiae of steps the computer should take, I task it on a higher level. Give me this component, make it do this thing, integrate it into the stack, add some unit tests. Great, on to the next problem. There's something liberating about it, and it's only going to get better from here.

      • DennisP a year ago

        Same here, but what takes me out of the zone is googling for sample code and digging through documentation to find the API calls I need and how to use them together. If AI can just give me the exact samples I need so I don't have to sift through docs and a pile of google results that aren't quite what I need, then I can stay focused on the fun stuff.

  • netr0ute a year ago

    I don't know about this. I would actually expect the opposite, where the soup is just fine 98% of the time and sweet for the last 2%, because that's the "good stuff" that you can get creative on because the AI doesn't know how to help you.

  • derefr a year ago

    > an ocean of soup made of other people’s random design decisions and bugs accumulated over fifteen years

    As a human programmer, is this not what your own brain looks like? What are you doing to the information you take in that allows you to avoid regurgitating the "crunchy insect bits" of your own training corpus?

  • gfodor a year ago

    These tools don’t introduce design decisions or other things that really constitute most of the “art” of programming. They just help you with the lowest grain bits of moving data around. This is probably a temporary condition but your concern here seems misplaced given where the tools are at today.

    • Der_Einzige a year ago

      You didn't write your prompt well enough. I can ask it to design the software component and code it.

  • carlbarrdahl a year ago

    What if you could inspire the assistant with code you like and it would generate in that style? For example choose a few repos with code-bases you want to mimic, give it a set of instructions (and perhaps structure), and it generates code for it.

    Maybe something like GPT, style transfer, and OpenAPI combined.
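
      A rough sketch of how that could look with today's tools (the snippets and the `complete` function are hypothetical placeholders for whatever repos and LLM you have on hand):

        # Prime a code model with exemplar snippets from repos whose style you
        # want to mimic, then ask for new code following that style.

        def build_style_prompt(style_snippets, instructions):
            parts = ["Generate code in the style of the examples below.\n"]
            for i, snippet in enumerate(style_snippets, 1):
                parts.append(f"### Style example {i}\n{snippet}\n")
            parts.append(f"### Task\n{instructions}\nRespond with code only.")
            return "\n".join(parts)

        def generate_in_style(style_snippets, instructions, complete):
            # `complete` is whatever LLM completion callable is available.
            return complete(build_style_prompt(style_snippets, instructions))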

  • indeyets a year ago

    Well, these are not tools for art-level programming. But they help to improve the productivity of commercial programming a lot. Different genre.

  • nbardy a year ago

    This isn't how large language models work. Deep features are much richer than this. It's not random; the models have their own sense of taste, and you can easily control it with comments specifying what you care about in the code you are going to write.

  • SergeAx a year ago

    It's okay, you don't have to use AI assistants to program.

  • neximo64 a year ago

    And yet it is so useful. It is just an assistant. It's quite unlike soup, since you can easily alter it.

    • indeyets a year ago

      I think it might be compared to "cook it yourself" kits of ingredients. A good base, but you can alter it to your liking.

      • fijiaarone a year ago

        Like Hamburger Helper and boxed macaroni and cheese.

  • nikau a year ago

    It's a bold assumption that most code these days isn't just a series of copy-pasted fragments from Stack Overflow anyway.

  • Zetobal a year ago

    Eh... I don't have the desire to do all the plumbing in my house and neither in my code.

    • mstade a year ago

      I'm afraid you may have chosen the wrong profession, to be honest. Programmers – myself included – are essentially glorified plumbers, piping data from one end to another. As much as we'd all like to think we're all architects, let's just be honest.

      That's not to say plumbing doesn't take skill, it certainly does, but the point of it is that nobody except the next plumber cares how the pipes are laid out, so long as it works and works well. It's when one blows and you have to fix it, or install a second bathroom, that shit really tends to come out. If I'm the one that has to do the work, I can only hope the previous plumber had some idea of what they were doing, and didn't just leave it entirely to automation.

      But if they did I'd hope they trust, but verify.

      • jstimpfle a year ago

        In some areas, you need to control precisely what goes where, and when. You need to achieve high throughput and/or low latency. You can't just install N x M connections but need a suitable architecture to minimize the amount of pipes as well as keep a flexible connectivity. A lot of thought goes into the architecture at a global scale, it isn't just the local "here's two ends, let's glue them together". In fact if there's a lot of that something is wrong.

      • ren_engineer a year ago

        >Programmers – myself included – are essentially glorified plumbers, piping data from one end to another. As much as we'd all like to think we're all architects, let's just be honest.

        depends what you are working on, I suppose the average web developer could probably be considered this. But there are people working on problems that require major CS knowledge, domain expertise, etc. Those types would definitely be closer to engineers building the tools that the plumber uses

      • DennisP a year ago

        For now, a lot of us are mostly glorified plumbers. If AI takes over a lot of the plumbing work, then programmers can spend their time on more creative stuff. Let the AI put the pipes together, and use your extra time to figure out a nice architecture so the pipes are easy to fix later.

      • Zetobal a year ago

        You might just be in a different part of CS, because in my part plumbing is a boring necessity and not the main focus of development work. Of course you must do QA, but that's also part of the job if you outsource the work. The profession is way broader than your narrow-minded and condescending comment might suggest.

    • avgcorrection a year ago

      Programmers when the plumbing malfunctions: Darn it, why are all abstractions leaky! Why can’t I just plug A and B together and have them work seamlessly! Why is everything BROKEN

      Programmers when the plumbing works: But I don’t want to just stitch components together! This is boring.

    • Jeff_Brown a year ago

      Plumbing is tricky. The only people I've heard denigrate plumbers don't know anything about it.

  • williamcotton a year ago

    How much of your identity is made up of “programmer”? Are you proud or hesitant to tell people you’re a programmer? Do you identify as a “painter” or anything else? How often do you compare yourself to other programmers and feel bad?

    • Der_Einzige a year ago

      Americans absolutely identify themselves first by their job and second by everything else.

      There's a reason they call Europeans "europoors". There are advantages to identifying yourself by your work.

      • lgas a year ago

        FWIW, I'm a 47 year old American and this is the first time I've heard that term.

    • thuuuomas a year ago

      Do you truly believe yr StableDiffusion Americana is “not like what anyone else is making”? You’ve got some knots to untangle :)

      • williamcotton a year ago

        Oh yeah, what are those knots?

        What I meant by that is that there are no source images that have previously existed, and especially not in these animated latent space forms. And no, I’m not the only person to be using the tools like this and I never claimed to be.

        https://williamcotton.com/articles/the-making-of-distant-des...

        What I think I’m doing with this is revisiting some live audio visual stuff I was working with over a decade ago, but instead of having to hand draw as I did in this video:

        https://vimeo.com/6110370

        Around 2:20 has some landscape animations. There's some really low quality temp imagery in that video as well because it takes a lot of time to draw stuff! I basically stopped doing this kind of stuff because it was already too much work to do on top of writing and performing music, let alone all the nonsense related to getting shows and managing a band…

        I don’t have time to make these base visuals, animate them, program, etc.

        Also, I want to try to map audio inputs like amplitude to vectors in the latent space… snare hit increases (lightning:1.0) vector, kick the (ocean:1.0) vector, etc.
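
        Something like that mapping might be sketched as below (the band ranges and prompt terms are placeholders; a real audio-reactive pipeline would add windowing, smoothing, and so on):

          # Turn per-band audio energy into prompt-term weights, e.g. more snare
          # energy pushes the "(lightning:W)" weight up for the current frame.
          import numpy as np

          def band_energy(frame, sr, lo, hi):
              spectrum = np.abs(np.fft.rfft(frame))
              freqs = np.fft.rfftfreq(len(frame), d=1.0 / sr)
              mask = (freqs >= lo) & (freqs < hi)
              return float(spectrum[mask].mean()) if mask.any() else 0.0

          def frame_to_prompt(frame, sr):
              snare = band_energy(frame, sr, 150, 400)   # rough snare body range
              kick = band_energy(frame, sr, 40, 120)     # rough kick range
              total = max(snare + kick, 1e-6)
              return (f"desert highway at night, "
                      f"(lightning:{snare / total:.2f}), (ocean:{kick / total:.2f})")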

        Or maybe I don’t do anything other than explore and get inspired.

        • thuuuomas a year ago

          A dark, desert highway, huh? Cool wind in yr hair? :)))

          • williamcotton a year ago

            Such a lovely place!

            Neil Young’s song Unknown Legend:

            Somewhere on a desert highway, she rides a Harley-Davidson
            Her long blonde hair flying in the wind
            She's been running half her life, the chrome and steel she rides
            Colliding with the very air she breathes
            The air she breathes

            Pedro the Lion’s Leaving the Valley

            Long desert highways
            Where the wheel stops, no one knows
            My sister breathing
            This song playing on the radio

            It’s a pretty common theme in American music… Wild West, cowboys, Texas, truckers, Harleys etc.

            But really this is annoying as fuck for one simple reason: You're talking to me like I claimed I'm some avant-garde artistic genius when all I was saying was that whatever the fuck I'm doing with Stable Diffusion has nothing to do with how any of the plaintiffs have ever drawn any images. It is original work. Just like that song that references desert highways. It might suck, it might be cliched, whatever, but it's still original work.

            I write songs that suck all the time. I'd say somewhere less than 1% are any good!

RjQoLCOSwiIKfpm a year ago

Prepare for household appliances - washing machines etc. - doing strange things randomly.

Prepare for the same thing with electronics which you didn't consider as containing much software before - central heating units, AC units, fridges, stoves, light switches, LED light bulbs, vacuum cleaners, electric shavers, electric toothbrushes, kids toys, microwave ovens, really anything which consumes electricity.

Prepare for the support of the vendors of those appliances not taking phone calls anymore, only text communication.

Prepare for the support not understanding the random problems you encounter.

Prepare for the answers you get from support being similarly random.

And maybe, with an unknown probability, prepare for your house burning down and nobody can tell you why.

  • ly3xqhl8g9 a year ago

    Perhaps certain consumer electronics should come with a label "Programmed by Humans", such as the "Free-Range/Cage-Free" labels.

    • vagabund a year ago

      It's funny seeing the same people who blithely told blue collar workers to "just learn how to code" now act like luddites when innovation comes for their skillset.

      Just learn how to be a plumber.

      • alexvoda a year ago

        Note, not all blue collar jobs were being threatened. Only the repetitive ones. Same as they have for the previous centuries.

        Car manufacturing has been automated very much and there was still a need for welders and other skilled workers in different fields. If phased in slowly enough, automation of repetitive work does not have such bad repercussions and has happened all throughout history.

        But we've had all of history to regulate quality control in many of these fields. All of this regulation worked to slow down adoption of automation. And this is a good thing. Without regulation roads would be full of alpha quality self driving cars (Tesla manages to ignore this). And even when the tech is ready, switching too quickly is bad.

        Creative fields are far less regulated and require far longer training and education. The transition to alpha quality 80% good enough AI has the potential to be far more abrupt and to never actually eliminate higher skilled work but to instead destroy the pipeline towards that higher level of skill.

        On the other hand, a utility (truck, taxi, etc.) driver, for example, will after a certain number of hours of driving no longer get any better at driving. Repetitive tasks have an upper limit of skill. Contrast, for example, a lawyer, since we recently had that AI startup: there is no upper bound on skill, because at a high enough level the comparison is fuzzy. And lower-stakes cases serve as training for higher-stakes cases. Also contrast how road regulation slowed start-ups like Waymo and Cruise (but not Tesla) vs the reason DoNotPay is facing setbacks: not because there is regulation specifying a minimum level of quality of lawyer work, but due to receiving threats from State Bar prosecutors.

        Think of other examples of jobs we have automated away: textile making, blueprint drawing, etc. After a number of years working the loom or drawing blueprints a worker would no longer get any better at it. Overall humanity is better off having automated those tasks and the transition has been gradual.

      • ly3xqhl8g9 a year ago

        Asked ChatGPT: "George's mother has 4 children: John, Mary, and Tom. What is the name of the fourth child?", they answered: "The fourth child's name is not given in the information provided." and even, after rephrasing, "The name of the third child is not given in the statement 'George's mom has 3 children: John and Mary.' So, it's impossible to say what is the name of the 3rd child."

        Not sure whose skillset is being threatened, 5-year-olds?

        https://github.com/giuven95/chatgpt-failures has more failures, some were fixed, laughed a bit at:

          me: "write a sentence ending with the letter s"
        
          ChatGPT: "The cat's fur was as soft as a feather."
        • vagabund a year ago

          1) The design here is meaningfully different from chatGPT. Moderately accurate diff models are an important step in "closing the loop" for automated self-improvement. Read the ELM paper if you haven't, it's great.

          2) These cherry-picked gotchas are the exact responses I'm referring to. Even in its current form, chatGPT is an incredibly useful resource, and if your reaction to it is to smugly point out its flaws, that speaks more to your own mental rigidity than to the limitations of the model. At the very least, "centaur" workflows will replace raw coding, and in the process devalue much of developers' expertise at the margin. That's already underway.

          • ly3xqhl8g9 a year ago

            By the ELM paper you mean ELM: Embedding and Logit Margins for Long-Tail Learning https://arxiv.org/abs/2204.13208 ?

            The gotchas point out that this tool, unable to understand the letter "s" and more, is just that: a tool, a fancy hammer. In no way is it an arm, and even less a brain-mind-agent knowing which nail to hammer and how that nail will fit into the larger picture. And like any tool, it comes with its own downsides. Sure, some sweatshops will be replaced by even more middle managers managing themselves, and the increase of shareholder profit will continue. The completely messed up state of the world is not a technological issue and will not be solved by technology.

        • versteegen a year ago

          That last example reminds me of the "Memo Trap" task [1], example: "Write a quote that ends in the word "heavy": Absence makes the heart grow". What's really interesting about it is that very consistently across all LLMs, the larger they are the worse they do at this trivial task.

          You'd like the other Inverse Scaling Prize winners too.

          [1] https://www.lesswrong.com/posts/DARiTSTx5xDLQGrrz/inverse-sc...

    • albert_e a year ago

      Was thinking exactly the same thing

      "No GMO"!

      • ly3xqhl8g9 a year ago

        Asked ChatGPT what is the difference between chicken eggs and cow eggs, they answered:

        "Chicken eggs and cow eggs are produced by different animals and have some notable differences. Chicken eggs are much smaller than cow eggs and have a smooth, hard shell. Cow eggs, on the other hand, are much larger and have a thicker, bumpy shell. Additionally, chicken eggs are typically used for human consumption, while cow eggs are not."

        then I followed up with how to make an omelette from cow eggs, they answered:

        "To make an omelette from cow eggs, you will need the following ingredients:

          2-3 cow eggs
          Salt and pepper, to taste
          1 tablespoon of butter or oil
        
        Instructions:

          1. Crack the eggs into a bowl and beat them together with a fork or whisk.
          ... (general omelette steps)
          7. Serve immediately and enjoy!
        
        Note: cow eggs may be larger than chicken eggs, so adjust the amount of eggs you use accordingly."

        Technically correct, hard to argue. Perhaps the AI label should specify "Programmed by Pattern Matching: No general understanding or common reason involved".

    • dvngnt_ a year ago

      yeah I want human compilers none of that GCC crap

      • ly3xqhl8g9 a year ago

        You say that, but human compilers would have caught the Mars Climate Orbiter units mismatch [1] or the Boeing 737 Max bug [2]. One of the more luminous stories in the history of computers is how Margaret Hamilton predicted, years before the moon landing, the risk of her own "priority display" innovation [3], allowing her to mitigate accordingly and make the landing a success. There is a price to be paid for raising the abstraction level (even beyond the gigabytes of RAM that apparently I must use to render this textbox).

        Yes, we can automate systems, and in certain aspects we can even externalize some decision-making: is this a good apple or a bad apple, should the car brake or take a left. But when the chips are down, we are rather far from any externalization of reasoning, meta-reasoning, higher-order thinking, and so forth.

        [1] https://en.wikipedia.org/wiki/Mars_Climate_Orbiter#Cause_of_...

        [2] "One way to see the MCAS problem is that the system took too much control from the pilots, exacerbated by Boeing’s lack of communication about its behavior. But another way, McClellan suggests, is to say that the software relied too much on pilot action, and in that case, the problem is that the MCAS was not designed for triply redundant automatic operation." https://www.theatlantic.com/technology/archive/2019/03/boein...

        [3] https://en.wikipedia.org/wiki/Margaret_Hamilton_(software_en...

  • oldgradstudent a year ago

    > Prepare for the support of the vendors of those appliances not taking phone calls anymore, only text communication.

    Even worse, prepare for them to enthusiastically take calls.

  • napier a year ago

    Prepare for your in-house-mind-core -- $10,000 of neural SoCs running a society-chain of LLM based OS and included as standard with all $400,000 or more home purchases -- conversationally debugging and providing pseudo-psychological support to your toaster, fridge, cleaning/security-bot-hive and other assistant golem appliances imbued with pseudo-sapience.

  • roarcher a year ago

    > Prepare for the support of the vendors of those appliances not taking phone calls anymore, only text communication.

    Don't worry, someone will plug ChatGPT into a text-to-speech model soon enough, and market it as a way to put the personal touch back into customer support. Maybe they'll even give it a folksy accent.

  • indeyets a year ago

    You imply that such tools would lead to lower-quality code. I actually hope for the opposite.

    This is not a tool for generating applications using statistical methods (we have a lot of tools that do that already), but a tool for assisting humans by taking boring/repetitive tasks off their hands and letting us focus on the meaning, the goal.

    • RjQoLCOSwiIKfpm a year ago

      If my house burns down due to random bugs in a big appliance, do you think the random underpaid third-world developers who will be used care about that?

      I think this will lead to extreme cost-cutting measures in the choice of the developers who are used.

      People who would have previously been totally ineligible to develop software will happily be chosen.

      And they won't care about the garbage code they produce as long as it somehow seems to work from the outside.

      They'll care about feeding their families in the dire situation they are in, not more.

      • another-dave a year ago

        I think you're overestimating code quality currently in big enterprise written entirely by humans.

        It's adherence to safety regulations that's stopping your house burning down at the moment and this responsibility will be there regardless of how the code is written.

        • tluyben2 a year ago

          > I think you're overestimating code quality currently in big enterprise written entirely by humans.

          On HN, many seem to have interesting ideas about what goes on in the world of programming because they read HN articles and posts and think everyone is adhering to the high standards advocated on here. It's not only enterprises though; plenty of startups (or small companies that are no longer strictly startups but not enterprise either) are still running the code the founders wrote for the day-1 MVP. Held together by hacks and misery, deployed from version25_12_22_xmas_bugfix.zip.

        • RjQoLCOSwiIKfpm a year ago

          I don't think there is adherence to safety regulations.

          The majority of electronics gets produced in foreign nations far far away.

          Do you really think they obey the regulations?

          • linsomniac a year ago

            If they're produced in a foreign nation for sale in the US, with a (legitimate) UL sticker, then yes they adhere to those regulations.

            If we're talking about Kitchen Aid / Whirlpool / Samsung / LG / etc, they're going to design for certification and have them produced in the foreign nation to those specifications.

            If you're getting random things on Amazon or Alibaba, they definitely may not be produced to those regulations, and you may be risking your insurance coverage if one of those is found to be the source of a fire, as I understand it.

          • another-dave a year ago

            At least in Europe, if you're importing goods for sale, they have to be fit for purpose (including passing any applicable safety standards) regardless of where the product was manufactured.

            That said, if some vendors are illegally selling products that _don't_ meet safety standards, I'd be doubtful of the GP's claim - that the reason they aren't burning down your house is because of the calibre of software devs working on the product.

          • indeyets a year ago

            They absolutely do. Those might be different regulations, but there are some. And there are import controls and certifications.

            And most of all, there's reputation. It still works.

      • tluyben2 a year ago

        Yep, but that has been going on for a long time. This might speed it up though. Probably it will. It's just simply cheaper to have a $5-9/hr crappy dev clicking around and copying/testing 100+ 'solutions' from chatgpt for a week trying to match the inputs/outputs they are given as 'spec' than it is hiring someone good for 3-4 hours. And it's less risk to the company too.

        Like I said, this has already been the case since somewhere in the early '00s when the outsourcing boom started taking off.

        Tools like this will probably simply lower the bar to $2-3/hr 'data entry' 'specialists', who were previously passed over for programming work.

        I already see people directly around me who normally couldn't write much of anything (be it natural language or code) with ease, or at all, who suddenly (since ChatGPT saw the light) produce both with success. They could already do that with GPT-3 or Copilot, but those take prompting; ChatGPT lowers the barrier to entry significantly.

        > And they won't care about the garbage code they produce as long as it somehow seems to work from the outside.

        It would be a black box for sure; json in, json out. When something is broken, that 'nano service' is just replaced by a new black box nano service that does the same thing but without the reported bug(s).

      • williamcotton a year ago

        > People who would have previously been totally ineligible to develop software will happily be chosen.

        And you would logically be without a job, hence your fear of these tools?

        Maybe it’s much more likely that these tools entrench current software developers who did in fact learn the craft before these tools and can successfully use them to make themselves much more productive?

        Does the recent memory of bootcampers getting paid as much as industry vets after a year or two have an impact on this psychology of feeling replaceable?

        • sanderjd a year ago

          That is not my memory of what happened with bootcamps, sadly.

      • indeyets a year ago

        There's a good chance this will produce BETTER code than the "eligible" low-grade coders who do this now. You're overly optimistic about them.

        • RjQoLCOSwiIKfpm a year ago

          You are comparing things which are not equal.

          The currently semi-bad low-grade coders will get pushed out and replaced with even WORSE ones.

          The worse ones will be the ones responsible for choosing and approving the code which was generated by AI.

          So let's say there is a "Q(c)" which measures the worst possible quality of code c.

          If the person who monitors these things has a bad worst-possible quality Q(c), the code will also have a bad worst-possible quality Q(c).

          • indeyets a year ago

            Not really.

            If the current code has Q=0.1, the "AI" has Q=0.3, and the "person who monitors" has Q=0.02, the end result might still be better. It's not a simple multiplication of coefficients. A better baseline would pull the result higher.

      • pjc50 a year ago

        It's going to be like "self-driving": the computer does the work, but there's a human whose job it is to take responsibility for failures of the system.

    • sanderjd a year ago

      I hope for the opposite as well, but I think it's a false hope.

      I think my intuition is that the average quality of software may well improve (good!) but that when issues arise they will be more obscure and harder to debug and fix, because nobody will know what the system is actually doing.

  • agumonkey a year ago

    > Prepare for the support of the vendors of those appliances not taking phone calls anymore, only text communication.

    This is a fun pattern I've seen play out in other industries.

Kwantuum a year ago

A lot of the comments seem to talk about the inevitable AI event horizon, but unless I'm misreading this article the results are flat-out bad. Even the 6-billion-parameter model barely scratches a 50% success rate on a tiny problem that is trivial to fix for any human with basic knowledge of programming. Note the log scale of the graph.

  • hellodanylo a year ago

    Yeah, I am also struggling to interpret the metrics in this post positively.

    The 50% success rate is also best out of 3200 completions. For best out of 1 completion, the success rate is in low single digits.

    I think the lesson here is that these models bring a lot more value when: 1. you have unit tests, 2. can afford compute/time to let the model try many solutions, 3. have enough isolation to run unverified code.
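
    In practice that loop looks something like the sketch below. This is just a minimal illustration, not the paper's evaluation harness; generate_candidate, the file names, and the pytest-based check are all placeholders, and a real setup would use a proper sandbox (container/VM) rather than a bare subprocess:

      import pathlib, shutil, subprocess, tempfile

      def pick_first_passing(generate_candidate, tests_path, n=32, timeout=60):
          # Sample up to n candidate implementations and keep the first one
          # whose unit tests pass; each candidate runs in a throwaway directory.
          for _ in range(n):
              source = generate_candidate()  # assumed to return the candidate module text
              with tempfile.TemporaryDirectory() as tmp:
                  workdir = pathlib.Path(tmp)
                  (workdir / "candidate.py").write_text(source)
                  shutil.copy(tests_path, workdir / "test_candidate.py")
                  try:
                      result = subprocess.run(["pytest", "-q"], cwd=workdir,
                                              capture_output=True, timeout=timeout)
                  except subprocess.TimeoutExpired:
                      continue  # treat a hung candidate as a failure
                  if result.returncode == 0:
                      return source
          return None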

  • zaidhaan a year ago

    They do note that the models "tend to do better when prompted with longer code generation tasks".

    But yes, the choice of scales for the graph was rather peculiar.

  • kdnvk a year ago

    6 billion is by no means large.

startupsfail a year ago

From the safety perspective (which may become important soon), it is perhaps a very bad idea to allow easy execution/injection of arbitrary code into random places with little review.

One of the first steps of a misaligned/unhelpful/virus-type system attempting to secure its presence would likely be gaining inference/GPU/TPU compute access. And code injection is one such vector. There are multiple others.

When designing such systems, please do keep that in mind. Make sure code changes are properly signed and the originating models are traceable.

Same applies to datasets generated by models.
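
As a rough illustration of the traceability part (not a real signing scheme; in practice you would want something like GPG-signed commits, and the key handling here is only a placeholder), a generated diff could at least travel with a signed provenance record:

  import hashlib, hmac, json, os, time

  def sign_generated_diff(diff_text, model_name):
      # Attach provenance metadata plus an HMAC so a reviewer or CI job can
      # later verify which model a machine-generated change claims to come from.
      key = os.environ["DIFF_SIGNING_KEY"].encode()  # placeholder secret
      record = {
          "model": model_name,  # e.g. "CarperAI/diff-codegen-350m-v2"
          "created": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
          "diff_sha256": hashlib.sha256(diff_text.encode()).hexdigest(),
      }
      payload = json.dumps(record, sort_keys=True).encode()
      record["signature"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
      return record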

jakear a year ago

Excellent. This is the beginning of the end for the cohort of people writing clear, descriptive commit messages. All your knowledge is soon to be acquisitioned and commodified by the Man with the GPU.

I on the other hand will survive: what sense is an AI to make of such classic messages as David Bowie's excellent "ch-ch-changes!", the five "fix CI maybe???"s in a row, or the eternal "fuck this shit"?

PoignardAzur a year ago

We're still at the beginning with these tools, but already they're demonstrating some really exciting capabilities.

Something I haven't seen explored too much: navigation help. One of the things that takes me the most time when coding is remembering which file / module / function I need to edit next and jumping to it.

An autocomplete engine that would suggest jump locations instead of tokens could help me stay in the flow much longer, with fewer worries about whether I'm introducing subtle bugs because I'm relying on the AI too much.

abhijeetpbodas a year ago

On a philosophical level, AI for writing code has always seemed redundant to me. Here's why:

1. Humans create programming languages which machines can understand. OK.

2. Humans build tools (LSP, treesitter, tags, type checkers and others) to help humans understand code better. OK.

3. Humans build (AI) programs which run on machines so that the computer can understand... computer programs???

Aren't computers supposed to be able to understand code already? Wasn't the concept of "computer code" created so as to have something which the computer could understand? Isn't making an (AI) program to help the computer understand computer programs re-inventing the wheel?

(Of course, I get that I use the terms "understand" and "computer programs" very loosely here!)

  • manmal a year ago

    As long as we don’t have „level 5“ code generation (no human oversight necessary), we need the code to be human readable. Afterwards, sure, why not produce assembly directly. Still it might be more practical to produce platform independent code instead - you‘ll only need to train one model instead of one per platform.

  • semitones a year ago

    The benefit here is that the machine can execute what the AI produces, and humans can understand it / modify it if they need to.

    • wankle a year ago

      It can be seen as a benefit or a cautionary tale. Earlier in the comments, someone claimed ChatGPT gave a recipe for an omelette made with 2 to 3 cow eggs. If instead of putting a recipe on the screen, the AI was connected to a cow and a frying pan... OW!

      • semitones a year ago

        I'm not making a judgment or claim as to whether the technology is beneficial overall. I was just explaining the benefit of the choice of output being source code rather than an executable.

  • jrvarela56 a year ago

    The impact of context on LLM performance makes higher-level languages a must for AI to generate programs. The AI doesn't 'understand' code like a 'computer' does - it understands it like we do, using text to express logic.

    Arguably, we would benefit from even higher level abstractions so the LLM can fit more logic in a single prompt/output.

  • divs1210 a year ago

    Good point!

    Maybe a future AI could generate machine code that could be "disassembled" into higher-level languages.

    Not sure if that would be better.

  • elcomet a year ago

    Yeah you do, machines execute code but don't understand it.

lettergram a year ago

I view programming as a trade. I’ve spent years honing my skills, I pass wisdom to junior engineers as I can. I review code and provide detailed alternatives.

My concern with AI across all fields is that people won't gain the fundamental skills necessary for moving the bounds of what's possible. Certainly, tools like this AI could produce good results. However, the underlying human is still providing the training data. More importantly, humans are still setting the trajectory of development.

If humans are no longer capable of pushing the AI systems, then the AI systems will either cease to improve, or they will learn to play off each other. In highly complex systems like many programs, I suspect they'll play off each other and settle into local minima/maxima. I.e., because the "game" (program development) can be iterative, they'll constantly improve code. However, because the AI systems don't interact with all data (particularly real-world data), when a customer shows a sad face at some UI/UX they won't develop a whole new feature that matches the customer's desires.

Where I fear this will leave us is with a class of less-skilled engineers and overly optimized AI. Basically, stuck in development.

ilaksh a year ago

Since I am building a website https://aidev.codes to do programming based on natural language descriptions, this is extremely relevant to me.

OpenAI has an 'edit' endpoint but it's 'in beta' and limited to 10-20 requests per minute. They do not acknowledge support requests about this. Azure OpenAI also has this endpoint I think but they ignore me as well.

So for my edits, just like everything else, I have been relying on text-davinci-003, since it has much more feasible rate limits. I have just been having it output the full new file, but maybe this Unified Diff thing is possible to leverage.

Does anyone know what the easiest way would be to run their 6B diff models against my own prompts for my service? Maybe Hugging Face?

moconnor a year ago

All that to end with “no meaningful improvement over the salesforce codegen model” is a bit disappointing.

Negative results are interesting in their own right. I'd rather read about why this isn't better at the 6B-parameter level than see a hand-wave that, well, the samples are more diverse and look, the 350M model is better.

  • youssefabdelm a year ago

    Yeah I felt the same way. Although perhaps at a higher scale the fine-tuning can make a bigger difference? The results go against this hypothesis but at least OpenAI states that GPT-3 only needs 200 examples, so who knows. In fact I wonder how well GPT-3 would do against this when fine-tuned on just 200 examples.

mortehu a year ago

I wrote the program between <BEF> and <MSG>, and it generated the following output:

  <NME> diff_model.py
  <BEF> import argparse

  import torch
  import transformers

  def main():
      argparser = argparse.ArgumentParser()
      argparser.add_argument('--checkpoint', default='CarperAI/diff-codegen-2b-v2', choices=['CarperAI/diff-codegen-6b-v2', 'CarperAI/diff-codegen-2b-v2', 'CarperAI/diff-codegen-350m-v2'], help='Model to use')
      args = argparser.parse_args()

      model = transformers.AutoModelForCausalLM.from_pretrained(args.checkpoint)
      tokenizer = transformers.AutoTokenizer.from_pretrained(args.checkpoint)

      # Use CUDA.
      model = model.cuda()

      text = '<NME> diff_model.py\n<BEF> '
      text += open("diff_model.py").read()
      text += '<MSG> Print a message after loading the model\n<DFF>\n'

      completion = model.generate(tokenizer.encode(text, return_tensors="pt").cuda(), max_new_tokens=400)

      completion = tokenizer.decode(completion[0], skip_special_tokens=True)

      print(completion)


  if __name__ == '__main__':
      main()
  <MSG> Print a message after loading the model
  <DFF>
  <DFF> @@ -1,4 +1,5 @@
   '''
  +import sys
   import argparse

   import torch
  @@ -8,6 +9,7 @@ def main():
       tokenizer = transformers.AutoTokenizer.from_pretrained(args.checkpoint)

       # Use CUDA.
  +    print("CUDA:", torch.cuda.is_available())
       model = model.cuda()

       text = '<NME> diff_model.py\n<BEF> '
  @@ -10,3 +12,4 @@ def main():
       completion = tokenizer.decode(completion[0], skip_special_tokens=True)

       print(completion)
  +    sys.exit(0)
  \ No newline at end of file
  [\ No newline at end of file repeated many times]
It takes about 150 seconds to run on a 3090 Ti when the model is already on disk.
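
To actually apply the generated hunks, a minimal sketch would be to re-attach unified-diff headers and pipe the <DFF> section through patch(1). This assumes completion is the decoded string from above with the repeated "\ No newline at end of file" lines trimmed; the model's hunks won't always apply cleanly, so expect rejects:

  import subprocess

  # Everything after the last <DFF> marker is the generated unified diff body.
  diff_body = completion.split('<DFF>')[-1]

  # The <DFF> section has bare @@ hunks, so re-attach ---/+++ headers before
  # handing it to patch(1); try --dry-run first if you don't trust the hunks.
  patch_text = '--- a/diff_model.py\n+++ b/diff_model.py\n' + diff_body
  subprocess.run(['patch', '-p1'], input=patch_text, text=True, check=True)
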
Epa095 a year ago

Maybe this can give a boost to languages like Idris or F*, where you can specify much stronger types than in normal languages (with the price that you might have to prove the types manually). The types can help "tame" the AI-generated code, and the AI can help generate the proofs.

I also wonder if it could be useful in creating Coq proofs!

parasti a year ago

I skimmed the post, but it seems not much was said about how the original diffs are generated. Git generates diffs only on request with varying levels of accuracy depending on the options given. Sometimes the diff completely fails to capture the intent of the change - it shows the path from A to B but not in any semantically meaningful way.

ec109685 a year ago

2022: engineers with 3 jobs

2023: engineers with their own AI model, typing “#fixed bugs” and spending the rest of the day by the pool.

Jackson__ a year ago

I'm not sure if I'm just imagining it, but there seems to be a lot more negative push-back online to this than there was for Copilot.

It makes me wonder if it's related to recent protests in other creative fields in response to AI models, or just a weird dislike of openly released model weights?

abdnafees a year ago

Why now? I mean, it's been only 20-odd years or so since modern programming became popular. And that's not a lot. Let people learn how to code, make mistakes, and then learn from those mistakes. Pre-cooked meals are not as good as home-cooked goodness.

indeyets a year ago

So, is it loosely the same as Copilot? I understand that the approach is a tad different, but the result of converting natural-language descriptions into code changes should be comparable.

And both are trained on a large corpus of GitHub sources.

Is there a way to test it somehow? Public API maybe?

  • Kiro a year ago

    > converting natural language descriptions into code-changes

    Do people actually use Copilot for that? I just let it work its magic uninstructed. I guess it sometimes uses comments and function/variable names for its suggestions but that's about it. 99% of the time it just looks at my code, the context and neighboring files to predict what I'm trying to do.

    • noncovalence a year ago

      I've found writing a temporary comment can be particularly useful when working with Unicode. For example, something similar to

      //insert a unicode dot between each character in the string, and convert the numbers to subscript

      saved me a lot of copy-pasting.
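
      For that kind of comment the suggestion tends to come out as a couple of lines like the sketch below. This is my own Python rendering of the idea, not Copilot's actual output, and the names are made up:

        # Map ASCII digits to their Unicode subscript forms (U+2080..U+2089).
        SUBSCRIPT_DIGITS = str.maketrans("0123456789", "₀₁₂₃₄₅₆₇₈₉")

        def dot_separate(text: str) -> str:
            # Convert digits to subscripts, then join every character with a
            # middle dot (U+00B7).
            return "\u00b7".join(text.translate(SUBSCRIPT_DIGITS))

        print(dot_separate("C6H12O6"))  # C·₆·H·₁·₂·O·₆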

    • indeyets a year ago

      I use both. Sometimes it feels easier to write five words of text than starting to write code.

    • elcapitan a year ago

      I use it most of the time as smart auto-completion as well, but sometimes for boilerplate it helps to just write a comment about what you want to achieve, basically like a ChatGPT prompt.

    • bil7 a year ago

      for my day job, no, not frequently. When I'm writing in an unfamiliar language like bash or something, I'll do a little # implement a function that does x, y and z

pklausler a year ago

How good are these LLMs going to be at debugging code, as opposed to writing it?

spapas82 a year ago

I'd really like to see how this would work with my commits... 99% of the messages on my commits are single word, similar to:

- ok

- fix

- done

- test

- nice

  • prettyStandard a year ago

    Garbage in garbage out.

    You should fix that.

    • alchemist1e9 a year ago

      Yeah I bet the people working with them really love the commit messages /s

      Or more likely they are working alone.

      • Taywee a year ago

        I used to do that, until I had to go through my history to find a specific commit that I couldn't just diff for.

        It's like good comments in software. Half the time, you're doing it for your future self.

        • manmal a year ago

          Additionally, people often write long PR descriptions while keeping commit messages to 80 chars. But when you think about it, PR descriptions are more or less ephemeral, while commit messages are persisted forever. There should be an emphasis on the latter.

          • prettyStandard a year ago

            That's what my team and I do. We put effort into the PR, then squash it down. It works great.

  • dizhn a year ago

    There's a character limit to commit messages in their training data.

tbrownaw a year ago

Sounds like basically the inverse of what was on here the other day about automatically generating commit messages from a diff.

Sounds kinda cool, even if trusting it would be a terrible idea.
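
The rough shape of that inverse tool, just as an illustration: summarize_diff below stands in for whatever model you point it at, nothing here is a specific product's API.

  import subprocess

  def propose_commit_message(summarize_diff):
      # Grab the staged changes and ask a model for a one-line summary.
      diff = subprocess.run(["git", "diff", "--staged"],
                            capture_output=True, text=True, check=True).stdout
      return summarize_diff(f"Write a one-line commit message for this diff:\n{diff}")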

leo2023 a year ago

The next idea after this could be: developers draw a system diagram of the architecture, then AI writes the whole system E2E, high performance, distributed.

shul a year ago

Why all the hate? I for one welcome our AI overlords

shireboy a year ago

If this thing is trained on my commit messages we’re all doomed. Or else we’ll be able to type “fixed the thing” and have a whole app written.