Amazon holds engineering meeting following AI-related outages

www.ft.com

116 points by petethomas 2 months ago

dang 2 months ago

Related ongoing thread:

After outages, Amazon to make senior engineers sign off on AI-assisted changes - https://news.ycombinator.com/item?id=47323017 - March 2026 (194 comments)

I'm not going to merge the current thread thither, because it's so bad. Interesting specimen of how much worse the comments get when there isn't a readable, substantive article to backstop the thread.

(Not a criticism of the submitter! - ft.com was the original source for this story and there are workarounds available, like the archive link and the google trick described at https://news.ycombinator.com/item?id=47319643.)

ChrisArchitect 2 months ago

If you're going to criticize a submitter, criticize the one who submitted that dupe of a story with multiple threads like 12 hours later. Why reward Ars and the submitter for being late? Shrug.
- dang 2 months ago
  
  Have I criticized a submitter?
rendaw 2 months ago

What's so bad about the comments here? They look qualitatively similar to the comments in the other submission.

o10449366 2 months ago

Paywalled

techterrier 2 months ago

paste headline into google, click first link
- kqr 2 months ago
  
  Huh, it has to be Google, specifically, too! There used to be a shortcut for this action on HN (a link under the submission saying "web" or something?), but it seems that has been removed.

urban_winter 2 months ago

https://archive.ph/wXvF3

dang 2 months ago

Thanks! added to toptext now.

nwmcsween 2 months ago

> Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added.

Beatings will continue until senior engineers leave?

rsynnott 2 months ago

I wonder what senior means here. Like, unless it’s fairly junior seniors, the ratios are going to make that impossible.
- abroszka33 2 months ago
  
  Even if they have an army of senior engineers, reviewing AI generated code is fundamentally different than reviewing code written without AI. The change is usually larger, looks good on the surface and there are stupid mistakes in it. It's like reviewing someone's code whose only goal is to get an approval on the change.
  
  lamontcg 2 months ago
  
  So exactly like reviewing a code change from an outsourced development team in another country?
  
  abroszka33 2 months ago
  
  I'm not sure. How common is it to review the outsourced development team's code? My guess is that there is rarely any review. They usually ship the whole software and are responsible for it.
  
  lamontcg 2 months ago
  
  I spent years doing it.
cratermoon 2 months ago

Too bad Amazon laid off a bunch of their senior engineers over the last few months.

stavarotti 2 months ago

I'm curious whether this is due to their insistence on using home-grown tools, i.e. Kiro, and not Codex/Cursor/Claude et al.? I tried Kiro and quickly left.

alecco 2 months ago

They are trying to cure the symptom. The actual cause: one of the most toxic environments to work as a developer.

Another case of AI as a scapegoat.

mediumsmart 2 months ago

Is it only 45 dollars for the subscription? Does that cover the AI-related outages too or just the engineering meeting

bravetraveler 2 months ago

When you hear "left behind", remember: is 'it' going to places you want?

MOSI3 2 months ago

And if it's going to get easier and easier for my work to be performed by AI, then what does it mean for me to "keep up"? Do I just need to create more slop than anyone else?
- bravetraveler 2 months ago
  
  Excellent consideration, probably. Sounds like a lot to do for very little in return. I'll leave with this, a sort of sick joke given context:
  Quit when the work is done
  
  MOSI3 2 months ago
  
  Fortunately my job is not based on generating plausible-sounding bs, so I should be safe.
  
  bravetraveler 2 months ago
  
  Hear, hear. Wish you the best! There's a whole list of other silly games to avoid, unfortunately. Namely, "up or out".
  For instance, I want to engineer [more]. Closer to management or sales due to scope creep. At this rate, by career-end, I'll be operating a small country by myself.

jcgrillo 2 months ago

> Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added.

Lol. Lmao. You have got to be joking. Seniors leaving in droves is how that plays out.

Rohunyyy 2 months ago

Nope. NO ONE is quitting in the current market because they got asked to review extra PRs.
- kakacik 2 months ago
  
  Top people definitely do if they feel like it, why the heck shouldn't they. There is no shortage of work for those. But it's fine if a company, via its actions, claims it doesn't want to even retain its top talent. Just market forces and all that.
- rsynnott 2 months ago
  
  If you’re a senior at Amazon and your whole job becomes reviewing slop, well, you can likely get another job which does not revolve around reviewing slop. The current market is not great, but it’s disproportionately painful for juniors.
onei 2 months ago

I read that line and thought "so, the solution is code review?". What has to happen to your processes that code review is not only missing, but unironically claimed to be the solution?
I know there are some companies that never did code review, but this is Amazon. They should know better.
- rendaw 2 months ago
  
  It's _more_ code review. They already had senior code review.
wrxd 2 months ago

This is going to end either with seniors rubber-stamping absolutely everything without even reading or with seniors blocking most of the slop for no overall productivity gain
- Ekaros 2 months ago
  
  Or if review is actually done, I think there will be a productivity loss. Juniors with the help of AI can generate more code than seniors have time to review in a full working day. So they won't have time left for any other work...
- jcgrillo 2 months ago
  
  It will not be the second thing. If you hold up "velocity" you're the problem and that problem will get fixed.
- QuiEgo 2 months ago
  
  No, it’s Amazon. If a senior blocks the slop they will be told they are not disagreeing and committing enough. If bugs get through they will be told they didn’t dive deep enough. It’s the classic Amazon blame game. Someone gets left holding the bag for impossible asks (hint: it’s never the person doing the ask), and then gets PIPed and fired.

shruubi 2 months ago

Amazon - Where the beatings and layoffs will continue until AI usage improves.

irishtel 2 months ago

The framing of "AI-related outages" is going to drive the wrong conversation. The question isn't whether the AI made a mistake — it's whether the review process, change controls, and blast radius gates were designed to handle agents acting at the speed and confidence of a senior engineer.

1,500 engineers flagged the tool wasn't ready. Leadership made adoption a KPI. Those two things together are the incident. The AI just executed.

When the process hasn't caught up to the blast radius the tooling can create, this is what you get. It's not new — Knight Capital, CrowdStrike — the pattern is always the same. Automation moves faster than governance.

pinkmuffinere 2 months ago

> The group has disputed the claim that headcount cuts were responsible for an increase in recent outages.

It's a bit hard to believe this.

rhubarbtree 2 months ago

Some engineers will point to this and say, hey, AI is not gonna work. It doesn’t reason very well and it leads to these problems.

But what they’re missing is all code quality is going to tank, and we are just going to accept that. Just as artisanal goods were replaced in the Industrial Revolution with mass produced inferior ones.

People will accept bad code if it is cheap enough.

We’ve gotten used to aiming for great, even if we often only hit functional. The new bar is going to be so much lower. Welcome to the era of cheap bad code. Lots more software, lots more value overall, but much worse reliability. Every day the apps I use get buggier.

gtsop 2 months ago

You are almost right. As I've said since the beginning of this AI circus, this is the equivalent of flipping McDonald's burgers (no insult intended for those workers). It is a thing, and people buy and eat them. But high quality burgers made by talented chefs will always be out there. That's my analogy, and i dont intend to be on the side of flipping mcdonalds burgers
- rhubarbtree 2 months ago
  
  There are a lot of McDonalds and very few Michelin starred restaurants.
  Safety critical engineering and infrastructure layers will (eventually again) be rigorous. Everything else is headed to slop.
  My craft died. I’m sad. Time to move on.
  
  kakacik 2 months ago
  
  Where I live (Geneva, Switzerland), gourmet high-quality burger joints definitely and massively outnumber McDonald's, even if I count Burger King too. It shows that sometimes people pay for quality even if they don't desperately need it. And it's trivial to make better burgers than McD; heck, I can surpass them trivially at home with every ingredient. They are really the lowest level of quality, taste, looks, or (lack of) healthy components. You don't need a Michelin star for that, far from it. Plus the food is often cold outside of peak hours, something that has never happened to me in a proper restaurant.
  Also, McD isn't much cheaper in the end, just marginally, and the choice of drinks is pathetic, usually no beer. The main reason folks go there is that it's easier/faster than getting a table in a real restaurant. But also the environment in McD is absolute soulless cheap fugly shit. (There are kids' corners to be fair, but they are often disgustingly dirty.)
  It's a very good analogy in the end IMHO, maybe just not tilting the way you intended, at least not here.
  
  rhubarbtree 2 months ago
  
  Haha well not my analogy, analogy is not a good way to reason, but Geneva in this situation is exactly the exception that proves the rule. Thanks for emphasising my point.
- nottorp 2 months ago
  
  > high quality burgers
  There is also, you know, actual food. Done by real chefs.
- rsynnott 2 months ago
  
  It’s really not. McDonald’s’ whole thing is consistency. It’s never going to be good, but nor is it going to be that terrible.
  That is, ah, very much not the case for AI slop.
- ajross 2 months ago
  
  > i dont intend to be on the side of flipping mcdonalds burgers
  So say the kitchen staff at every Denny's too. And yet...
  The analogy is apt, but your coping strategy falls down because of numbers. There aren't a lot of spots for those "chefs" to get paid like they expect.
  Most HN commenters might have gotten by over the past decades thinking they were "talented chefs", but were really more like the "short order cooks" whose jobs got eaten by fast food.
idiocratic 2 months ago

The economics of software are very different from physical goods. Margins on software (products) are orders of magnitude higher. Any cost shaving done at coding time is economically irrelevant in the long run, detrimental to quality/reputation and could almost be seen as a risk. Furthermore, assuming the bottleneck in this process has so far been coding is pure BS.
- rhubarbtree 2 months ago
  
  The cope island of objections will continue to shrink.
  Being able to easily create apps means huge supply, which means commodification of software just like the commodification of physical goods. Mass supply means low prices. It won’t be economic to have artisan coders any more than to have artisan goods makers.
  
  yladiz 2 months ago
  
  And yet people still want artisan goods, artwork, high end food, things that aren’t “economic”.
  
  rhubarbtree 2 months ago
  
  Very, very, few people buy these things.
  
  yladiz 2 months ago
  
  Okay? It doesn't refute my point.
  
  rhubarbtree 2 months ago
  
  The point is that artisanal code is to a first approximation a thing of the past. Most engineers will not have a job writing code in these niches that survive, and thus coding as a career is effectively dead.
- Ravus 2 months ago
  
  > assuming the bottleneck in this process has so far been coding is pure BS.
  This is the core insight for most businesses.
  When evaluating the impact of AI on velocity, the first thing to consider is how long it takes for a one-line code change to get into production, including initial analysis and specs.
  You can't get faster than this.
- rhubarbtree 2 months ago
  
  Margins on software will no longer be what they were, that’s the point. Commoditisation means software values will head to zero. Margins will depend on factors unconnected to the software itself. For example, brand, distribution, network effects, lock-in, proprietary data.
  It doesn’t matter whether coding was “the bottleneck”. It’s irrelevant. Fact is it used to be expensive to create software and now you’ll be able to create it for super cheap. Yes, it won’t be as good. But the price will be so low it won’t matter. This is what commoditisation means. Forget the economics of software as you know it, that has ended.
ozgrakkurt 2 months ago

You are comparing code to a tshirt but it is more similar to infrastructure like roads/bridges/buildings. It is like a platform that you build other stuff on top of
- rhubarbtree 2 months ago
  
  AWS yes. Most code no.
rendaw 2 months ago

I thought this too, but it's still weird.
Machines that make e.g. paper are great. They are immensely more efficient, but extremely consistent and superhuman (try making that perfectly smooth letter paper by hand).
Human written software is the same. Where you had N people copying data from spreadsheets for M suppliers into an internal database or whatever, you now have one program doing it. It can be scaled infinitely for a fraction of the cost. It _never_ messes up. The cost of the software developer is trivial in comparison. Software was a space where the marginal cost for quality was extremely cheap.
I don't get how AI fits in here. Software already had massive scale. You aren't replacing a massive data entry team with AI, you're replacing a reliable piece of software written by a human with a reliable (?) piece of software written by AI controlled by a human. There's no increase in scale. Until the reliability issues are fixed a very noticeable decrease in reliability (sure, some software was bad already, but now the good developers are also writing bad code).
This doesn't seem like a natural step to me at all. The best explanation I can come up with is AI is just being used as an excuse for destructive penny pinching.
- palmotea 2 months ago
  
  > I don't get how AI fits in here. Software already had massive scale. You aren't replacing a massive data entry team with AI, you're replacing a reliable piece of software written by a human with a reliable (?) piece of software written by AI controlled by a human. There's no increase in scale. Until the reliability issues are fixed a very noticeable decrease in reliability (sure, some software was bad already, but now the good developers are also writing bad code).
  > This doesn't seem like a natural step to me at all. The best explanation I can come up with is AI is just being used as an excuse for destructive penny pinching.
  I think a big part of the explanation is business leaders aren't actually as smart or thoughtful as they're made out to be. They may have an inaccurate and unrealistic view of LLMs, and be making policy and decisions based on that view.
  Also business leaders and management are often extremely tolerant of bad code, and even seem to encourage it. The code is a black box that they usually don't even have to use, and bugginess and unreliability are often hard to quantify and can be swept under the rug. "Saving" $5,000 by not fixing a bug could over time lead to say $10,000 in unquantified costs, but it looks like the "smart" decision on a budget spreadsheet. They only really pay attention once quality is disaster level, or there's some unusually high-profile problem.
rsynnott 2 months ago

I don’t totally buy this. If you’re Amazon, there’s only so buggy you can get before you start losing huge amounts of money.
- rhubarbtree 2 months ago
  
  99% of software is not Amazon.
  
  rsynnott 2 months ago
  
  This article is about Amazon.
  
  rhubarbtree 2 months ago
  
  Yes, and my point was about the dangers of generalising from this instance.
red-iron-pine 2 months ago

> We’ve gotten used to aiming for great, even if we often only hit functional.
bro how long have you been a dev or in IT? there is so much crap out there already. AI is only gonna make it worse.
Amazon having these problems is especially bad since so much of their own, as well as others', infra lives on those systems.

palmotea 2 months ago

> Amazon’s ecommerce business has summoned a large group of engineers to a meeting on Tuesday for a “deep dive” into a spate of outages, including incidents tied to the use of AI coding tools.

> The online retail giant said there had been a “trend of incidents” in recent months, characterised by a “high blast radius” and “Gen-AI assisted changes” among other factors, according to a briefing note for the meeting seen by the FT.

> Under “contributing factors” the note included “novel GenAI usage for which best practices and safeguards are not yet fully established”.

> “Folks, as you likely know, the availability of the site and related infrastructure has not been good recently,” Dave Treadwell, a senior vice-president at the group, told employees in an email, also seen by the FT.

VirusNewbie 2 months ago

GenAI at fault, and nothing to do with amazon laying off 30k people and having an overall shitty culture where people mostly don’t want to stay?
- jiggawatts 2 months ago
  
  Also, managers are incentivised to force AI onto the remaining staff to “boost productivity” but of course they won’t accept any of the responsibility or blame for that decision.
  
  zihotki 2 months ago
  
  Just tell the employees to make AI fully adopted in SDLC and make it secure and reliable. Don't make mistakes.
  If it works for models, why not humans? /s
- aerhardt 2 months ago
  
  Maybe both, and possibly other causes too, but allow us a moment to revel in the schadenfreude of AI code slop at hyperscale, will you?
- applfanboysbgon 2 months ago
  
  > GenAI at fault, and nothing to do with amazon laying off 30k people
  GenAI is literally the direct reasoning they used for laying off 30k people.
  > “As we roll out more Generative AI and agents, it should change the way our work is done. We will need fewer people doing some of the jobs that are being done today, and more people doing other types of jobs,” [Amazon CEO Andy Jassy] bluntly admitted.
  
  spwa4 2 months ago
  
  There is a long history of people blaming AI, totally unfairly, for not being able to do something, and I, along with quite a lot of probably somewhat older ML practitioners, am seriously tired of that constantly happening. Amazon is prioritizing investment into data center expansion over paying employees. And ML ... is present in the building, and about as involved in the firings as the cleaning staff is, only people are scared of AI and so it gets blamed for everything. The firings are driven by imho misguided financial engineering, and it sure as hell is not being done by ML.
  But what is reported? Management firing people? ML. Engineering screwing up the uptime? ML. Someone screws up their job? ML.
  Don't you know? ML is killing people in Iran today. Not mullahs. Not the military. ML. Obviously that's where the responsibility lies ...
  Usually blaming ML is like suddenly coming up with conspiracy theories like here, or impossible, suddenly added requirements, and usually utterly ridiculous ones (like criticizing Deep Blue for not being able to play poker; yes I realize I'm old, but it's a bit like criticizing the very best competition canoe on the planet for its disappointing spaceflight capabilities).
  Like here: large blast radius AI-assisted outages ... we've all written software, and we all know the problem here: THEIR TESTS SUCK. Probably because they fired all the good SREs for insisting software teams spend time on tests, or demanding integration test failures are fixed before shipping the software.
  By the way: I'd like to point out that in most/all industries where jobs are lost on a large scale the situation is like the Amazon situation: ML is not even remotely involved. So while I get the criticism, it doesn't work like that. The Auto industry first got blasted with very traditional engineering, which worked and depended on very old style mathematics. What's happening in factory automation is 99.9% 3d geometry (to the point that ML, is actually a simplification of the problem). Then the auto industry got blasted with what every industry got blasted with: stuck in demand-limited markets. Every car company can easily build 10x more cars next year, but there's no point: nobody will buy them. So the only thing worth doing for these companies is to produce cheaper ... and that means getting rid of people (when end-to-end taxes on income in Europe are 60-85% and actually rising). With only a few exceptions, these companies find ML too expensive for projects.
  So while I understand "we're defending our jobs", it's misguided ... the big job losses in the west have nothing to do with ML. MAYBE those are coming, but large job losses have been predicted in the last 50 AI "revolutions". 49 times that was wrong. And the actual problem is really a return to 99.9% of history: when it comes to doing what is needed to keep society going 10%, maybe even 1% of people can do it. That means you need something for the other 90% or 99% to do.
  The solution is the only thing that has helped in the past: having the government put on huge public works. From building the pyramids to the Sagrada Familia (and yes, wars. But let's please not do that), or ridiculous engineering projects like Europe and America's rail networks. There's a stable in the Italian alps that has a private rail connection. So fix the problem. I don't know: build a large cathedral in Washington or something. Hell, hire people to make sure it has a depiction of the last supper where every square micrometer of the painting was designed by an AI with 1000-member engineering team, so people can spend their entire life looking at the painting with a microscope and find new details every day. Let's do something "great", in the sense of an enormous effort. Fly 100 missions to Alpha Centauri. Fix the demand-limited issue the economy has. "Do more with more". And stop blaming ML. Hell, I'm currently in an old European city filled with 200-year old buildings. Quaint. Cool. Except ... not really. 90% of these buildings suck. Can we just rebuild 95% of ... all European capitals? Every building that is way too old and has no reason whatsoever to be preserved other than it's currently slightly cheaper ... can we please just rebuild them better? Do stuff like that.
  
  surgical_fire 2 months ago
  
  It's not, and in the latest round of 14k people laid off they were more transparent that it was a result of previously having overhired.
  
  malfist 2 months ago
  
  > previously having overhired
  Funny way of saying that Jassy told people he doesn't like the culture of a larger amazon.
  Also, if we overhired in 2020-2022, why the hell are we still correcting it in 2026? Did none of the layoffs in 2023 on do the job?
  Just an all around failure of leadership with no ownership.
  
  surgical_fire 2 months ago
  
  Oh believe me, I am not defending the prick.
  We are in a thread of Amazon holding engineering meetings after AI-related outages after laying off 30k people.
  If anything this highlights gross incompetence of a moronic leadership. It should be them being laid off.
  If overhiring indeed happened, it is also a failure of leadership. Hiring too many people and then firing a bunch of people causes friction, loss of knowledge, decreased morale, etc.
  
  malfist 2 months ago
  
  But like, what if we did the layoffs bit by bit and told people each time there will be more and to stay tuned. Surely that's a sign of strong leadership. Just like "muscle confusion" for workouts! Can't let people feel too safe or stable.
  
  keeda 2 months ago
  
  > Did none of the layoffs in 2023 on do the job?
  No, because the calculus of layoffs shifted. Briefly, there is always a natural attrition rate A%, but whenever companies do an X% layoff they expect a smaller Y% additional attrition (due to morale etc.) So they expect an overall (A + X + Y)% reduction in headcount within a few months of the layoffs.
  However, the job market swung so rapidly from pro-employee to pro-employer in that timeframe that the Y% never happened, and in fact there was even a drop in A%. And so companies still ended up with more employees than planned and had to scramble to achieve their headcount goals using other means (RTO mandates, shifting headcount offshore, further layoffs with AI washing, etc.)
  A bit more detail on the calculus in this comment: https://news.ycombinator.com/item?id=46142948
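  A rough sketch of that arithmetic, with made-up numbers. The rates below (8% natural attrition, a 10% layoff, 3% morale-driven attrition) and the function name `expected_headcount` are illustrative assumptions, not figures from any company:

```python
def expected_headcount(start, natural_pct, layoff_pct, morale_pct):
    """Headcount left after an (A + X + Y)% reduction.

    All three rates are whole percentage points of the starting headcount
    (hypothetical inputs chosen for illustration only).
    """
    total_reduction_pct = natural_pct + layoff_pct + morale_pct
    return start * (100 - total_reduction_pct) // 100


# What the company plans for: A = 8, X = 10, Y = 3.
planned = expected_headcount(10_000, 8, 10, 3)

# What happens if the market turns: A drops to 5 and Y never materialises.
actual = expected_headcount(10_000, 5, 10, 0)

# Employees still on the books beyond the plan.
overshoot = actual - planned

print(planned, actual, overshoot)
```

  Under these assumed rates the plan lands at 7,900 people but the company ends up with 8,500, i.e. 600 more than intended, which is the gap the later measures are meant to close.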
- nixass 2 months ago
  
  Absolutely correct. Now let's drop another few billion to make AI better and avoid such mistakes in the future. And we might lay off some more folks to make room in the budget for more AI
- cratermoon 2 months ago
  
  Those two facts are not mutually exclusive. Laying off 30K people and pushing the remaining engineers to use the ensloppenator for everything, this is the expected result.
hansmayer 2 months ago

> “Folks, as you likely know, the availability of the site and related infrastructure has not been good recently,” Dave Treadwell, a senior vice-president at the group, told employees in an email, also seen by the FT.
Also some SVP over there: '"folks", we'll measure your performance and bonus based on how much you use Gen AI:)'
- rsynnott 2 months ago
  
  Yeah, “you must use LLMs, but also pls don’t use them for important stuff” is a difficult circle to square.
  
  Gud 2 months ago
  
  Who said you can’t use it for important stuff? Just because SOME people are screwing up doesn’t mean everyone is.
  
  hansmayer 2 months ago
  
  Of course you can use them for whatever you want. It's also not disputable that some people will be more careful than others. The issue however is that the idiots who pushed for widespread usage of AI in companies, i.e. clueless MBAs, have also pushed it onto exactly the types you are mentioning - the ones who will screw things up because they are incompetent or don't care, or most likely both. So it's not a criticism of people who are careful in their usage of LLMs in critical scenarios - it's a criticism of the morons who bought into the AI hype and really believe an LLM will produce Terraform code as good as that previously written by 10 engineers, at 1% of the cost.
  
  kshacker 2 months ago
  
  Absolutely. We need to get a Hello, World equivalent of something a person should be able to do with AI before they are allowed to decide AI projects.

andyjohnson0 2 months ago

https://archive.ph/wXvF3

jqpabc123 2 months ago

Summary: AWS has volunteered to serve as a crash test dummy for vibe coding.

But don't tell anyone --- and if you do, don't blame AI because it's all the humans' fault for not shaping their questions in the "right way".

arjie 2 months ago

For this particular experiment, regardless of phrasing, I think the guys with the most appetite for risk have to be Cloudflare. They're shipping at an astonishing pace but I think there have been far more outages than there were before, in the jgc era. Perhaps Anthropic's application-side teams are faster and more cowboy[0] but they are super AI-native so that makes sense.
0: I think this is the era where cowboys win, so they're (unsurprisingly) smart about doing this
- Rohunyyy 2 months ago
  
  I am surprised we haven't had an actual Y2K-style crash with this AI code. Like how do you review a 1000-line Claude-generated PR?
  
  krilcebre 2 months ago
  
  You don't. I can guarantee that 90% of the generated code will never receive a detailed review, simply because there's too much of a cognitive overhead, and too little time, everything moves too fast.
  I remember having to do such a code review, before AI, in a highly complex component, and it would take a full day of work to do it. In this day and age, most of the people I know take like half an hour and are mostly scanning for obvious mistakes, while the bigger problem is the sneaky, non-obvious ones.
  
  kakacik 2 months ago
  
  Exactly. It's the same for reviewing somebody else's code. How many companies did this perfectly before LLMs came? I know mine didn't. But these days people who aren't senior enough do reviews of LLM output, do a quick mental pass through the code, see the success and approve it.
  What could work - llm creating a very good test suite, for their own code changes and overall app (as much as feasible), and those tests need a hardcore review. Then actual code review doesn't have to be that deep. But if everybody is shipping like there is no tomorrow, edge cases will start biting hard and often.
  
  CamperBob2 2 months ago
  
  You get Codex or Gemini to review it.
  Pro tip: tell it the code came from Claude. That will make it put its war face on.
bootsmann 2 months ago

This wouldn't happen if they used my CLAUDE.md of course!
blitzar 2 months ago

They were holding it wrong.

kerim-ca 2 months ago

Full Article

Amazon’s ecommerce business has summoned a large group of engineers to a meeting on Tuesday for a “deep dive” into a spate of outages, including incidents tied to the use of AI coding tools.

The online retail giant said there had been a “trend of incidents” in recent months, characterised by a “high blast radius” and “Gen-AI assisted changes” among other factors, according to a briefing note for the meeting seen by the FT.

Under “contributing factors” the note included “novel GenAI usage for which best practices and safeguards are not yet fully established”.

“Folks, as you likely know, the availability of the site and related infrastructure has not been good recently,” Dave Treadwell, a senior vice-president at the group, told employees in an email, also seen by the FT.

The note ahead of Tuesday’s meeting did not specify which particular incidents the group planned to discuss.

Amazon’s website and shopping app went down for nearly six hours this month in an incident the company said involved an erroneous “software code deployment”. The outage left customers unable to complete transactions or access functions such as checking account details and product prices.

Treadwell, a former Microsoft engineering executive, told employees that Amazon would focus its weekly “This Week in Stores Tech” (TWiST) meeting on a “deep dive into some of the issues that got us here as well as some short immediate term initiatives” the group hopes will limit future outages.

He asked staff to attend the meeting, which is normally optional.

Junior and mid-level engineers will now require more senior engineers to sign off any AI-assisted changes, Treadwell added.

Amazon said the review of website availability was “part of normal business” and it aims for continual improvement.

“TWiST is our regular weekly operations meeting with a specific group of retail technology leaders and teams where we review operational performance across our store,” the company said.

Separately, the company’s cloud computing arm — Amazon Web Services — has suffered at least two incidents linked to the use of AI coding assistants, which the company has been actively rolling out to its staff.

AWS suffered a 13-hour interruption to a cost calculator used by customers in mid-December after engineers allowed the group’s Kiro AI coding tool to make certain changes, and the AI tool opted to “delete and recreate the environment”, the FT previously reported.

Amazon previously said the incident in December was an “extremely limited event” affecting only a single service in parts of mainland China. Amazon added that the second incident did not have an impact on a “customer facing AWS service”.

The FT previously reported multiple Amazon engineers said their business units had to deal with a higher number of “Sev2s” — incidents requiring a rapid response to avoid product outages — each day as a result of job cuts.

Amazon has undertaken multiple rounds of lay-offs in recent years, most recently eliminating 16,000 corporate roles in January. The group has disputed the claim that headcount cuts were responsible for an increase in recent outages.

scuff3d 2 months ago

Gonna see a lot more of this in the coming years. The real cost of LLM tools has a delay. Devs don't tend to notice it until they're neck deep in code they don't understand, swearing the next prompt will get them out. CEOs won't notice until it starts costing them money, and that of course assumes anyone will be willing to admit it. Lot of people have their careers on the line spending a metric shit ton of money on untested tools.

potetoooooo 2 months ago

nice domain

wiseowise 2 months ago

Hold a meeting?! No way! That’s newsworthy material!

Seriously, who even cares? It’s probably going to be “guys be careful but also continue to push slop kthx”.