I think unfortunately the conclusion here is a bit backwards; de-risking deployments by improving testing and organisational properties is important, but is not the only approach that works.
The author notes that there appears to be a fixed number of changes per deployment and that it is hard to increase - I think the 'Reversie Thinkie' here (as the author puts it) is actually to decrease the number of changes per deployment.
Those meetings exist because of risk! The more changes in a deployment, the higher the risk that one of them is going to introduce a bug or operational issue. By deploying small changes often, you get to deliver value much sooner and fail smaller.
Combine this with techniques such as canarying and gradual rollout, and you enter a world where deployments are no longer flipping a switch and either breaking or not breaking - you get to turn outages into degradations.
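To make that concrete, here's a rough sketch of the routing side of a canary (names are made up, and in practice your load balancer or service mesh usually does this rather than application code):

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// canaryPercent is the share of traffic sent to the new release.
// Ramp it up gradually (1 -> 5 -> 25 -> 100) while watching error rates.
const canaryPercent = 5

// routeToCanary deterministically buckets a request (here by user ID)
// so the same user keeps hitting the same version during the rollout.
func routeToCanary(userID string) bool {
	h := fnv.New32a()
	h.Write([]byte(userID))
	return h.Sum32()%100 < canaryPercent
}

func main() {
	for _, id := range []string{"alice", "bob", "carol", "dave"} {
		if routeToCanary(id) {
			fmt.Println(id, "-> v2 (canary)")
		} else {
			fmt.Println(id, "-> v1 (stable)")
		}
	}
}
```

A bad release then degrades a small slice of traffic instead of taking everything down, which is exactly the outage-into-degradation trade.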
This approach is corroborated by the DORA research[0], and covered well in Accelerate[1]. It also features centrally in The Phoenix Project[2] and its spiritual ancestor, The Goal[3].
[0] https://dora.dev/
[1] https://www.amazon.co.uk/Accelerate-Software-Performing-Tech...
[2] https://www.amazon.co.uk/Phoenix-Project-Helping-Business-An...
[3] https://www.amazon.co.uk/Goal-Process-Ongoing-Improvement/dp...
I tend to agree. Whenever I've removed artificial technical friction, or made a fundamental change to an approach, the processes that grew around them tend to evaporate, and not be replaced. I think many of these processes are a rational albeit non-technical response to making the best of a bad situation in the absence of a more fundamental solution.
But that doesn't mean they are entirely harmless. I've come across some scenarios where the people driving decisions continued to reach for human processes as the solution rather than a workaround, for both new projects and projects designated specifically to remove existing inefficiencies. They either lacked the technical imagination, or were too stuck in the existing framing of the problem, and this is where people who do have that imagination need to speak up and point out that human processes need to be minimised with technical changes where possible. Not all human processes can be obviated through technical changes, but we don't want to spread ourselves thin on unnecessary ones.
A bit tangential but why is CloudFormation so slowww?
I figure it's because AWS can get away with it.
This is just anecdotal, but I have found that any time a network interface is involved, it can slow down the deployment. I had a case where I was deleting Lambdas in a VPC, connected to EFS; the deployment itself was rather quick, but it took ~20 minutes for CloudFormation to clean up and finish.
The reason my boss tends to give is that it's made by AWS, so it cannot possibly be bad. Also, it's free. Which is never given as anything more than a tangentially related reason, but…
Microservices lets you horizontally scale deployment frequency too.
I think this was the meme before moduliths[1][2] where people conflated the operational and code change aspects of microservices. But it's just additional incidental complexity that you should resist.
IOW you can do just as many deploys without microservices if you organize your monolithic app as independent modules (rough sketch below), while keeping out the main disadvantages of microservices (infra/CI/CD/etc. complexity, and turning your app's function calls into an unreliable distributed-systems communication problem).
[1] https://www.fearofoblivion.com/build-a-modular-monolith-firs...
[2] https://ardalis.com/introducing-modular-monoliths-goldilocks...
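The sketch I mean, with hypothetical module names (in a real codebase these would be separate packages, ideally with import boundaries enforced in CI): each module exposes a narrow interface, and other modules depend on that interface, so the boundary stays a function call rather than an RPC.

```go
package main

import "fmt"

// Billing and Shipping are separate "modules": they own their own logic
// and expose a small interface, but calls between them are plain,
// in-process function calls, not network requests.

type Invoicer interface {
	Invoice(orderID string, cents int) error
}

type billing struct{}

func (billing) Invoice(orderID string, cents int) error {
	fmt.Printf("invoiced order %s for %d cents\n", orderID, cents)
	return nil
}

type shipping struct {
	billing Invoicer // depends on the interface, not the concrete module
}

func (s shipping) Ship(orderID string) error {
	// In-process call: no retries, timeouts, or versioned wire format needed.
	if err := s.billing.Invoice(orderID, 4200); err != nil {
		return err
	}
	fmt.Printf("shipped order %s\n", orderID)
	return nil
}

func main() {
	app := shipping{billing: billing{}}
	_ = app.Ship("ord-123")
}
```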
An old monolithic PHP application I worked on for over a decade wasn't set up with independent modules and the average deploy probably took a couple seconds, because it was an svn up which only updated changed files.
I frequently think about this when I watch my current workplace's node application go through a huge build process, spitting out a 70mb artifact which is then copied multiple times around the entire universe as a whole chonk before finally ending up where it needs to be several tens of minutes later.
Even watching how php applications get deployed these days, where it goes through this huge thing and takes about the same amount of time to replace all the docker containers.
Yeah, if something even simpler works, that's of course even better.
I'd argue the difference between that PHP app and the Node app wasn't the lack of modularity, you could have a modulith with the same fast deploy.
(But of course modulith is too just extra complexity if you don't need it)
Not a silver bullet; you increase api versioning overhead between services for example.
> Not a silver bullet; you increase api versioning overhead between services for example.
That's actually a good thing. It ensures clients remain backwards compatible in case of a rollback. The only people who don't notice the need for API versioning are those who are oblivious to the outages they create.
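As a toy illustration of what that overhead buys you (hypothetical endpoints; path-based versioning is just one way to do it): keep both response shapes mounted, so rolling one side back never strands clients on the other.

```go
package main

import (
	"encoding/json"
	"net/http"
)

// v1 keeps the original shape; v2 adds a field. Both stay served so a
// rollback of either the client or the server never breaks anyone.
type userV1 struct {
	Name string `json:"name"`
}

type userV2 struct {
	Name  string `json:"name"`
	Email string `json:"email"`
}

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/v1/user", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(userV1{Name: "alice"})
	})
	mux.HandleFunc("/v2/user", func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(userV2{Name: "alice", Email: "alice@example.com"})
	})
	http.ListenAndServe(":8080", mux)
}
```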
True, but your API won't be changing that rapidly, especially in a backwards-incompatible way.
What's that got to do with microservices?
Edit: because you can avoid those things in a monolith.
As long as every team managing the different APIs/services doesn't have to be consulted for others to get access. You then get both the problems of distributed data and even more levels of complexity (more meetings than with a monolith).
It's a monkey's paw solution: now you have 15 kinda-slow pipelines instead of 3 slow deployment pipelines. And you get the fun new problem of deployment planning and synchronizing feature deployments.
You can do this with a monolith architecture, as others point out. It always comes down to governance. With monoliths you risk slowing yourself down in a huge mess of SOLID, DRY and other "clean code" nonsense which means nobody can change anything without it breaking something. Not because any of the OOP principles are wrong on face value, but because they are so extremely vague that nobody ever gets them right. It's always hilarious to watch Uncle Bob dismiss any criticism with a "they misunderstood the principles" because he's always completely right. Maybe the principles are just bad when so many people get them wrong?

Anyway, microservices don't protect you from poor governance; it just shows up as different problems. I would argue that it's both extremely easy and common to build a bunch of microservices where nobody knows what effect a change has on others.

It comes down to team management, and this is where our industry sucks the most in my experience. It'll be better once the newer generations of "Team Topologies" enter, but it'll be a struggle for decades to come, if it ever really ends. Often it's completely out of the hands of whatever digitalisation department you have, because the organisation views any "IT" as a cost center and never requests things in a way that can be incorporated into any sort of SWE best-practice process.
One of the reasons I like Go as a general-purpose language is that its simplicity by design often leads to code bases which are easy to change. I've seen an online bank and a couple of landlord systems (sorry, I can't find the English word for asset and tenant management in a single platform) explode in growth, largely because switching to Go made it possible for them to actually deliver what the business needs. Meanwhile their competition remains stuck with unruly Java or C# code bases where they may be capable of rolling out buggy additions every half year if their organisation is lucky.

Which has nothing to do with Go, Java or C# by the way; it has to do with old-fashioned OOP architecture and design being way too easy to fuck up. In one shop I worked at, they had over a thousand C# interfaces which were never consumed by more than one class… Every single one of their tens of thousands of interfaces was in the same folder and namespace… good luck finding the one you need. You could do that with Go, or any language, but chances are you won't if you're not rolling with one of those older OOP clean-code languages. Avoiding it in C# especially is harder, because abstraction by default is such an ingrained part of the culture around it.
Personally I have a secret affection for Python shops because they are always fast to deliver and terrible in the code. Love it!
Fast deployment causes incident war rooms.
Maybe the opposite: slow rollbacks cause escalating incidents.
Related:
Slow Deployment Causes Meetings - https://news.ycombinator.com/item?id=10622834 - Nov 2015 (26 comments)
I had a boss who actually acknowledged that he was deliberately holding up my development process - this was a man who refused to allow me a four day working week.
Sounds like a process problem. 2024 development cycles should be able to handle multiple lanes of development and deployment. That's also why things moved to microservices: you can deploy with minimal impact as long as you don't tightly couple your dependencies.
You don't need microservices to do this. It's actually easier deploying a monolith with internal dependencies than deploying microservices that depend on each other.
This is very accurate - microservices can be great as a forcing function to revisit your architectural boundaries, but if all you do is add a network hop and multiple components to update when you tweak a data model, all you'll get is headcount sprawl and deadlock to the moon.
I'm a huge fan of migrating to microservices as a secondary outcome of revisiting your component boundaries, but just moving to separate repos & artifacts so we can all deploy independently is a recipe for pain.