Would it surprise you to learn that software developers test their software? It may not seem like it with all the outages this year, but the average developer spends 42% of their workweek on maintenance. So what's causing all the outages?
CrowdStrike may have made the loudest noise, but 2024 has seen service outages that have halted fast food deliveries, put WhatsApp and Instagram out of action, left passengers stranded at Heathrow Airport and stopped fresh food sales at the Brexit border.
These disruptions didn’t happen because developers didn’t test the software, but because of hypercomplex software configurations: too many changes to products by too many people for too many reasons. Many companies are testing code inside a software architecture that no one understands anymore. It’s like a frighteningly large Jenga tower. Just one more “feature block” could bring it all down. No one wants to touch it. No one even remembers how it was built. But sooner or later, an executive decrees “this product needs AI.”
That's the eternal battle of software engineering: innovation versus maintenance, and it's eroding the very fabric of the world's software.
Director of Product Management and Product Marketing at Qt Group.
Why is software eroding?
Just like rocks and mountains, our software is slowly eroding over time. If software is a rock, then developers are the wind and water that wear it down, thanks to the dependency hell we've gotten ourselves into.
So much new code is being added to codebases, each piece affecting or affected by other moving parts of the software, that product complexity has skyrocketed. Some of this is due to pressure from senior management to outperform rivals, but sometimes it's because developers are simply trying to shorten workflows.
A developer is asked to add a new feature. The feature bloats the codebase. The developer adds a shortcut to work faster, which adds complexity. A manager asks the developer to extend the product. The old shortcut? It's incompatible with the update. Things break. The developer starts patching. It takes a long time. The developer adds another shortcut, and… it keeps going around in circles.
Software erosion eventually becomes a self-fulfilling prophecy: a destabilizing chain reaction, where the smallest quality-of-life patch is both a pain to implement and a risk to the functionality of teams in other silos.
The human cost of software erosion
If more than 40% of development time is spent just keeping code alive, that time adds no value to the product. Factor in meetings, time for feedback and so on, and you're likely to end up with less than half of the week left for value-adding development.
This is madness for a development manager, who suddenly can only use half of their developers' time for innovation. It's even more miserable for the developers: what kind of satisfaction and pride could they derive from constantly fixing code that keeps breaking? The service outage disaster and the ensuing PR blunder are just one more blow to morale.
The next thing that happens is churn, as engineers who no longer want to do this work leave. Bringing in new engineers lengthens time to market, frustrating everyone involved. Old mistakes reappear and more people leave, except now the ones leaving are seasoned veterans too. Where does it all end?
Fixing 'shift left' itself
There has been a lot of talk about “shift left” for years, and yet some companies still have not understood why this philosophy gets so much attention. The cyclical misery faced by many developers will not end until we solve the “shift left” problem.
To solve the “shift left” problem, it's not enough to just squeeze more testing time into developers' busy schedules. Some companies hire outside QA testers to lighten the load, but it's an expensive solution. It's like putting duct tape over the leaks in the deck of a ship that's being battered by battleships.
A late fix is an expensive fix. The least expensive fix is to get the architecture right while you're still designing it. Build quality assurance into your software development before you write all the code, not after. If you're in the early stages of prototyping, then sure, QA can be unnecessary overhead. But the moment you build a viable business that attracts customers, you need quality code. You don't want to lose customers early in your product's lifespan. If your product ships next week and you have to redesign your architecture to avoid a product recall, that's a catastrophe of epic proportions. It will lengthen your time to market, increase the cost of development, and burn everyone out.
How do you get quality code? You need multiple sources of information. Don't skimp on static code analysis and functional testing, which should be run as you write new code. You need to know how much code you're cloning, where your hidden dependencies are, how your components talk to each other, and so on. When you know these things and run architecture verification, it's easier to identify problems. If you can't do every possible type of testing right away, that's fine, but start building out these processes over time.
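As a rough illustration of what finding hidden dependencies and running architecture verification can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the three modules (ui, services, storage), their source snippets, the ALLOWED rule table and the helper names are hypothetical, and commercial architecture verification tools do far more than this. It only shows the idea of extracting each module's imports, comparing them with the dependencies the architecture permits, and looking for a cycle.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of Python source."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                found.add(alias.name.split(".")[0])
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            found.add(node.module.split(".")[0])
    return found


def layering_violations(deps, allowed):
    """List (module, dependency) pairs between known modules that the rules forbid."""
    return sorted(
        (mod, dep)
        for mod, targets in deps.items()
        for dep in targets
        if dep in deps and dep not in allowed.get(mod, set())
    )


def find_cycle(deps):
    """Return one dependency cycle as a list of module names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {mod: WHITE for mod in deps}
    path = []

    def visit(mod):
        state[mod] = GREY
        path.append(mod)
        for dep in sorted(deps.get(mod, ())):
            if dep not in state:
                continue  # third-party or standard library module, not ours
            if state[dep] == GREY:
                return path[path.index(dep):] + [dep]
            if state[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[mod] = BLACK
        return None

    for mod in sorted(deps):
        if state[mod] == WHITE:
            cycle = visit(mod)
            if cycle:
                return cycle
    return None


# Hypothetical product with three modules; the sources are inlined as strings
# so the sketch runs without any files on disk.
SOURCES = {
    "ui": "import services\nimport storage\n",
    "services": "import storage\nimport json\n",
    "storage": "from services import helpers\n",
}
# Assumed architecture rule: ui may use services, services may use storage.
ALLOWED = {"ui": {"services"}, "services": {"storage"}, "storage": set()}

deps = {name: imported_modules(src) for name, src in SOURCES.items()}
violations = layering_violations(deps, ALLOWED)
cycle = find_cycle(deps)

print("Violations:", violations)
print("Cycle:", cycle)
```

In this made-up product, the check flags that ui reaches directly into storage and that storage depends back on services, which also closes a cycle. Run on every commit, a check of this kind surfaces such shortcuts when they are introduced rather than years later.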
And more importantly, does your architecture enable you to achieve your goals? If not, it’s time to redesign it. A company with decades of legacy code may not be able to do it, but an SMB with five years of code? Litigation from unhappy customers dwarfs any headaches of redesign.
Also understand that different roles have different incentives. Sometimes developers resist static code analysis because it's “extra work” and adds time to projects. Whose responsibility should it be? Figure that out early.
Now that service outages and crippling errors are becoming common, it's never been more important to understand how the Jenga tower was built. Very few people know its architecture inside out. With a little discipline, that can change. It has to change, because otherwise that tower will soon come tumbling down.
This article was produced as part of TechRadarPro's Expert Insights channel, where we showcase the best and brightest minds in the tech industry today. The views expressed here are those of the author, and not necessarily those of TechRadarPro or Future plc. If you're interested in contributing, find out more here: