2020-01-30

Making codebases age well

Why are some systems easy to update after a decade, while others are hard to understand after a year or two? And why do some companies drown in bugs and need a month to complete a change that takes days elsewhere?

Some of the root causes may be organizational, but technical choices also make a big difference.

Having seen plenty of projects on either end of the spectrum, I’ve found these to be the biggest differences between codebases that age well and the ones that don’t.

1. Have less stuff to maintain

The most maintainable systems I’ve seen had architects who pushed for:

Fewer features by being choosy about which features to implement
Fewer technologies rather than more - e.g. avoiding polyglot microservices
Sticking to the same patterns over time, even if they’re imperfect (but good enough)
Fewer services, technologies, third-party components, etc.
Fewer lines of code, relying more on third-party libraries and services
Cleaning up old unused parts
Fewer moving parts overall

Whatever is left would ideally be organized into groups of very similar-looking components so that whoever works on the system in five years would be able to understand it easily.

The reverse of this, a sprawl of technologies, features, and sub-systems, is one of the scariest ways in which a system can become unmaintainable, because it is difficult to reverse. This can happen when generations of programmers come through an organization, decide that the old way of building things is bad, introduce new technologies/patterns/systems without replacing the old ones, and then are replaced by the next generation. It can also happen when multiple teams work in parallel and independently pick their favorite tech stacks. To a lesser degree, it can happen when less experienced teams keep adding new tech to the mix without thinking through the complexity that results.

Either way, the end result is a system that’s hard to fully comprehend and easy to break, with team members who probably can’t or don’t want to touch the parts of it they are unfamiliar with.

2. Choose mature, popular technologies

If you want your codebase to be easy to update in five years, don’t use a technology that will be dead in five years. Or one that will change so much that upgrading becomes impractical. Anything built on Angular 1, Silverlight, or early milestone Spring Boot versions would be very uncomfortable to work with today.

For this reason, I prefer to pick popular technologies that are already on v2 or v3 and have largely stabilized.

As of early 2020, this means:

React: Good. Popular, widely used, about six years old. Probably the best pick on the front end.
F#: Bad. Mature, but not widely used. Avoid unless you already have F# experts, or it’s a solo side project.
Svelte: Bad. Popular, high satisfaction, but new; might not survive or might change a lot in the next few years.
C: Good. Will likely outlive all of us. Safe choice for its use cases.

There are a number of other, immediate advantages to picking popular, older technologies over newer or less popular ones:

More and better tools + libraries
Lots of Stack Overflow answers to frequent problems
Easy to find developers who can work with it

There are several variations on this theme; one of my coworkers calls himself “Service Pack 2 guy” (Service Pack 1 is too early for him). “Choose Boring Technology” is another popular essay that expands on this.

3. Tests, tests, tests

At my new job, about a month into it, I needed to update a service written in Python that was rarely touched by others. It had been written over the previous few years, and the original authors had left the company some time before. After spending some time browsing the source code, I discovered that not only did it have great comments, but it also had 98% unit test coverage! In addition, it was part of a massive integration test suite. A couple of weeks later, when I was done, I was extremely confident that I hadn’t broken any existing functionality, and that my new additions would probably work as expected.

While automated tests can slow you down in the short run, on systems that have a long life they’re totally worth the effort.

I find that the tests that give the most return for the effort are:

Unit tests, because they’re easy to write and run, and
End-to-end tests on the whole system, because they can uncover most integration issues and prove that the entire system works as expected.

In some cases, the latter type can also be run in production after deployments to catch problems early.
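To make the first kind concrete, here is a minimal sketch of a unit test using Python’s standard-library unittest module. The apply_discount function and its rounding rule are hypothetical, invented only so there is something to test; they are not taken from the service described above.

```python
import unittest


def apply_discount(price_cents: int, percent: int) -> int:
    """Return the price after a percentage discount, rounded down to whole cents.

    Hypothetical function, used only to have something to test.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return price_cents * (100 - percent) // 100


class ApplyDiscountTest(unittest.TestCase):
    def test_typical_discount(self):
        self.assertEqual(apply_discount(1000, 25), 750)

    def test_rounds_down_to_whole_cents(self):
        # 999 * 90 / 100 = 899.1, which should round down to 899.
        self.assertEqual(apply_discount(999, 10), 899)

    def test_rejects_out_of_range_percent(self):
        with self.assertRaises(ValueError):
            apply_discount(1000, 101)
```

Saved in a file, tests like these can be run with python -m unittest. Each one pins down a single behavior, so a failure years later points fairly directly at what changed.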

4. Help the next developer understand the system

Whether it’s a new developer joining a 100-person dev org, or a single dev looking at the source code that has not been touched in the past five years, they’ll have a hard time modifying a system when they don’t know what it does or why.

So help them out in advance. This means:

Don’t surprise them with unexpected side-effects
Leave comments that explain the “why”
Use clear names
Write high-level documentation
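As a rough illustration of the first three points, here is a small Python sketch. Everything in it (the Invoice type, the 30-day retention rule, and the function name) is a hypothetical example rather than code from a real system.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Why 30 days: a hypothetical business rule, e.g. "support asked us to keep
# a month of history". Recording the reason tells the next developer
# whether the number is safe to change.
RETENTION_DAYS = 30


@dataclass(frozen=True)
class Invoice:
    number: str
    issued_on: date


def select_expired_invoices(invoices: list[Invoice], today: date) -> list[Invoice]:
    """Return the invoices older than the retention window.

    Deliberately free of side effects: nothing is deleted or modified here,
    so calling it from a report or a test is always safe. Deleting is left
    to the caller, where it is visible.
    """
    cutoff = today - timedelta(days=RETENTION_DAYS)
    return [invoice for invoice in invoices if invoice.issued_on < cutoff]


invoices = [
    Invoice("A-1", date(2020, 1, 1)),
    Invoice("A-2", date(2020, 3, 1)),
]
expired = select_expired_invoices(invoices, today=date(2020, 3, 15))
print([invoice.number for invoice in expired])  # ['A-1']
```

The name says what the function returns, the docstring says what it will not do, and the comment on the constant records a reason that the code alone could never convey.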

Who knows, the next developer could even be you!