Building a legacy: The art of crafting maintainable systems

Software developers handle many different tasks, but every one of us has had to review old code. Whether it’s checking a previous version or seeing how someone solved a problem in the past, legacy code is part of the job. But have you ever reviewed old code and become frustrated enough to ask “who wrote this code?” and angrily run “git blame” to find out who is responsible? I’m sure your answer is yes! Maybe the person responsible was you five years ago, a colleague who left the company before you ever met them, or even a current teammate you look up to.

There’s no blame in “git blame.” Everyone writes legacy code for one very simple reason: time. With time, requirements change, systems evolve, and most importantly, humans learn. So, it’s inevitable that today’s best becomes tomorrow’s pain, although introducing a few practices to your team can make that pain tolerable.

In my years of software development, I’ve worked with a number of legacy systems. Sometimes, I was lucky enough to get a chance to redesign and refactor, but other times, I was not. Through all of that, I’ve learned that the most important aspect of a legacy system is its maintenance. I’ve lost track of how many times I’ve asked myself “how on earth should I add this new feature or find the bug when I can’t even understand what the code does?”

Keep the blessing of future git blames with maintenance

A maintainable service has two simple yet difficult-to-implement features: readability and predictability.

Readability means smoothly following the flow of code and understanding it with the right amount of context. Predictability means there are strong patterns in the code and architecture. Finding a pattern — even if it’s not the best practice — and following it is easier than dealing with various random occurrences, and it facilitates future maintenance.

As a member of the Threat Data Services team at Elastic, I have dug into aged services a handful of times. In my experience, here are the five best things you can do to create and maintain a responsible system: 

1. A bad convention is better than no convention

Code conventions affect readability more than you think. Code is usually the first or second place you go to figure something out. Whether there’s good documentation or not, it’s best to have identifiable patterns and conventions. Even if the existing ones are bad and you’re tempted to fix them, don’t!

Sticking to the existing conventions and practices makes the code more readable and makes onboarding new teammates faster and easier. Imagine you’re speaking a language with your team where “coffee” means “juice.” If someone suddenly starts using those words in their true meanings, everyone gets confused, documents become outdated, names and comments in the code stop making sense, and people have to ask every time, “Do you mean ‘coffee coffee’ or ‘coffee juice’?” The confusion gets much worse when it’s time to refactor or redesign the legacy code base.

If you know consistently that ‘coffee’ means ‘juice,’ then it’s much easier to simply follow that system. When you redesign the legacy system in the future, you don’t have to distinguish between the two of them — you’ll simply know! A bad convention is still a convention, and any convention is better than no convention.

2. Limit your adventures

The tech world moves faster than light. Every year, there’s a new trend; every week, we find a new tool; and every day, we discover a new library. It’s natural to get excited about these things — we all feel the urge to explore and use them in the next piece of our work because we aspire to deliver the best! But, how far should we incorporate our personal adventures into the system? Before getting carried away by some exciting feature, let's pause. Take a breath and think twice — is this really the best solution? Do we really need that? Can we achieve the same goal with the existing set of tools at hand?

Introducing new tools to an existing system can get tricky. It expands the scope of debugging and the possibility of failure. It adds complexity and brings more context switching to you and your team. I’m not saying you should never add new tools to a system, but let’s be cautious and weigh the pros and cons thoroughly beforehand, so the trade-off is clear to everyone on the team.

3. Test for serenity

Let’s be honest, writing tests is no delight. It’s a repetitive, frustrating, and boring task. But when the day comes and something breaks, you’ll be glad you did it. Different kinds of tests, such as unit, integration, and end-to-end, help us understand a system from a better perspective. There have been times where I didn’t understand a piece of code, and it wasn’t until I found the unit tests that it suddenly became crystal clear.

In addition to helping us understand the code, tests also lower the risk of breaking production. That only holds if the tests are up to date and run as a mandatory CI/CD step before changes are merged or deployed.

At first glance, it looks obvious: why would anyone not run tests? But life is full of surprises. A lack of experience, the rush to deliver, and the continuous pressure to add features and resolve bugs can all lead to ditching tests. And that’s alright, but it’s better to lose the saddle than the horse. It’s never too late to add tests, and as I always say, take baby steps.

This is how I approach adding tests to legacy code:

  1. Find out if there are any existing tests and ensure they pass on your machine.

  2. Make sure the test stage is part of your CI/CD.

  3. Sort the test types from the most beneficial to the least beneficial based on your needs. 

  4. Make tests part of the team’s review process.

  5. Write them as you go, such as adding a set of integration tests in each working cycle or adding unit tests whenever a file or module is touched. Aim for covering the important parts of code rather than covering everything.

  6. Improve the tests and increase the coverage as you maintain the system.

And remember, it’s not a one-time task. So, when you’re done, keep reminding yourself and your teammates to take care of tests properly.
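
One way to start writing tests as you go is a characterization test: a test that records what the code does today, so you notice if a later change alters it. Here is a minimal sketch using Python’s built-in unittest module. The function normalize_indicator is hypothetical and only stands in for a piece of legacy code you are about to touch.

```python
import unittest


def normalize_indicator(raw):
    """Hypothetical legacy function, standing in for code you did not write."""
    return raw.strip().lower().rstrip(".")


class NormalizeIndicatorCharacterization(unittest.TestCase):
    """Records the current behavior before anyone changes the function."""

    def test_trims_whitespace_and_lowercases(self):
        self.assertEqual(normalize_indicator("  Example.COM "), "example.com")

    def test_drops_trailing_dot(self):
        self.assertEqual(normalize_indicator("example.com."), "example.com")

    def test_empty_input_stays_empty(self):
        self.assertEqual(normalize_indicator(""), "")
```

Assuming the file is saved under a name such as test_normalize.py, running python -m unittest in that directory should pick it up. The same command can serve as the mandatory test step in CI/CD.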

4. Document for salvation

I’m not going to lecture you about the importance of documentation since you’ve most likely heard it a million times already. Instead, I’ll talk about what is usually missing: history.

During the initial development of a system, you only know so much, and the decisions you make might not be the best in the long run. You’ll make the best decision with your knowledge of business requirements, technical expertise, and the tools at hand. All three will change in time, and sometimes, they lead to weird patches. I believe what we miss in our docs is not the explanation of the current state but the story behind it.

The history-aware documentation will give the future team a justification of why things are the way they are. It provides context and passes the lessons learned to the next generation of developers. It could be a paragraph explanation, a link to an issue where the change is discussed, comments in the code, or anything else.

I can give you a fresh example from my team. A few years ago, a pull request got merged that mistakenly made the code gzip files that were already compressed with bzip2 before saving them. Thankfully, the library we were using for decompression didn’t care about layers and unwrapped as many as needed, so everything was fine for years. We weren’t aware of this until we needed to implement a new service to read those files. That’s when we learned that we had a secret layer of compression and inconsistency within our resources.

This was a huge bug, and fixing it would have been a giant hassle: imagine replacing terabytes of kilobyte-sized files. So, we kept it as-is and twisted the new code to handle the extra layer. A comment above that twisted block of code explains the reason behind it and helps everyone on the team learn and remember it. It also keeps new teammates and our future selves from making this mistake again. Don’t be ashamed: document your bad decisions and mistakes!
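
To make the idea concrete, here is a rough sketch of what such a history-aware comment could look like. It is not our actual code. The function name, the layer limit, and the approach of detecting layers by their leading magic bytes are all assumptions made for illustration.

```python
import bz2
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream
BZIP2_MAGIC = b"BZh"      # first three bytes of every bzip2 stream
MAX_LAYERS = 4            # assumed safety limit, not a value from a real system


def read_payload(data: bytes) -> bytes:
    # HISTORY: an old change gzipped files that were already bzipped
    # before saving them. Rewriting the stored files was not practical,
    # so every reader has to peel off layers until plain data remains.
    # (A link to the issue discussing this would go here.)
    layers = 0
    while True:
        if data.startswith(GZIP_MAGIC):
            decompress = gzip.decompress
        elif data.startswith(BZIP2_MAGIC):
            decompress = bz2.decompress
        else:
            return data  # no known compression header left
        if layers == MAX_LAYERS:
            raise ValueError("more compression layers than expected")
        data = decompress(data)
        layers += 1
```

A quick check of the sketch: read_payload(gzip.compress(bz2.compress(b"hello"))) returns b"hello", and so does read_payload(b"hello"). One caveat is that plain content that happens to begin with one of the magic byte sequences would be mistaken for compressed data, which is the kind of limitation worth writing down in the same comment.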

5. Keep your secrets in a safe

While it’s common knowledge to guard sensitive data, it’s also easy to mess it up. Since you don’t want to hand over your keys on a silver platter, strictly following a few rules saves you a great deal of time — and managing your secrets the right way is one of them.

Secrets could be issued either per person or per service. In both cases, make sure to use a well-known secret manager like Vault, create fine-grained access levels, and rotate credentials as frequently as needed.

Important note on service secrets: They should be bound to services, not people. People leave the company or change teams, and their permissions will be updated accordingly. If secrets are bound to people, this could lead to a break in your system. Issuing secrets for services ensures that this doesn’t happen and prevents accidentally granting unnecessary permissions.
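
As a small sketch of what “bound to services” can look like in code, the service below reads its own credential at startup instead of relying on anyone’s personal token. It assumes your secret manager or deployment platform injects the secret into the service’s environment. The function name and the variable name THREAT_FEED_API_TOKEN are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised when a secret the service needs is not configured."""


def load_service_secret(name, environ=None):
    """Return the secret called `name` from the service's environment.

    Assumes the secret was issued for the service itself and injected
    at deploy time, so it keeps working when people change teams.
    """
    env = os.environ if environ is None else environ
    value = env.get(name, "")
    if not value:
        # Report only the secret's name, never a value.
        raise MissingSecretError(f"secret {name!r} is not set for this service")
    return value
```

For example, token = load_service_secret("THREAT_FEED_API_TOKEN") fails fast with a clear error when the secret is missing, rather than failing later in the middle of a request.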

If this has piqued your interest and you want to learn more about role-based access control (RBAC), Elastic Cloud handles RBAC in a standard and straightforward way, and its documentation is a good resource.

Lead the change from within

Legacy systems can be rough, but they don’t have to be. Whether you just joined the team or have been there for a while, you want to be at peace. To find peace, find standards, and then keep them and follow them — and if they don’t exist, create them.

These tips will help strengthen your codebase and your community. We’re excited to strengthen our own community by remaining open about our development process. That’s why Elastic has gone open source once again!

The release and timing of any features or functionality described in this post remain at Elastic's sole discretion. Any features or functionality not currently available may not be delivered on time or at all.