First published on Tuesday, November 28, 2017
How to Refactor Your Legacy Code: A Decision Matrix
When you first consider refactoring a large legacy codebase towards a new software design, it is not uncommon to feel helpless after estimating the work as a huge, terrifying 2-5 year project.
To help solve the problem of not knowing where and how to begin, we have had great success using a decision matrix to decide how each part of the legacy code should be changed in such a refactoring project. Two main factors should influence your refactoring decisions:
The change rate of the code (module, class, function)
The business value of the code (module, class, function)
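One way to get a rough number for the change rate is to count how often each file shows up in the version control history. The sketch below assumes the project lives in Git and that you feed it the text output of `git log --name-only --pretty=format:`; the function name and the sample paths are made up for illustration. Business value cannot be measured this way and still needs a judgment call from the team.

```python
from collections import Counter


def count_file_changes(git_log_output: str) -> Counter:
    """Count how many commits touched each file.

    Expects the text produced by `git log --name-only --pretty=format:`,
    which lists the changed file paths of each commit, separated by blank lines.
    """
    counts = Counter()
    for line in git_log_output.splitlines():
        path = line.strip()
        if path:  # skip the blank separator lines between commits
            counts[path] += 1
    return counts


# Hypothetical sample output covering three commits.
sample_log = """
src/checkout/cart.py
src/admin/users.py

src/checkout/cart.py

src/checkout/cart.py
src/checkout/pricing.py
"""

changes = count_file_changes(sample_log)
for path, commits in changes.most_common():
    print(path, commits)
```

Files at the top of this list are candidates for the "high change rate" half of the matrix.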
We start by looking at the top right corner of the matrix, where you find high business value code with a high change rate. Realistically, this is the code that started the discussion of a large refactoring project in your team. While the risk in changing this code is high, so are the benefits:
Better software design makes it easier to onboard new developers
(Much) higher test coverage reduces the risk of breaking code or introducing bugs
High software quality reduces implementation time for future changes and new features
If your company considers software to be a primary function and profit driver, then it should be simple to sell your manager or boss on the benefits.
But the truth is that every software project contains a large bulk of essential "boilerplate" use-cases, such as administration backends, third-party system bridges and many others, that are not considered "high business value" and whose code does not change much. This code falls into the bottom left section of the matrix and should not be touched, at least until you already consider the refactoring project a success and have improved all the other code.
Often this part of the codebase makes up 50% or more of the total lines, and "removing" it from consideration in the refactoring project can simplify the process greatly and increase your chance of success.
The two quadrants where either change rate or business value is low are not as simple to decide on as the two extreme cases.
In the high business value, but low change rate case, you might be tempted to refactor all of this code towards a new software design, but I would always advise against this in order to keep the refactoring scope small.
Instead, you could introduce a thin facade or adapter layer that wraps the legacy code with a new and improved API. This way the new design of your high business value, high change rate code can call it without having to interact with the legacy code directly.
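As a minimal sketch of such a facade, assume a legacy pricing function with an awkward signature and an untyped result. Everything here (`legacy_calc_price`, `PricingFacade`, `Price`, the item IDs and the discount rule) is hypothetical and only stands in for whatever your legacy code looks like.

```python
from dataclasses import dataclass


# Hypothetical legacy function: positional flags, magic numbers, dict result.
# It stays untouched; only the facade below knows how to call it.
def legacy_calc_price(item_id, qty, cust_type, flag_a, flag_b):
    base = {"A1": 1000, "B2": 2500}[item_id] * qty  # prices in cents
    if cust_type == 2 and flag_a:
        base = base * 90 // 100  # 10% discount for customer type 2
    return {"amt": base, "cur": "EUR" if flag_b else "USD"}


@dataclass(frozen=True)
class Price:
    amount_cents: int
    currency: str


class PricingFacade:
    """Thin wrapper that exposes the legacy pricing code through a cleaner API."""

    def price_for(self, item_id: str, quantity: int, *, premium_customer: bool = False) -> Price:
        # Translate the new, readable arguments into the legacy calling convention.
        customer_type = 2 if premium_customer else 1
        raw = legacy_calc_price(item_id, quantity, customer_type, True, True)
        # Translate the untyped legacy result into a value object.
        return Price(amount_cents=raw["amt"], currency=raw["cur"])


price = PricingFacade().price_for("A1", 2, premium_customer=True)
print(price)
```

New code depends only on `PricingFacade` and `Price`, so the legacy function can later be replaced behind the facade without touching its callers.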
In the low business value, but high change rate case, refactoring is important to free yourself from the usually expensive and time-consuming process of changing legacy code. But because the code has no primary value for the business, building a new software design might be too big a time investment.
Instead you should work on improving the quality of individual units within this legacy code and start writing or improving the tests for them.
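The four quadrants described above can be summarised as a small lookup. This is only a sketch: the names are invented for illustration, and it assumes you have already judged each module as high or low on both axes.

```python
from enum import Enum


class Strategy(Enum):
    REFACTOR_TO_NEW_DESIGN = "refactor towards a new software design"
    WRAP_IN_FACADE = "wrap in a thin facade or adapter"
    IMPROVE_UNITS_AND_TESTS = "improve individual units and their tests"
    LEAVE_UNTOUCHED = "leave untouched for now"


def decide(high_business_value: bool, high_change_rate: bool) -> Strategy:
    """Map a piece of code onto one quadrant of the decision matrix."""
    if high_business_value and high_change_rate:
        return Strategy.REFACTOR_TO_NEW_DESIGN
    if high_business_value:
        return Strategy.WRAP_IN_FACADE
    if high_change_rate:
        return Strategy.IMPROVE_UNITS_AND_TESTS
    return Strategy.LEAVE_UNTOUCHED


# Hypothetical modules with their (business value, change rate) assessment.
modules = {
    "checkout": (True, True),
    "invoice_pdf": (True, False),
    "report_export": (False, True),
    "admin_backend": (False, False),
}

for name, (value, change) in modules.items():
    print(f"{name}: {decide(value, change).value}")
```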
First Steps in Refactoring Effort
So how do we start working on a refactoring project with this matrix?
The easiest way is to extract code that is changing a lot into new functions or small objects and then integrate them back into the legacy code. We have written other blog posts on how to do these simple refactorings with concrete examples.
Based on these and other refactoring strategies, you start a sort of spearhead refactoring effort into your high change rate code and gradually improve its quality.
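To make the extraction step more concrete, here is a small sketch. It assumes a long legacy function in which the shipping rule changes often; the function names, the order structure and the rule itself are invented for illustration.

```python
def shipping_cost_cents(total_cents: int, country: str) -> int:
    """Extracted unit: the frequently changing rule, now testable in isolation."""
    if total_cents >= 5000:
        return 0  # free shipping above a threshold
    return 499 if country == "DE" else 999


def legacy_process_order(order: dict) -> dict:
    # The rest of the (long) legacy body stays as it is.
    total = sum(line["price_cents"] * line["qty"] for line in order["lines"])

    # Previously an inline if/else block; now the legacy code
    # simply calls the extracted function.
    shipping = shipping_cost_cents(total, order["country"])

    return {"total_cents": total + shipping, "shipping_cents": shipping}


# The extracted function can be tested without building a full order.
assert shipping_cost_cents(6000, "DE") == 0
assert shipping_cost_cents(1000, "FR") == 999

order = {"country": "DE", "lines": [{"price_cents": 1000, "qty": 2}]}
print(legacy_process_order(order))
```

The legacy function keeps its behaviour, while the part that changes most often gets its own tests and a clear name.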
After a while, you will have a significant amount of refactored code, which might give you an idea of a new software design that you want to move that code to. With that idea in hand, you can then go one step further when working on high business value code and refactor towards the new software design, which can lead to even better code with the three benefits I listed before.