When Should You Do A Rewrite?
Everyone wants to rewrite their codebase. The current one has too much cruft. It’s legacy code. Ugh.
At some point in every programmer’s experience, they interact with a codebase that is so frustrating that they dream of scrapping the whole thing and starting over. When is this a good idea? When is this a macabre fantasy that’s best left unrealized?
Code grows like an organism. Some evolutions stretch previously simple areas into monstrosities built to handle far more than originally intended. Some sections show unique characteristics of a trend or an experiment that didn’t carry over to the rest of the codebase, but they remain because they solved some problem and “if it ain’t broke, don’t fix it.” Some programmers leave indelible marks on a codebase, and then move on, leaving a trail of code practices in their wake. These things are usually the “cruft” that drives the current maintainer into a rewriting frenzy.
The code is not the problem, though.
Programs are more than code. A program exists in the shared mindspace of all who interact with it. Code is an artifact, much like the GUI, the documentation, and the API. The people who interact with a program all have expectations of it, and how it should work. This is how we know that some behavior of the program is “buggy.” If the code were the program, then there would be no bugs because the program would behave exactly as it was written to behave. Instead, programs occasionally act counter to our expectations and we call this a bug.
The consumers of the program have certain expectations of how they should be able to interact with it. These expectations might come from reading the documentation or the API specs or by interacting with it. The business people who directed the program to be written have different expectations of it. They had an idea of what they wanted built, and they have had to reconcile this mental model with what it was possible to build (or, at least, what their team told them was possible given the circumstances). The designers have their own expectations of how consumers should be able to interact with the program. Lastly, programmers, as the writers and maintainers of the code, have expectations of how the various parts of the program interact to produce the desired behavior.
The original designers, programmers, and product managers have an advantage in understanding the program: they got to build up their mental models of the program as it evolved from an idea to a reality. When unexpected behavior occurs, these lucky people can recognize the evolutionary path that probably produced it. The product manager says, “oh yeah, we were going to have that user flow, but we abandoned it.” The designer says, “that dialog box is left over from when we used to make them all slide out like that.” The programmer says, “I didn’t think that function was being called from anywhere anymore.”
Similarly, longtime users of the program have an understanding of it based not on the current GUI, API, or documentation, but on their history with it. They say, “there used to be a way to upload multiple files at once.” Or they say, “calling that method should not have mutated the arguments.” These frustrated expectations are, in fact, bugs.
New maintainers and users don’t have this luxury. They must build up their mental model of the program as it exists today. There may be quirks and contradictions in the program that must be discovered and memorized, often without context. Their historical context is not this program, but other programs they have used or maintained that were similar. Based on this history, dissimilarities are seen by these people as bugs, or at the very least, frustrating shortcomings of the program. They say, “this should be easier,” or, “this should be possible.”
What does this have to do with rewriting code? The goal of a rewrite is to bring the written artifacts of the program (usually code, but also documentation) into closer conformity with the expectations of those who interact with it. Often the “cost” of a rewrite is measured in the time it will take a programmer to rewrite the code, but the real cost is how many people’s mental models of the program will be invalidated when the rewrite is done. Will old APIs still work? Will the users be able to translate their workflow from the old GUI to the new one? Will other maintainers know how to contribute afterwards? Will the documentation and user-flow charts accurately reflect the new program? Writing code is the easy part. Building up mental models is much harder. Multiply that difficulty by the number of people whose mental models must change and that is the true cost of a rewrite.
Keeping this cost low should drive the decision of whether to rewrite. Here are some ways to avoid invalidating existing mental models of the program, or at least to invalidate fewer of them:
- Keep the same file structure. Current maintainers know where to look for things. They know which file or function or object interacts with the others. Keeping this the same and rewriting only the inner code maintains those connections.
- Keep the same API. The consumers know how to invoke the program’s methods already. They don’t need to know that anything changed “under the hood.”
- Stay backwards-compatible. If you need (or want) to change the API, consider expanding it. Current users can gradually move to the “new” way of interacting with the program while staying productive with the “old” way that they already know.
- Bring the documentation and code into alignment. Sometimes a program quirk (“bug”) becomes widely known, expected, and possibly accepted. Rather than change the code to eliminate the quirk, consider documenting it for future consumers. Conversely, some program misbehaviors are so counter to user expectations and existing documentation that the responsible thing to do is change the code to match the documentation.
- Rewrite the code collaboratively with existing maintainers. As noted before, building the program is the easiest way to build mental models of the program, so the more current maintainers that can be involved in a rewrite, the more mental models will be updated contemporaneously.
- Communicate changes early and often. Explain what the changes are and why they are happening. Provide guidance on bridging the divide between the old and new program (“if you used to do X, do Y now”). Be up front about the costs and benefits of the new program. Prepare people for future changes. Invite feedback on proposed future changes.
- Make a different program. Sometimes the rewrite needs to be so total that it makes sense to leave the old program alone and start anew. This, unfortunately, means that there will probably not be an easy way to bridge from the old program to the new one. Eventually the old one may be abandoned by its maintainers and people interacting with it will be left unsupported.
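
To make the “keep the same API” and “stay backwards-compatible” points concrete, here is a minimal sketch in Python. The function names `upload_file` and `upload_files` are hypothetical, invented for illustration only. The idea is that the rewritten code gains a new, expanded entry point, while the old entry point keeps working by delegating to it and gently warning callers that a newer way exists.

```python
import warnings


def upload_files(paths, *, overwrite=False):
    """New, expanded entry point (hypothetical): accepts many paths at once."""
    # A real implementation would transfer the files. This sketch only
    # reports what it would have done, one record per path.
    return [{"path": p, "overwrite": overwrite} for p in paths]


def upload_file(path):
    """Old entry point (hypothetical), kept so existing callers still work."""
    warnings.warn(
        "upload_file() is deprecated; use upload_files([path]) instead",
        DeprecationWarning,
        stacklevel=2,  # point the warning at the caller, not at this wrapper
    )
    # Delegate to the rewritten code and return the same shape as before.
    return upload_files([path])[0]
```

People who only know `upload_file` can keep using it unchanged, so their mental model stays valid. The warning tells them “if you used to do X, do Y now,” and they can move over at their own pace.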
So when should you do a rewrite? When the people-cost is low.