When to Rewrite vs Refactor Legacy Code

Every engineering team has stared at a legacy codebase and thought “we should just rewrite this.” Most of the time, that instinct is wrong. Sometimes it isn't.
There is a recurring conversation in software engineering that goes like this: someone looks at an old codebase, declares it unmaintainable, and proposes a ground-up rewrite. The team gets excited. A new tech stack is chosen. Work begins. Months later, the rewrite is behind schedule, the old system is still running in production, and the team is maintaining two codebases instead of one.

This pattern is so common that it has become a cautionary tale. Joel Spolsky wrote about it decades ago, calling it the single worst strategic mistake a software company can make. And yet teams keep doing it, because the frustration with legacy code is real and the promise of a clean slate is seductive.

The truth is more nuanced. Most rewrites fail, but not all of them. The challenge is knowing which situation you're in. Here is a framework for making that decision.


Why most rewrites fail

Rewrites fail for a predictable set of reasons. The first is underestimation. The old codebase looks messy, but it contains years of accumulated knowledge – edge cases handled, bugs fixed, business rules encoded. A rewrite starts clean but has to rediscover all of that knowledge, and it usually takes longer than anyone expects.

The second reason is the moving target problem. While the new system is being built, the old system is still being used. Features are still being added to it. Bug fixes are still going into it. By the time the rewrite reaches feature parity, the target has moved. This creates a perpetual gap that never quite closes.

The third is resource competition. A rewrite doesn't eliminate the need to maintain the existing system. It adds a second system to maintain. The team's bandwidth doesn't double, so both systems get less attention than they need.

These failures aren't about talent or effort. They're structural. The economics of rewrites are fundamentally stacked against success unless specific conditions are met.


The refactor-first default

For most teams in most situations, incremental refactoring is the right approach. It lets you improve the codebase while keeping it running. It reduces risk because each change is small and reversible. It maintains feature velocity because the team isn't split between two systems.

Effective refactoring follows a pattern. Identify the highest-friction areas of the codebase – the modules where bugs cluster, where changes take disproportionately long, where new team members get lost. Focus refactoring effort there. Leave the parts that work alone, even if they're not pretty. Ugly code that works and rarely changes is not a priority.
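One way to find those high-friction areas is to measure change frequency: files that are touched in commit after commit are usually where bugs cluster and where refactoring pays off fastest. A minimal sketch, assuming you feed it file paths from your version-control history (the example history below is made up):

```python
from collections import Counter

def churn_hotspots(changed_files, top_n=5):
    """Rank files by how often they appear in commit history.

    `changed_files` is a flat list of file paths, one entry per time
    a file was touched by a commit (e.g. the non-empty lines of
    `git log --name-only --pretty=format:`). Frequently changed files
    are the strongest candidates for focused refactoring effort.
    """
    counts = Counter(f for f in changed_files if f.strip())
    return counts.most_common(top_n)

# Simulated history: billing.py changes constantly, reports rarely
history = [
    "billing.py", "billing.py", "ui/menu.py",
    "billing.py", "legacy/report.py", "ui/menu.py",
]
print(churn_hotspots(history, top_n=2))
# [('billing.py', 3), ('ui/menu.py', 2)]
```

Churn alone isn't proof of a problem – a config file may change often for benign reasons – but cross-referenced with bug reports, it points you at the modules that are actively costing you time.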

The key insight is that you don't need to fix everything. You need to fix the parts that are actively slowing you down. A codebase doesn't need to be beautiful. It needs to be workable.


When a rewrite actually makes sense

There are genuine situations where a rewrite is the right call. Recognising them requires honesty about the specific constraints you're facing.

The platform is dead. If the codebase is built on a framework or runtime that is no longer maintained, and critical security patches or compatibility updates are no longer available, refactoring within that platform has a shelf life. You're improving code that will need to be abandoned regardless.

The architecture fundamentally cannot support the next phase. A monolithic system that needs to become a distributed system. A single-tenant application that needs to become multi-tenant. A synchronous pipeline that needs to become event-driven. These aren't refactoring tasks – they're architectural transformations that may genuinely require rebuilding core components.

The codebase is small enough to rewrite quickly. The rewrite risk scales with size. Rewriting a 5,000-line service is a fundamentally different proposition from rewriting a 500,000-line platform. If the rewrite can be completed in weeks rather than months, the moving-target problem is manageable.

You can run both systems in parallel. The strangler fig pattern – building the new system alongside the old one and gradually redirecting traffic – eliminates the big-bang cutover risk. If your architecture supports this approach, a phased rewrite becomes much less dangerous.
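The heart of the strangler fig pattern is a routing layer that decides, per request, whether the old or the new system handles it. A minimal sketch of that decision logic – the path prefixes and backend names here are hypothetical, and in practice this would live in a reverse proxy or API gateway:

```python
# Paths that have already been migrated to the new service.
# As migration progresses, prefixes move into this list until
# the legacy system receives no traffic at all.
MIGRATED_PREFIXES = ["/api/v2/invoices", "/api/v2/users"]

def route(path: str) -> str:
    """Decide which backend should serve a request path."""
    if any(path.startswith(p) for p in MIGRATED_PREFIXES):
        return "new-service"
    return "legacy-service"

print(route("/api/v2/invoices/42"))  # new-service
print(route("/reports/monthly"))     # legacy-service
```

Because the cutover happens one route at a time, each step is small, observable, and reversible – if the new service misbehaves, you remove its prefix from the list and traffic flows back to the legacy system.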


How a full codebase review informs the decision

The rewrite-vs-refactor decision is only as good as the information it's based on. And this is where most teams go wrong. They make the decision based on how the code feels rather than what the code actually contains.

A full codebase review gives you the data you need. It tells you which parts of the codebase are genuinely problematic and which are merely unfamiliar. It quantifies the scope of inconsistency, dead code, and architectural drift. It distinguishes between code that is messy-but-functional and code that is structurally unsound.

With that information, the decision becomes concrete. Instead of arguing about whether the codebase is “too far gone,” you can look at the findings and ask specific questions. How many modules have critical issues? What percentage of the code is dead weight? Are the architectural problems localised or systemic?
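Those questions can be answered mechanically once review findings are in a structured form. A sketch, assuming findings are recorded as a map from module name to issue severities (the module names and threshold below are invented for illustration):

```python
def triage(findings):
    """Summarise review findings to guide the rewrite-vs-refactor call.

    `findings` maps module name -> list of issue severities.
    A low share of critical modules suggests localised problems
    (refactor); a high share suggests systemic ones (consider a
    phased rewrite).
    """
    critical = [m for m, issues in findings.items() if "critical" in issues]
    total = len(findings)
    return {
        "critical_modules": len(critical),
        "critical_share": len(critical) / total if total else 0.0,
    }

findings = {
    "billing": ["critical", "minor"],
    "ui": ["minor"],
    "reports": ["critical"],
    "auth": [],
}
print(triage(findings))
# {'critical_modules': 2, 'critical_share': 0.5}
```

The exact threshold for "systemic" is a judgment call, but arguing over a number like 50% of modules is far more productive than arguing over a feeling.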

If the problems are concentrated in specific areas, refactoring those areas is almost certainly the right move. If the problems are pervasive and architectural, a phased rewrite might be justified. Either way, you're making the decision with evidence rather than frustration.

The worst outcome is a rewrite driven by emotion. The second worst is a refactoring effort that targets the wrong parts of the codebase. Both are avoidable if you start by understanding what you actually have.


Limits and tradeoffs

  • An automated review can miss context. Treat findings as prompts for investigation, not verdicts.
  • False positives happen. Plan a quick triage pass before you schedule work.
  • Privacy depends on your model setup. If you use a cloud model, relevant code is sent to that provider; local models can keep inference on your own hardware.