Technical Debt Is Not a Metaphor

Everyone treats technical debt as a figure of speech. It is not. It has real costs you can measure in hours, incidents, and money. Here is how.

[Image: calculator, cost worksheet, and technical debt backlog cards laid out next to a blurred code screen]

Ward Cunningham coined the term “technical debt” as an analogy. He wanted to explain to business stakeholders that shipping imperfect code was like taking on a loan – you could move faster now, but you would pay interest later. The analogy worked. It worked so well that it became the default way engineers talk about code quality problems.

The trouble is that people started treating the analogy as if it were merely illustrative. A colourful way to describe something abstract. A metaphor that makes an intangible concept easier to grasp. And once technical debt became “just a metaphor,” it became easy to deprioritise. Metaphors do not have deadlines. Metaphors do not show up on balance sheets. Metaphors do not cause production incidents at 2am.

But technical debt does all of those things. It is not a metaphor. It is a literal cost centre, and it can be measured.


The cost of workarounds

Every piece of technical debt generates workarounds. A poorly structured database query that takes four seconds instead of forty milliseconds means someone has added a cache layer. That cache layer needs invalidation logic. The invalidation logic has edge cases. Those edge cases generate bugs. Those bugs require investigation and patches.

None of this work would exist if the query had been fixed. But nobody fixed it, because fixing it was “technical debt” and there were features to ship.

You can measure this. Look at your ticketing system. Identify tickets that exist because of workarounds for known issues. Count the hours spent on those tickets over the last quarter. Multiply by your average fully loaded developer cost. That number is not a metaphor. It is what you actually spent.
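
As a minimal sketch, the arithmetic looks like this, assuming you can export the relevant tickets with hours logged against them (the ticket data and hourly rate below are illustrative placeholders, not real figures):

    # Rough cost of workarounds over a quarter.
    # Replace the ticket list and rate with an export from your own systems.
    workaround_tickets = [
        {"id": "OPS-1412", "hours": 6.0},   # cache invalidation bug
        {"id": "OPS-1523", "hours": 3.5},   # stale data after deploy
        {"id": "OPS-1601", "hours": 9.0},   # race condition in the cache layer
    ]

    FULLY_LOADED_HOURLY_RATE = 95  # salary plus overheads, per hour (assumed)

    total_hours = sum(t["hours"] for t in workaround_tickets)
    quarterly_cost = total_hours * FULLY_LOADED_HOURLY_RATE

    print(f"Workaround hours this quarter: {total_hours:.1f}")
    print(f"Workaround cost this quarter:  {quarterly_cost:,.0f}")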

In most codebases, the number is surprisingly large. A single poorly abstracted module can generate dozens of downstream workarounds, each one consuming hours of engineering time that would otherwise go towards product development.


Incident frequency in debt-heavy modules

Production incidents are not evenly distributed across a codebase. They cluster. And they cluster in the places where technical debt is highest – modules with tangled dependencies, inconsistent error handling, missing validation, and unclear state management.

You can verify this with your own incident data. Map your production incidents from the last twelve months to the files and modules they originated in. You will almost certainly find that a small number of modules account for a disproportionate share of incidents. Those modules are your debt hotspots.
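
A sketch of that mapping, assuming each incident record has been traced back to the module it originated in (the records below are made up for illustration):

    # Count production incidents per module to surface debt hotspots.
    from collections import Counter

    incidents = [
        {"id": "INC-204", "module": "billing/invoices"},
        {"id": "INC-211", "module": "billing/invoices"},
        {"id": "INC-215", "module": "auth/session"},
        {"id": "INC-220", "module": "billing/invoices"},
        {"id": "INC-223", "module": "reporting/export"},
    ]

    counts = Counter(i["module"] for i in incidents)
    total = len(incidents)

    for module, n in counts.most_common():
        print(f"{module:<20} {n:>2} incidents ({n / total:.0%} of the total)")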

Each incident has a cost. The on-call engineer who triaged it. The developer who diagnosed and patched it. The product manager who communicated the impact to customers. The trust eroded with each outage. These costs are real and quantifiable. When you trace them back to the modules that caused them, you are no longer talking about a metaphor. You are talking about which parts of your codebase are actively losing you money.

Teams that track incident-to-module mapping often discover that fixing the top three debt-heavy modules would eliminate 40 to 60 percent of their production incidents. That is not a vague improvement. That is a concrete reduction in operational cost.


The onboarding time delta

When a new developer joins your team, how long does it take them to make their first meaningful contribution? In a clean, well-structured codebase, this might be days. In a debt-heavy codebase, it is weeks or months.

The difference between those two numbers is the onboarding tax imposed by technical debt. Every undocumented convention, every implicit assumption, every module whose behaviour can only be understood by reading three other modules first – these all extend the time it takes for a new hire to become productive.

Calculate it. If your average onboarding time is eight weeks and an equivalent role at a company with a cleaner codebase takes three weeks, those five extra weeks are a direct cost. For a developer earning a reasonable salary, five weeks of reduced productivity represents a significant sum per hire. Multiply by the number of developers you hire per year. That is the annual onboarding tax your technical debt imposes.
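
A back-of-the-envelope version of that calculation, where every figure is an assumption you would replace with your own numbers:

    # Annual onboarding tax from technical debt (all inputs assumed).
    onboarding_weeks_actual = 8      # your measured average
    onboarding_weeks_baseline = 3    # comparable role, cleaner codebase
    weekly_loaded_cost = 3_000       # fully loaded developer cost per week
    hires_per_year = 6

    extra_weeks = onboarding_weeks_actual - onboarding_weeks_baseline
    tax_per_hire = extra_weeks * weekly_loaded_cost
    annual_tax = tax_per_hire * hires_per_year

    print(f"Onboarding tax per hire: {tax_per_hire:,}")
    print(f"Annual onboarding tax:   {annual_tax:,}")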

This cost is invisible in most organisations because nobody measures onboarding against a baseline. But it is real, it is recurring, and it scales with every new team member.


Retention risk

Developers leave jobs for many reasons. Compensation, management, growth opportunities. But one reason that appears consistently in exit interviews and anonymous surveys is frustration with the codebase. Working in a system that fights you every day – where simple changes require complex workarounds, where the build takes twenty minutes, where nobody understands the deployment pipeline – is demoralising.

Technical debt drives attrition. Not always directly, but as a compounding factor. The best engineers, who have the most options, are the ones most likely to leave a frustrating environment. When they leave, the remaining team loses institutional knowledge, which makes the debt harder to address, which makes the environment more frustrating for the next person.

The cost of replacing a developer – recruitment, interviewing, onboarding, the ramp-up period – is typically estimated at six to nine months of salary. If your technical debt contributes to even one additional departure per year, the financial impact dwarfs the cost of fixing the debt.
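
Using that six-to-nine-month estimate, the comparison is a few lines of arithmetic (the salary and remediation budget are assumptions for illustration):

    # One debt-driven departure versus the cost of fixing the debt.
    annual_salary = 90_000
    replacement_low = annual_salary * 6 / 12    # six months of salary
    replacement_high = annual_salary * 9 / 12   # nine months of salary
    debt_fix_budget = 30_000                    # assumed remediation spend

    print(f"Replacing one developer: {replacement_low:,.0f} to {replacement_high:,.0f}")
    print(f"Fixing the debt:         {debt_fix_budget:,.0f}")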


How to build the evidence base

Moving from “we should fix this someday” to “this costs us a specific amount per quarter” requires evidence. Here is how to gather it.

Step 1: Identify your debt inventory. Run a systematic code review across your codebase. Not a PR review – a full audit that examines the system holistically. Catalogue the findings: inconsistent patterns, security gaps, dead code, performance bottlenecks, missing tests, tangled dependencies. Tools like VibeRails generate this inventory automatically, giving you a structured list of findings with severity and category.
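
However you produce it, a structured record per finding makes the later steps mechanical. A minimal shape might look like this (the fields and example entry are illustrative, not any tool's actual output format):

    # One possible shape for a debt inventory entry (fields are illustrative).
    from dataclasses import dataclass

    @dataclass
    class Finding:
        module: str       # file or module the finding lives in
        category: str     # e.g. "dead code", "missing tests", "tangled deps"
        severity: str     # e.g. "low", "medium", "high"
        description: str

    inventory = [
        Finding("billing/invoices", "tangled deps", "high",
                "Invoice totals recomputed in three places"),
    ]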

Step 2: Map findings to incidents. Cross-reference your code quality findings with your incident log. Which production issues trace back to known debt? This gives you the incident cost of each piece of debt.

Step 3: Estimate workaround hours. For each significant finding, estimate how many hours per quarter your team spends working around the issue rather than through it. Be conservative. Even conservative estimates tend to be large.

Step 4: Measure the velocity drag. Compare delivery speed in clean modules versus debt-heavy modules. How long does a typical feature take in each? The delta is the velocity cost of debt in those modules.
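
Expressed as a calculation, with both lead times assumed for illustration:

    # Velocity drag: typical feature lead time, clean versus debt-heavy module.
    days_in_clean_module = 4
    days_in_debt_heavy_module = 11

    drag_days = days_in_debt_heavy_module - days_in_clean_module
    drag_ratio = days_in_debt_heavy_module / days_in_clean_module

    print(f"Extra days per feature: {drag_days}")
    print(f"Features take {drag_ratio:.1f}x longer in debt-heavy modules")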

Step 5: Calculate the total. Sum the incident costs, workaround hours, velocity drag, onboarding tax, and attrition risk. Present the total in monetary terms. This is what your technical debt costs per quarter. It is not an estimate of future risk. It is a measurement of current expenditure.
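
Pulled together, the quarterly total is a simple sum of the components from the previous steps (every input below is a placeholder for your own measurements):

    # Quarterly cost of technical debt: the sum of the measured components.
    incident_cost = 42_000        # from step 2
    workaround_cost = 28_000      # from step 3
    velocity_drag_cost = 35_000   # from step 4
    onboarding_tax = 11_000       # quarterly share of the annual tax
    attrition_risk = 15_000       # expected cost of debt-driven departures

    quarterly_total = (incident_cost + workaround_cost + velocity_drag_cost
                       + onboarding_tax + attrition_risk)

    print(f"Technical debt cost this quarter: {quarterly_total:,}")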


From metaphor to line item

When you present technical debt as a quarterly cost rather than a vague concern, the conversation changes. You are no longer asking for permission to do something that feels optional. You are identifying a cost centre and proposing a reduction.

The business case for addressing technical debt is not “our code should be cleaner.” It is: “We are spending a measurable amount per quarter on workarounds, incidents, and slow onboarding caused by specific, identifiable issues in our codebase. Fixing the top ten issues would reduce that cost by a projected amount and free up engineering capacity for product work.”

That is not a metaphor. That is a proposal with numbers, and it is the kind of proposal that gets approved.

Technical debt is only invisible when nobody measures it. Once you have the data – the incident maps, the workaround hours, the onboarding deltas – it becomes as concrete as any other operational cost. And concrete costs get budgets.


Limits and tradeoffs

  • Automated review can miss context. Treat findings as prompts for investigation, not verdicts.
  • False positives happen. Plan a quick triage pass before you schedule work.
  • Privacy depends on your model setup. If you use a cloud model, relevant code is sent to that provider; local models can keep inference on your own hardware.