Most teams agree that code review is important. Fewer teams have thought carefully about when code review happens. The implicit assumption is that as long as code gets reviewed before it merges, the timing does not matter much. Review it today, review it next week – the feedback is the same either way.
This assumption is wrong. The value of code review feedback degrades rapidly over time. A finding that takes five minutes to address on the day the code was written can take an hour to address a week later and may never get addressed at all if it arrives a month late. The content of the feedback may be identical. The cost of acting on it is not.
Context loss is the primary cost
When a developer writes code, they hold a mental model of the problem in working memory. They know why they chose this approach over the alternatives. They know which edge cases they considered and which they deferred. They know the implicit constraints imposed by the surrounding code, the requirements, and the conversations that led to this particular implementation.
This mental model does not persist. Within hours of moving to a different task, significant portions of it have faded. Within days, most of the detailed reasoning is gone. The developer can still read their own code and understand what it does, but the why – the reasoning behind specific decisions, the rejected alternatives, the known limitations – is increasingly lost.
When review feedback arrives while the mental model is still fresh, the developer can evaluate the feedback immediately. They know whether the reviewer's concern is valid, whether the suggested change is compatible with constraints the reviewer may not have seen, and how to implement the fix efficiently.
When review feedback arrives after the mental model has faded, the developer must reconstruct it. They re-read the code. They re-read the requirements. They try to remember why they made certain choices. This reconstruction is never complete – some context is permanently lost – and it takes time. A five-minute fix becomes a thirty-minute investigation followed by a tentative change.
Branch divergence compounds the problem
Code does not sit still while waiting for review. Other developers merge their changes. The main branch evolves. The longer a branch sits waiting for review, the further it diverges from the current state of the codebase.
This creates two problems. The first is mechanical: merge conflicts. When a reviewed branch finally needs to be updated, the developer must resolve conflicts between their changes and everything that has been merged since they opened their pull request. This is tedious, error-prone work that adds no value. It exists purely because of the delay.
The second problem is semantic. Even when there are no textual merge conflicts, the codebase may have changed in ways that affect the correctness of the pending changes. A dependency that was updated. An API that was modified. A pattern that was refactored across the project. The pending code may still merge cleanly but no longer behave correctly in the changed context. Catching these semantic conflicts requires re-reviewing the code against the current state of the codebase – effectively reviewing it twice.
Both problems scale with delay. A branch reviewed within hours will have minimal divergence. A branch reviewed after two weeks may require significant rework just to bring it up to date, before any review feedback can be addressed.
Sunk cost bias resists change
There is a psychological cost to delayed review that is rarely discussed. When a developer has written code, tested it, possibly deployed it to a staging environment, and moved on to other work, they have a significant investment in that code. Not just time, but cognitive and emotional investment.
Review feedback that arrives at this point is asking the developer to revisit a completed task. This triggers sunk cost bias – the natural human resistance to undoing work that has already been done. The developer has mentally closed the book on this code. Re-opening it feels like going backwards.
This does not mean developers are irrational or defensive. It means that the timing of feedback shapes how receptive the developer is to it. Feedback during active development feels collaborative: the reviewer and the developer are both working on the same problem. Feedback after the developer has moved on feels like rework: the reviewer is asking the developer to go back to something they have already finished.
In practice, this means late feedback is more likely to be negotiated down or deferred. A structural concern that would have been addressed immediately during active development becomes a ticket for later when it arrives two weeks late. And tickets for later have a tendency to never get done.
The queue problem
In most teams, code review operates as a queue. Developers submit pull requests, and reviewers work through them in roughly the order they arrive. When reviewers are busy with their own development work – which is most of the time – the queue grows.
Queueing theory tells us something predictable about this arrangement. When a system operates near capacity, small increases in input rate cause large increases in wait time. If your reviewers have just enough capacity to handle the average flow of pull requests, any above-average week creates a backlog that cascades forward. One busy week produces two weeks of delayed reviews.
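The sensitivity to load can be made concrete with the simplest queueing model. The sketch below uses an M/M/1 queue, which assumes random arrivals and a single reviewer – a deliberate simplification, since real review traffic is burstier – but it shows the shape of the effect: wait time grows non-linearly as arrivals approach capacity.

```python
# Illustrative sketch: the review queue modelled as an M/M/1 queue.
# Assumes Poisson arrivals and one reviewer, a simplification of real
# review traffic, but the capacity effect it demonstrates is general.

def avg_time_in_system(arrival_rate, service_rate):
    """Average time a PR spends queued plus in review (M/M/1: W = 1 / (mu - lambda))."""
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrivals meet or exceed review capacity")
    return 1.0 / (service_rate - arrival_rate)

# One reviewer who can clear 10 PRs per week, with demand creeping upward:
for prs_per_week in (5, 8, 9, 9.5):
    wait_weeks = avg_time_in_system(prs_per_week, 10)
    print(f"{prs_per_week:>4} PRs/week -> {wait_weeks:.1f} weeks from submission to review")
```

Doubling the load from 5 to 9.5 PRs per week multiplies the time in the system tenfold, which is why near-capacity teams see wait times spike rather than grow gradually.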
This is why many teams experience review delays not as a constant problem but as periodic crises. Things are fine for a while, then suddenly everything is waiting for review. The backlog clears, things are fine again, then it happens again. The underlying issue is that review capacity is not sufficient for peak load, only average load.
The traditional solution is to allocate more reviewer time. But reviewer time is developer time, and there is always more development work than time available. Teams are caught between two bad options: slow down development to speed up review, or accept review delays to maintain development velocity.
Quantifying the delay cost
The cost of delayed review can be estimated with reasonable accuracy using data most teams already have.
Context reconstruction time. Track how long it takes developers to address review feedback as a function of delay. If feedback addressed within one day takes an average of 15 minutes and feedback addressed after five days takes an average of 45 minutes, the delay cost is roughly 30 minutes per finding per working week of delay.
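A minimal sketch of this measurement, assuming you can export per-finding records of the delay and the time spent addressing each finding; the field names and numbers below are hypothetical placeholders:

```python
# Sketch: estimate the context-reconstruction premium from tracked data.
# Each record pairs the delay (in days) before feedback was addressed
# with the minutes spent addressing it. All values here are illustrative.

from statistics import mean

findings = [
    {"delay_days": 1, "minutes_to_address": 15},
    {"delay_days": 1, "minutes_to_address": 12},
    {"delay_days": 5, "minutes_to_address": 45},
    {"delay_days": 5, "minutes_to_address": 50},
]

fresh = [f["minutes_to_address"] for f in findings if f["delay_days"] <= 1]
stale = [f["minutes_to_address"] for f in findings if f["delay_days"] >= 5]

extra_minutes = mean(stale) - mean(fresh)   # added cost per late finding
print(f"Fresh feedback: {mean(fresh):.0f} min/finding")
print(f"Stale feedback: {mean(stale):.0f} min/finding")
print(f"Delay premium:  {extra_minutes:.0f} min per finding")
```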
Merge conflict resolution time. Track the time spent resolving merge conflicts on branches that have been open for more than a few days. This is pure overhead created by delay.
Deferred findings. Track review findings that are acknowledged but deferred to later tickets. What percentage of those deferred tickets are actually completed? In most teams, the completion rate for deferred review findings is below 30 percent. Each deferred finding that is never addressed is a quality issue that persists in the codebase.
Review re-work. Track instances where code that passed review causes incidents or bugs that the review should have caught. Late reviews are shallower – reviewers rushing through a backlog spend less time on each review – and shallow reviews miss more issues.
Sum these costs over a quarter. The total is what delayed review costs your team. In most organisations, it is considerably more than the cost of addressing the delay itself.
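Putting the four components together, a back-of-envelope quarterly estimate might look like the sketch below. Every input is a placeholder to be replaced with your own tracked numbers:

```python
# Back-of-envelope sketch of the quarterly cost of review delay.
# All inputs are placeholders; substitute the figures your team tracks.

findings_delayed = 120          # review findings delivered late this quarter
reconstruction_min = 30         # extra minutes per late finding (from tracking)
conflict_hours = 25             # hours resolving conflicts on stale branches
deferred = 40                   # findings deferred to "later" tickets
deferred_completion = 0.30      # share of deferred tickets actually completed
fix_later_hours = 4             # average hours to fix a finding once it ships

reconstruction_hours = findings_delayed * reconstruction_min / 60
escaped = deferred * (1 - deferred_completion)    # findings never addressed
escaped_hours = escaped * fix_later_hours         # eventual remediation cost

total_hours = reconstruction_hours + conflict_hours + escaped_hours
print(f"Quarterly cost of review delay: ~{total_hours:.0f} engineer-hours")
```

Even with conservative placeholder inputs, the total lands in the hundreds of engineer-hours per quarter, which makes the comparison with the cost of fixing the delay straightforward.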
Reducing review latency
There are structural approaches to reducing review delay, each with different trade-offs.
Dedicated review time. Some teams allocate specific blocks of time for review – first thing in the morning, for example. This ensures that the review queue is processed daily rather than when reviewers happen to have a gap. The cost is that reviewers lose their most productive hours to review work. The benefit is that review latency drops to less than 24 hours.
Smaller pull requests. Smaller changes are faster to review, which means the queue moves faster. The cost is that developers must break their work into smaller increments, which requires discipline and sometimes results in intermediate states that are less clean. The benefit is dramatic: a team that moves from average PR sizes of 500 lines to 100 lines typically sees review latency drop by 60 to 70 percent.
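One way to test whether this holds in your own repository is to bucket merged pull requests by size and compare review latency across buckets. The sketch below assumes you can export pairs of (lines changed, hours to first review), for example from your Git host's API; the sample records are made up for illustration:

```python
# Sketch: compare review latency for small vs large PRs.
# Assumes exported (lines_changed, hours_to_first_review) pairs per PR;
# the records below are illustrative, not real data.

from statistics import median

prs = [
    (80, 3), (95, 5), (120, 4),        # small PRs, reviewed the same day
    (450, 30), (520, 55), (610, 48),   # large PRs, reviewed days later
]

small = [hours for lines, hours in prs if lines <= 200]
large = [hours for lines, hours in prs if lines > 200]

print(f"median latency, <=200 lines: {median(small)} h")
print(f"median latency,  >200 lines: {median(large)} h")
```

If the latency gap between buckets is large, a working agreement on maximum PR size is likely to pay for itself quickly.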
Automated baseline review. AI-assisted code review can handle the baseline analysis – identifying security issues, inconsistent patterns, potential bugs, and style violations – before a human reviewer ever looks at the code. This reduces the workload on human reviewers, allowing them to focus on the architectural and design-level feedback that requires human judgement. When the automated review is thorough, the human review becomes faster and more focused, which reduces the queue depth.
Full-codebase review as a complement. Some issues – systemic patterns, cross-module inconsistencies, architectural drift – are better caught through periodic full-codebase reviews than through per-change review. Moving these findings out of the PR review process reduces the scope of each individual review, making the process faster. The full-codebase review happens on a separate cadence, with its own triage process, independent of the PR flow.
The compounding effect
Delayed code review does not just cost time in the immediate term. It compounds.
Code that merges with unaddressed feedback becomes the foundation for future work. Other developers build on it. They copy its patterns. They integrate with its interfaces. By the time someone circles back to address the original review findings – if they ever do – the cost of the change has multiplied because it now affects downstream code as well.
This is how codebases accumulate structural problems. Not through deliberate decisions to ship low-quality code, but through the slow erosion of review effectiveness when feedback arrives too late to be acted upon. Each individual delay seems minor. The cumulative effect over months and years is significant.
Timely review is not a nice-to-have. It is a direct determinant of code quality outcomes. The same feedback, delivered promptly or delayed, produces fundamentally different results. Teams that treat review latency as a core metric – and invest in reducing it – consistently produce higher-quality code than teams that treat review timing as an afterthought.