Code Review Is Not About Catching Bugs

Ask most developers what code review is for, and the first answer is catching bugs. That is not wrong, exactly. It is just the least important thing review does.

The standard justification for code review goes something like this: another set of eyes catches bugs that the author missed. Before code ships, a reviewer reads it and spots the off-by-one error, the missing null check, the incorrect conditional logic. The reviewer comments, the author fixes, and the bug never reaches production.

This framing is not wrong. Code review does catch bugs. Studies have confirmed it. But it misses the point. Bug detection is the most visible benefit of code review and the least valuable one.

The most important things that code review accomplishes have nothing to do with finding bugs. They are knowledge sharing, consistency enforcement, and architectural stewardship. These outcomes are harder to measure, slower to materialise, and vastly more valuable over time.


Knowledge sharing: the bus factor antidote

In any codebase, there is a constant risk that critical knowledge lives in one person's head. The authentication module was written by someone who left last year. The billing integration was set up by a contractor. The deployment pipeline was configured by the CTO during a weekend sprint three years ago. When the only person who understands a piece of code is unavailable, the team is stuck.

Code review is the primary mechanism by which knowledge spreads across a team. When a developer reviews code in a module they did not write, they build familiarity with that module. They see how the authentication flow works. They learn the conventions used in the billing integration. They understand why the deployment pipeline has that unusual step.

This knowledge transfer happens passively, without requiring dedicated documentation or knowledge-sharing sessions. Every review is a teaching moment. The reviewer learns about a part of the codebase they might otherwise never encounter. The author, in explaining their choices, reinforces their own understanding and often identifies gaps in their reasoning.

Teams that skip code review concentrate knowledge. Teams that practise it distribute knowledge. The difference becomes apparent when someone goes on holiday, changes role, or leaves the company. In a team with thorough review practices, at least one other person has seen every piece of code. In a team without review, some code has never been read by anyone other than its author.

This is not a theoretical concern. The bus factor – the number of people who would need to be unavailable before a project stalls – is directly affected by review practices. If every PR is reviewed by at least one other developer, the minimum bus factor for any piece of code is two. Without review, it can be one. The difference between one and two is the difference between a manageable risk and a single point of failure.
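The per-file version of this idea can be made concrete. A minimal sketch, assuming you can map each file to the set of people who have authored or reviewed it (the mapping below is illustrative sample data, not drawn from any real repository):

```python
from typing import Dict, Set

def bus_factor(file_owners: Dict[str, Set[str]]) -> int:
    """Smallest number of people whose absence leaves some file with
    nobody who understands it (a simple per-file approximation)."""
    # A file known to only one person means losing that one person
    # stalls work on it: bus factor 1.
    return min((len(owners) for owners in file_owners.values()), default=0)

# Illustrative data: authorship plus review exposure per file.
knowledge = {
    "auth/login.py": {"alice"},              # written, never reviewed
    "billing/invoice.py": {"bob", "carol"},  # author plus one reviewer
}

print(bus_factor(knowledge))  # 1: auth/login.py is a single point of failure
```

Under this approximation, requiring one reviewer per change lifts every file's owner set to at least two people, and the minimum across the codebase rises with it.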


Consistency enforcement: the invisible quality multiplier

Every codebase develops conventions over time. How errors are handled. How configuration is managed. How API responses are structured. How database queries are constructed. These conventions may be written down in a style guide or they may exist only as patterns in the code itself.

Without review, conventions drift. Different developers adopt different patterns. New team members, without exposure to existing conventions, invent their own. Over months and years, the codebase accumulates multiple approaches to the same problem. Error handling is done with exceptions in some modules and return codes in others. Some API endpoints return errors in a consistent format and others use ad-hoc structures.
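The drift is easy to picture. A contrived Python example of two modules solving the same problem in two incompatible styles:

```python
# Style A, older module: signal failure with an exception.
def parse_port_a(value: str) -> int:
    port = int(value)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port

# Style B, newer module: signal failure with a sentinel return.
def parse_port_b(value: str):
    try:
        port = int(value)
    except ValueError:
        return None
    return port if 0 < port < 65536 else None

# Both work, and tests pass either way -- but every caller now has to
# know which convention the particular function it calls happens to use.
```

Neither style is a bug, which is exactly why nothing but a human reviewer will object to the second one appearing in a codebase built on the first.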

Code review is where consistency is maintained. A reviewer who notices that a new function handles errors differently from the rest of the module can point out the discrepancy. A reviewer who sees a new pattern emerging can ask whether it should replace the existing pattern or conform to it. These are not bug fixes. They are course corrections that keep the codebase coherent.

Consistency matters because inconsistent code is harder to maintain. A developer reading an inconsistent codebase cannot rely on patterns. Every function might do things differently. Every module is a new learning exercise. This slows down development, increases the risk of misunderstanding, and makes automated tooling less effective. Consistent code, by contrast, is predictable. Once you understand the pattern, you can navigate the codebase quickly and make changes with confidence.

Linters and formatters enforce syntactic consistency. Code review enforces semantic consistency – consistency of patterns, approaches, and design decisions. This is the harder kind to automate and the more important kind to maintain.


Architectural stewardship: protecting the long-term

Every codebase has an architecture, whether it was intentionally designed or emerged organically. That architecture defines how modules relate to each other, how data flows through the system, and where responsibilities live. Preserving that architecture – or deliberately evolving it – is one of the most important functions of engineering leadership.

Code review is the mechanism through which architecture is protected at the point of change. When a developer proposes a change that violates architectural boundaries – a presentation layer module importing from the data layer, a shared library depending on an application-specific module, a synchronous call inserted into an asynchronous pipeline – a reviewer can flag it before it merges.
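Some boundary rules of this kind can even be checked mechanically before a human reviewer sees the change. A minimal sketch using Python's `ast` module, with hypothetical layer names (here, modules under `presentation` must not import from `data`):

```python
import ast

# Hypothetical rule set: layer prefix -> layers it must not import from.
FORBIDDEN = {"presentation": {"data"}}

def boundary_violations(module_path: str, source: str) -> list:
    """Return the imports in `source` that cross a forbidden boundary."""
    layer = module_path.split("/")[0]
    banned = FORBIDDEN.get(layer, set())
    violations = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            names = [node.module]
        else:
            continue
        violations += [n for n in names if n.split(".")[0] in banned]
    return violations

code = "from data.models import User\nimport presentation.widgets\n"
print(boundary_violations("presentation/views.py", code))  # ['data.models']
```

A check like this does not replace the reviewer; it handles the mechanical cases so the reviewer can spend attention on the violations that need judgement, such as whether a boundary should move.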

This is not about catching a bug. The code might work perfectly. It might pass all tests. It might do exactly what the product requirements specify. But it degrades the architecture, and architectural degradation compounds. One shortcut creates a precedent. The next developer, seeing the existing violation, follows the same pattern. Within a year, the boundary that was supposed to separate two concerns has been crossed so many times that it no longer exists in practice.

A reviewer with architectural context can prevent this. They can say: this works, but it should not live here. They can suggest an alternative that achieves the same result without violating the boundary. This kind of feedback has nothing to do with bugs and everything to do with the long-term health of the system.

Without review, architectural stewardship depends entirely on the discipline of individual developers. Some will respect the boundaries. Others, under deadline pressure, will take shortcuts. Over time, the shortcuts win, and the architecture erodes. Review is the counterbalance.


Why bug detection is overvalued

The emphasis on bug detection in code review is partly historical. In the era before comprehensive test suites and CI pipelines, manual code inspection was one of the few ways to catch bugs before deployment. Review was a quality gate, and its value was measured in defects found.

Today, many categories of bugs are caught more effectively by other means. Unit tests catch logic errors. Integration tests catch interface mismatches. Static analysis tools catch null dereferences, type errors, and simple security issues. Linters catch formatting and convention violations.

This is not to say that reviewers never find bugs. They do. But the bugs that reviewers uniquely catch – the ones that tests, linters, and static analysis miss – tend to be subtle design-level issues: a race condition under specific timing, an incorrect assumption about upstream behaviour, a state machine that handles most transitions correctly but has an unreachable error state. These are genuinely valuable catches, but they are not the primary value of review.

When you measure code review solely by bugs caught, you undervalue it. Worse, you create incentives that undermine its real benefits. If bug detection is the metric, then reviewing code that is already well-tested feels like a waste of time. Why review code that passes all tests? Because the review is not just about whether the code is correct. It is about whether the code is consistent, whether it maintains architectural boundaries, and whether the team understands it.


Reframing what review is for

If code review is primarily for knowledge sharing, consistency, and architectural stewardship, then the way we practise it should change.

Review assignments should rotate. If the goal is knowledge distribution, having the same person review the same module every time concentrates knowledge rather than spreading it. Rotating reviewers ensures that more people gain exposure to more parts of the codebase.
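Rotation needs no special tooling, but it is easy to sketch. A hedged example that picks a reviewer deterministically from the change identifier, so assignments spread across the team instead of defaulting to the usual owner (the names and hashing scheme are illustrative):

```python
import hashlib

TEAM = ["alice", "bob", "carol", "dave"]

def pick_reviewer(author: str, change_id: str) -> str:
    """Deterministically rotate reviews across everyone but the author."""
    candidates = sorted(p for p in TEAM if p != author)
    # Hash the change id so the same PR always maps to the same reviewer,
    # while different PRs land on different people.
    digest = int(hashlib.sha256(change_id.encode()).hexdigest(), 16)
    return candidates[digest % len(candidates)]

print(pick_reviewer("alice", "PR-1234"))  # one of bob, carol, dave
```

The point is not the mechanism but the property: over many changes, every team member ends up reading parts of the codebase they did not write.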

Review comments should explain, not just flag. A comment that says “this is wrong” catches a bug. A comment that says “we handle this pattern differently in the rest of the module because of X” teaches. The second kind of comment is more valuable because it prevents an entire class of future inconsistencies, not just one instance.

Architectural context should be part of review. Reviewers should know enough about the intended architecture to spot violations. This means architecture decisions need to be documented and accessible, not just held in one person's head. A reviewer without architectural context can catch bugs but cannot perform architectural stewardship.

Review scope should include existing code. If knowledge sharing and consistency are primary goals, then reviewing only new changes is insufficient. Periodic review of existing code – the stable, unchanged modules that carry the most risk and embody the most accumulated knowledge – serves these goals directly.


The long game

Bug detection has an immediate, visible payoff. You find a bug. You fix it. The production incident is avoided. The value is clear and measurable.

Knowledge sharing, consistency, and architectural stewardship have a slower, less visible payoff. You do not see the incident that was prevented because three people understand the authentication module instead of one. You do not see the velocity improvement that comes from consistent patterns across the codebase. You do not see the refactoring cost that was avoided because architectural boundaries were maintained.

But these invisible benefits compound. Over months and years, they are the difference between a codebase that gets harder to work with and one that remains tractable. They are the difference between a team that moves faster as it grows and one that slows down with every new hire.

Code review is not about catching bugs. Bugs are the side effect. The real purpose is to ensure that the codebase remains understandable, consistent, and structurally sound – and that the team, collectively, understands what it has built.

