Naming Things: A Code Review Perspective

There are two hard problems in computer science: cache invalidation, naming things, and off-by-one errors. Naming is usually treated as a joke. It should not be.

[Image: Code editor showing variable names highlighted, with a style guide document open alongside]

Every developer knows the aphorism about naming being one of the two hard problems in computer science. It gets a laugh in conference talks and a knowing nod in code reviews. But the reason naming is hard is not because English is ambiguous or because developers lack vocabulary. Naming is hard because it requires you to understand what something actually is, what it does, and how it relates to everything else in the system. Bad naming is not a failure of style. It is a failure of clarity.

When you review a large codebase – not a single pull request, but hundreds of thousands of lines across dozens of modules – naming problems stop looking like cosmetic issues. They start looking like architectural symptoms. And the patterns they reveal are remarkably consistent across projects, teams, and technology stacks.


Bad naming as a signal of unclear thinking

Consider a function called processData. What does it process? What kind of data? What is the output? The name tells you nothing. It is a placeholder that was written when the developer was still figuring out what the function should do – and was never revisited once the thinking clarified.

This pattern appears everywhere: handleStuff, doWork, utils, helpers, misc. These names are not wrong in the way that a misleading name is wrong. They are empty. They carry no information. And they accumulate because nobody goes back to rename things once the code works. The name was good enough to get the tests passing, so it stayed.

The problem compounds over time. A new developer encounters processData and has to read the entire function body to understand what it does. Then they write their own function, and because they do not fully understand the existing one, they call theirs transformPayload. Now there are two functions that might do the same thing, or might not, and the names do not help you tell the difference.
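The difference between an empty name and a descriptive one can be sketched in a few lines. This is a hypothetical example (the function names and order fields are invented for illustration): the two functions are identical, but only one tells the reader what it does.

```python
# Hypothetical illustration: the same logic under an empty name and a
# descriptive one.

def processData(items):
    # The name says nothing: a reader must inspect the body to learn that
    # this sums line-item totals and skips cancelled orders.
    return sum(i["total"] for i in items if i["status"] != "cancelled")

def sum_active_order_totals(orders):
    # Identical logic, but the name states the input, the filter, and the
    # output, so call sites read without a lookup.
    return sum(o["total"] for o in orders if o["status"] != "cancelled")

orders = [
    {"total": 10.0, "status": "paid"},
    {"total": 5.0, "status": "cancelled"},
    {"total": 7.5, "status": "paid"},
]
print(sum_active_order_totals(orders))  # 17.5
```

A reviewer who sees `sum_active_order_totals(orders)` at a call site never needs to open the definition; a reviewer who sees `processData(items)` always does.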

Unclear names are not a cosmetic problem. They are a comprehension tax paid by every developer who reads the code, every time they read it. In a codebase with 500 files, that tax adds up to hours per week of unnecessary cognitive effort.


Inconsistent naming as an architectural symptom

Single instances of bad naming are easy to fix. The more revealing problem is inconsistent naming across a codebase. When the same concept is called three different things in three different modules, you are not looking at a style issue. You are looking at a codebase where the team does not share a common vocabulary – which usually means they do not share a common understanding of the architecture.

A real example: in one codebase, the concept of a user's session was referred to as session in the authentication module, context in the middleware layer, userState in the frontend, and token in the API layer. Each name made local sense within its module. But across the system, the inconsistency made it genuinely difficult to trace how session data flowed from login to API response. A developer debugging a session-related bug had to mentally translate between four different vocabularies.

This kind of inconsistency does not happen because developers are careless. It happens because different parts of the codebase were written by different people at different times, and nobody established a shared glossary. It is an organisational problem, not a technical one. But it manifests in the code as naming drift, and you can only see it when you review the codebase as a whole.
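One way to repair that kind of drift is a shared definition that every layer imports under the same name. The sketch below is hypothetical (the `Session` fields and the `authenticate`/`attach_session` functions are invented), but it shows the shape of the fix: the credential becomes a *field* of the session rather than a competing name for it.

```python
# Hypothetical sketch: one shared definition of "session", imported by every
# layer, instead of four module-local vocabularies
# (session / context / userState / token).
from dataclasses import dataclass

@dataclass
class Session:
    user_id: str
    token: str        # the credential is a field of the session, not its name
    expires_at: float

def authenticate(user_id: str) -> Session:
    # The auth module returns the shared type under the shared name.
    return Session(user_id=user_id, token="abc123", expires_at=1_700_000_000.0)

def attach_session(request: dict, session: Session) -> dict:
    # The middleware passes it along as "session", not "context".
    return {**request, "session": session}

req = attach_session({"path": "/orders"}, authenticate("u42"))
print(req["session"].user_id)  # u42
```

With a single type and a single name, a developer tracing a session bug follows one word from login to API response instead of translating between four.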


What AI code review catches that humans miss

Human code reviewers are excellent at evaluating the naming within a single pull request. Does this variable name make sense in context? Is this function name descriptive enough? These are the kinds of questions that come up naturally in PR review.

What human reviewers almost never catch is cross-codebase naming inconsistency. No single reviewer has the entire codebase in their head. They cannot tell you that the term customer is used in the billing module while client is used in the CRM module and user is used in the authentication module – all referring to the same entity. They review one file at a time, and each file looks reasonable in isolation.

AI code review, particularly full-codebase analysis, is well suited to detecting these patterns because it can hold the entire codebase in context simultaneously. The patterns it surfaces include:

Misleading names. Functions whose names suggest one behaviour but whose implementations do something different. A function called validateEmail that also normalises the email and saves it to the database is not a validation function. It is a validation-normalisation-persistence function wearing a validation costume.
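A hedged before-and-after sketch of that pattern (the snake_case names and the in-memory `saved` list stand in for a real database): each behaviour gets its own honestly named function, and the composite behaviour gets a composite-sounding name.

```python
# Hypothetical refactor of the misleading validateEmail pattern: validation,
# normalisation, and persistence each get their own name, composed explicitly.
saved = []  # stand-in for a database table

def validate_email(email: str) -> bool:
    # Does exactly what the name promises: answers a yes/no question.
    return "@" in email and "." in email.split("@")[-1]

def normalise_email(email: str) -> str:
    return email.strip().lower()

def register_email(email: str) -> str:
    # The composite behaviour wears a composite name, not a validation costume.
    cleaned = normalise_email(email)
    if not validate_email(cleaned):
        raise ValueError(f"invalid email: {email!r}")
    saved.append(cleaned)
    return cleaned

print(register_email("  Alice@Example.COM "))  # alice@example.com
```

Now a caller who only wants the yes/no answer can reach for `validate_email` without accidentally writing to the database.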

Inconsistent conventions. One module uses camelCase for function names while another uses snake_case. One module prefixes private methods with an underscore while another does not. These inconsistencies make it harder to navigate the codebase because your expectations are violated at random intervals.

Abbreviation drift. Early in a project, someone abbreviates transaction as txn. Later, someone else abbreviates it as trans. A third developer writes it out in full. Now searching for transaction-related code requires three different queries, and grep results are incomplete by default.
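The incomplete-grep problem can be demonstrated in a few lines. This toy example (the function names are invented) shows that a single search finds one of three spellings, and catching all three requires already knowing every variant.

```python
# Toy demonstration of abbreviation drift: three spellings of one concept
# mean a single search silently misses most of the relevant code.
import re

source_lines = [
    "def record_txn(amount): ...",
    "def reverse_trans(trans_id): ...",
    "def audit_transaction(transaction): ...",
]

def grep(pattern, lines):
    return [line for line in lines if re.search(pattern, line)]

print(len(grep(r"transaction", source_lines)))  # 1 hit: two functions missed
print(len(grep(r"txn|trans", source_lines)))    # 3 hits, but only if you
                                                # already know every variant
```

The second pattern only works because the author of the search knew about the drift in advance, which is exactly the knowledge a newcomer lacks.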

Conflicting terminology. Two modules use the same term to mean different things, or different terms to mean the same thing. The billing module's account is a financial entity. The authentication module's account is a user identity. They are not the same thing, but they share a name, which means every conversation about “accounts” requires clarification.


The cost of bad naming in large codebases

In a small project with five files, bad naming is an annoyance. In a large project with five hundred files, it is a material cost. The impact shows up in several measurable ways.

Onboarding time. New developers take longer to become productive when the naming is inconsistent because they cannot build a reliable mental model of the system. Every time they encounter a concept under a new name, they have to stop and figure out whether it is the same thing they saw before or something different. This can add days to onboarding.

Bug introduction rate. When similar concepts have similar but different names, developers make incorrect assumptions about what code does. They call the wrong function, pass the wrong parameter, or duplicate logic that already exists under a name they did not recognise. These bugs are not caused by lack of skill. They are caused by a codebase that actively misleads the people working in it.

Search and discovery failures. When naming is inconsistent, developers cannot find existing code. They search for validate and miss the function called check that does the same thing. They search for customer and miss the module that uses client. This leads to duplication, which leads to divergence, which leads to bugs.

Review overhead. Every code review takes longer when the reviewer has to decode unclear names. Instead of evaluating the logic, they are spending time figuring out what the code is supposed to do. This slows the entire development cycle.


VibeRails consistency findings

When VibeRails analyses a codebase, naming consistency is one of the categories it evaluates. The findings are not style preferences. They are structured observations about where the codebase uses conflicting terminology, where abbreviation patterns have drifted, and where the same concept appears under multiple names across different modules.

These findings are presented with specific file locations, the conflicting terms, and the modules affected. The goal is not to enforce a particular naming convention. It is to make visible the naming inconsistencies that no single developer can see because they only work in a few modules at a time.

The output is a map of terminological confusion. It tells you where your codebase's vocabulary is fractured and gives you enough information to decide whether to standardise. Not every inconsistency needs to be fixed. But every inconsistency should be a conscious choice, not an accident that nobody noticed.


Names are not decoration

The instinct to treat naming as a style preference is understandable. It feels subjective. One person prefers getUserById and another prefers fetchUserById. In isolation, both are fine.

But naming is not about individual preferences. It is about whether the codebase communicates its intent clearly to the people who have to work with it. A well-named codebase is one where a developer can read a function signature and understand what it does without reading the implementation. A poorly named codebase is one where every function requires investigation.
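The signature test is easy to illustrate. In this hypothetical contrast the two functions are behaviourally identical; only the names differ, and only one signature documents itself.

```python
# Hypothetical contrast: a signature you must investigate versus one that
# explains itself. The two functions have identical behaviour.

def handle(d, f=False):
    # What is d? What does f toggle? The reader must open the body to know.
    return [x for x in d if f or x.get("active")]

def filter_users(users: list[dict], include_inactive: bool = False) -> list[dict]:
    # The signature alone states the input, the behaviour, and the flag's meaning.
    return [u for u in users if include_inactive or u.get("active")]

users = [{"name": "a", "active": True}, {"name": "b", "active": False}]
print(len(filter_users(users)))                         # 1
print(len(filter_users(users, include_inactive=True)))  # 2
```

A call site reading `filter_users(users, include_inactive=True)` needs no investigation; `handle(d, True)` always does.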

The difference between these two states is not aesthetics. It is productivity, correctness, and the long-term maintainability of the system. Naming is thinking made visible. When the naming is confused, the thinking was confused. And when the naming is inconsistent across a codebase, the team's shared understanding of the system is fragmented.

Full-codebase review makes this fragmentation visible for the first time. What you do with that visibility is up to you. But at least you can see it.


Limits and tradeoffs

  • AI review can miss context. Treat findings as prompts for investigation, not verdicts.
  • False positives happen. Plan a quick triage pass before you schedule work.
  • Privacy depends on your model setup. If you use a cloud model, relevant code is sent to that provider; local models can keep inference on your own hardware.