AI Code Review for Rust Projects

The Rust compiler catches memory errors at compile time, but it cannot catch architectural debt, unsafe misuse, or logic that panics in production. VibeRails reads your entire codebase and reasons about the issues the compiler does not surface.

Why Rust codebases still need code review

Rust's ownership system and borrow checker provide memory safety guarantees that most languages cannot match. But safety at the memory level does not mean safety at the application level. Rust projects accumulate their own forms of technical debt that the compiler is not designed to detect.

Unsafe blocks are the most obvious concern. Every unsafe block is a contract between the developer and the compiler: the developer promises that the invariants Rust normally enforces are maintained manually. In mature Rust codebases, unsafe blocks accumulate for FFI bindings, performance-critical paths, and hardware interactions. Over time, the assumptions behind those unsafe blocks can become stale as the surrounding code evolves. A data structure changes shape, a pointer arithmetic assumption no longer holds, or a concurrency pattern shifts in a way that invalidates the safety argument. Reviewing whether each unsafe block still upholds its contract requires reading the surrounding context carefully.
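To make the drift concrete, here is a minimal, hypothetical sketch (the `Ring` type and its methods are invented for illustration): the unsafe block's safety argument depends on an invariant maintained elsewhere in the type, so a later refactor that touches `head` can silently invalidate it.

```rust
/// Hypothetical ring buffer illustrating a SAFETY contract that can go stale.
/// The unsafe block below depends on `head < buf.len()`, an invariant that is
/// only upheld because `push` applies the modulo.
struct Ring {
    buf: Vec<u32>,
    head: usize,
}

impl Ring {
    fn new(capacity: usize) -> Self {
        Ring { buf: vec![0; capacity], head: 0 }
    }

    fn push(&mut self, v: u32) {
        self.buf[self.head] = v;
        // Invariant: `head` is always < `buf.len()` after this line.
        self.head = (self.head + 1) % self.buf.len();
    }

    fn current_slot(&self) -> u32 {
        // SAFETY: `head < buf.len()` is upheld by `push`. If a future method
        // ever sets `head` without the modulo, this argument silently breaks;
        // that is exactly the kind of drift a contextual review looks for.
        unsafe { *self.buf.get_unchecked(self.head) }
    }
}
```

The point is not that this code is wrong today, but that the safety comment's claim lives in a different method than the unsafe block that relies on it.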

Ownership and borrowing complexity is another source of debt. As Rust projects grow, developers sometimes fight the borrow checker by introducing unnecessary clones, wrapping values in Arc<Mutex<_>> when simpler ownership patterns would suffice, or restructuring code in convoluted ways to satisfy lifetime requirements. These workarounds compile and run correctly, but they obscure intent, hurt performance, and make the codebase harder to modify.
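A small, hypothetical before-and-after shows the pattern (both functions are invented for illustration): the first "fixes" a move error with a clone that compiles fine but allocates needlessly, while the second simply borrows.

```rust
// Workaround version: a clone added to silence a borrow-checker complaint.
// It compiles and runs correctly, but obscures intent and costs an allocation.
fn summarize_cloned(lines: Vec<String>) -> String {
    let mut out = String::new();
    for line in lines.clone() { // unnecessary clone
        out.push_str(line.trim());
        out.push(' ');
    }
    out.trim_end().to_string()
}

// Borrowing version: same behaviour, no clone, and the signature now
// communicates that the function only needs to read the input.
fn summarize(lines: &[String]) -> String {
    let mut out = String::new();
    for line in lines {
        out.push_str(line.trim());
        out.push(' ');
    }
    out.trim_end().to_string()
}
```

Clippy catches some of these mechanically; judging whether an `Arc<Mutex<_>>` is a genuine sharing requirement or a workaround takes whole-codebase context.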

Panic handling is a subtler issue. Rust distinguishes between recoverable errors (Result) and unrecoverable panics, but many codebases use unwrap() and expect() liberally in code paths that are not truly unrecoverable. A panic in a library function, a web handler, or an async task can bring down the entire process or leave it in an inconsistent state. Identifying which unwraps are safe and which represent production risk requires understanding the calling context.
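As a hypothetical illustration (both functions are invented), the two versions below behave identically on valid input; only the second gives the caller a chance to recover when the input is bad.

```rust
use std::num::ParseIntError;

// Panics on malformed input -- acceptable in a one-off script, a production
// risk in a library function or request handler.
fn port_or_panic(raw: &str) -> u16 {
    raw.trim().parse().unwrap()
}

// Recoverable version: the error propagates to a caller that has the context
// to decide whether to retry, default, or report.
fn port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse()
}
```

Whether a given `unwrap()` is one or the other depends entirely on where the function sits in the call graph, which is why the calling context matters.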

What rule-based tools miss in Rust

Clippy is Rust's standard linting tool, and it is excellent at catching common mistakes, suggesting idiomatic alternatives, and enforcing style. Tools like cargo audit check dependency vulnerabilities, and cargo miri can detect undefined behaviour in unsafe code. But these tools operate on individual patterns and known vulnerability databases rather than reasoning about the codebase as a whole.

Consider an async Rust application that uses Tokio for its runtime. A task spawns a background job that holds a reference to shared state through an Arc. The task itself is cancelled when a timeout fires, but the background job continues running and mutating the shared state. Clippy will not flag this because each individual operation is valid. The issue is an architectural one: the cancellation semantics of the parent task do not account for the lifecycle of the spawned job. Finding this requires tracing the flow of ownership and task lifecycles across multiple async functions.
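The same lifecycle hazard can be reproduced without an async runtime; the thread-based sketch below (all names hypothetical) shows the shape of the problem. The parent gives up after a timeout, but the worker holds its own clone of the `Arc` and keeps mutating shared state after the parent has moved on.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Returns (count observed when the parent timed out, final count).
fn run_with_timeout() -> (u64, u64) {
    let counter = Arc::new(AtomicU64::new(0));
    let shared = Arc::clone(&counter);

    let worker = thread::spawn(move || {
        for _ in 0..5 {
            shared.fetch_add(1, Ordering::SeqCst);
            thread::sleep(Duration::from_millis(10));
        }
    });

    // The parent "cancels" by giving up after a short wait. Nothing stops
    // the worker: its lifetime is now decoupled from the parent's.
    thread::sleep(Duration::from_millis(15));
    let seen_at_timeout = counter.load(Ordering::SeqCst);

    // Joining makes the hazard visible: mutations can land after the parent
    // already considered the work finished.
    worker.join().unwrap();
    (seen_at_timeout, counter.load(Ordering::SeqCst))
}
```

In Tokio the analogue is a `tokio::spawn`ed task that survives the cancellation of the task that spawned it; no single line is incorrect, which is why pattern-level linting does not flag it.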

Macro-heavy codebases present another challenge. Procedural macros and declarative macros can generate substantial amounts of code that does not appear in the source files. Rule-based tools analyse the expanded output, but they cannot assess whether the macro abstraction itself is well-designed, whether it introduces hidden performance costs, or whether its generated code follows the same conventions as the hand-written code around it. These are judgement calls that require understanding the project's design goals.
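A small, hypothetical declarative macro (invented for illustration) shows how expansion can hide control flow: each use site reads as a plain expression, but the expansion contains an early `return` the reader cannot see without expanding the macro.

```rust
use std::collections::HashMap;

// Each use of this macro looks like a value lookup, but it secretly contains
// an early `return Err(...)` that exits the enclosing function.
macro_rules! try_get {
    ($map:expr, $key:expr) => {
        match $map.get($key) {
            Some(v) => v.clone(),
            None => return Err(format!("missing key: {}", $key)),
        }
    };
}

fn build_url(cfg: &HashMap<&str, String>) -> Result<String, String> {
    let host = try_get!(cfg, "host"); // hidden early return on a miss
    let port = try_get!(cfg, "port");
    Ok(format!("{}:{}", host, port))
}
```

Whether this abstraction is worth the hidden control flow is a design judgement, not a lint rule.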

Error propagation patterns also escape rule-based analysis. Rust's ? operator makes error propagation concise, but it can also mask error context. A chain of five function calls each using ? might propagate an IO error to the caller with no indication of which step failed or why. Assessing whether error context is sufficient for debugging requires reading the full call chain.
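A minimal sketch of the difference (the `ConfigError` type and both functions are hypothetical): the bare `?` version surfaces only the underlying parse error, while the second attaches a breadcrumb at the failing step, the same effect crates like anyhow provide with `.context(...)`.

```rust
use std::fmt;
use std::num::ParseIntError;

#[derive(Debug)]
struct ConfigError {
    step: &'static str,
    source: String,
}

impl fmt::Display for ConfigError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{} failed: {}", self.step, self.source)
    }
}

// Bare `?`: the caller learns only "invalid digit found in string",
// with no indication of which configuration step produced it.
fn parse_timeout_bare(raw: &str) -> Result<u64, ParseIntError> {
    let n: u64 = raw.trim().parse()?;
    Ok(n)
}

// Wrapping the error at each step preserves where the failure happened.
fn parse_timeout(raw: &str) -> Result<u64, ConfigError> {
    let n: u64 = raw.trim().parse().map_err(|e: ParseIntError| ConfigError {
        step: "parsing timeout",
        source: e.to_string(),
    })?;
    Ok(n)
}
```

Whether the bare form is adequate depends on how the caller reports errors, which is why the full call chain has to be read.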

How VibeRails reviews Rust projects

VibeRails performs a full-codebase scan using frontier large language models. Every Rust source file is analysed along with Cargo configuration, build scripts, test modules, and documentation. The AI reads each file and reasons about its purpose, structure, and relationship to the rest of the project.

For Rust code specifically, the review covers:

  • Unsafe block audit – whether each unsafe block still upholds its safety invariants given the current surrounding code, missing safety comments, unsafe abstractions that could be replaced with safe alternatives
  • Ownership patterns – unnecessary cloning, overuse of Arc and Mutex where simpler ownership would suffice, convoluted lifetime annotations that indicate structural problems
  • Panic risk – unwrap() and expect() in non-terminal code paths, index operations on slices without bounds checking, arithmetic that could overflow in release mode
  • Error handling – error propagation chains that lose context, custom error types that discard source errors, inconsistent use of Result versus panic across the API surface
  • Macro complexity – procedural macros that generate non-idiomatic code, declarative macros with unclear expansion behaviour, macro usage that obscures the actual control flow
  • Async correctness – task cancellation safety, blocking calls inside async contexts, spawned tasks that outlive their expected scope, deadlock potential in async mutex usage

Each finding includes the file path, line range, severity level, category, and a detailed description with suggested remediation. Findings are organised into 17 categories so teams can prioritise systematically.

Dual-model verification for Rust

Rust's safety model means that many potential issues are matters of degree rather than outright bugs. An unnecessary clone might be a deliberate trade-off for readability. An unsafe block might be well-justified by its performance context. Distinguishing between pragmatic engineering decisions and genuine technical debt requires nuanced judgement.

VibeRails supports running reviews with two different AI backends – Claude Code and Codex CLI – in sequence. The first pass discovers issues, the second pass verifies them using a different model architecture. When both models independently flag the same unsafe block or panic-prone unwrap, confidence is high. When they disagree, the finding warrants closer human review during triage.

This cross-validation is particularly useful for Rust because the language's emphasis on correctness means that false positives during code review are especially disruptive. Developers who write Rust expect their code to be correct by construction, and flagging something that is actually fine wastes trust. Dual-model verification reduces noise so that the findings that reach triage are the ones worth discussing.

From findings to fixes

After triaging findings, VibeRails can dispatch AI agents to implement fixes directly in your local repository. For Rust projects, this typically means replacing unnecessary clones with borrows, adding error context with anyhow or thiserror, converting unwrap() calls to proper error handling, documenting safety invariants on unsafe blocks, or simplifying convoluted lifetime annotations.

Each fix is generated as a local code change you can inspect, test, and commit or discard. The AI works within the conventions of the existing codebase, matching the project's error handling strategy, module structure, and naming conventions.

VibeRails runs as a desktop app with a BYOK model – it orchestrates Claude Code or Codex CLI installations you already have. No code is uploaded to VibeRails servers. AI analysis is sent directly to the provider you configured, billed to your existing subscription. Licences are per-developer: $19/month or $299 lifetime, with a free tier of 5 issues per session to evaluate the workflow.

Download Free See Pricing