Defense software teams face a constraint that most of the industry ignores. When your source code constitutes controlled technical data under ITAR, you cannot simply adopt the latest AI tool and start feeding it your codebase. Every cloud API call, every SaaS integration, every third-party service that touches your code can create export control risk. ITAR violations can carry serious civil and criminal penalties, including fines and debarment from future government contracts.
This has effectively locked many defense contractors out of AI code review. Local AI models running on controlled infrastructure change the calculus: you can design a workflow where technical data remains within your defined authorization boundary. That can materially reduce risk versus sending code to a general-purpose external API. This article is technical guidance, not legal advice; involve your export-control counsel and FSO.
What ITAR means for software teams
The International Traffic in Arms Regulations (ITAR), administered by the Directorate of Defense Trade Controls (DDTC) under the Department of State, control the export of defense articles, defense services, and related technical data. For software teams, the critical concept is “technical data” – defined in 22 CFR 120.33 as information required for the design, development, production, manufacture, assembly, operation, repair, testing, maintenance, or modification of defense articles.
Source code for software that is itself a defense article, or that is directly related to a defense article on the United States Munitions List (USML), falls squarely within this definition. The code is technical data. It is controlled.
The control has teeth. Technical data cannot be shared with foreign persons – regardless of where those persons are physically located. A foreign national working in your US office is still a foreign person under ITAR. Technical data cannot cross US borders electronically, which means it cannot be transmitted to servers in other countries. And it cannot be processed by systems where unauthorized persons might have access, even if those systems are located within the United States.
For software development teams, this creates specific obligations. Version control systems must enforce access controls that exclude foreign persons without appropriate authorization. Build servers must be located within controlled environments. Code review tools must not transmit source code to systems outside the authorization boundary. And every data flow involving technical data must be documented and auditable.
This is the regulatory environment. It is not optional. It is not a suggestion. It is federal law, actively enforced, with severe consequences for non-compliance. Any AI code review solution for defense contractors must operate entirely within this framework.
Why cloud AI tools create export control risk
When a developer sends source code to an AI API endpoint, the code is transmitted to and processed on infrastructure operated by the AI provider. For ITAR-controlled technical data, this creates multiple risk vectors.
Infrastructure access. Cloud AI providers employ thousands of engineers who maintain the infrastructure that processes API requests. These engineering teams are global. A request to Anthropic's API or OpenAI's API is processed on infrastructure where engineers of various nationalities have operational access. Even if the provider's data centers are located in the United States, the personnel who maintain those systems may include foreign persons. Under ITAR, making technical data accessible to foreign persons without authorization – even inadvertently – constitutes an export.
Data routing. Cloud infrastructure is designed for reliability and performance, not for data residency enforcement. API requests may be load-balanced across regions. Intermediate caching layers may temporarily store data in locations you cannot control. Even providers that claim US-only processing may route traffic through global CDN nodes or rely on sub-processors with international infrastructure.
Data retention. AI providers may retain input data for debugging, model improvement, abuse monitoring, or legal compliance. Even providers that offer zero-retention APIs may have logging systems that capture request metadata. The distinction between retaining the full source code and retaining a log entry that contains a code snippet offers little protection under ITAR – a logged snippet can constitute technical data just as the full source file does.
Sub-processor chains. Many AI services rely on sub-processors for compute, storage, networking, and monitoring. Each sub-processor in the chain is an additional entity with potential access to your technical data. Tracing the full chain of custody for a single API call through a cloud provider's infrastructure stack is effectively impossible.
The risk is not theoretical. ITAR violations can trigger investigations, consent agreements, and significant penalties. For a defense contractor, sending ITAR-controlled source code to a cloud AI API is often not a cost-benefit calculation; it is a compliance boundary that must be evaluated and documented carefully.
The local-model solution
With local AI processing, the compliance picture can simplify: you can design a workflow where ITAR-controlled source code is processed only on authorised systems, within your defined boundary, with tightly controlled network paths. This reduces reliance on third-party inference infrastructure and shrinks the sub-processor chain you need to evaluate.
VibeRails is a desktop application. It runs on the developer's workstation or on a controlled server within the facility. It orchestrates Claude Code CLI, which communicates with the AI model. When that model is a local instance running on Ollama or vLLM on the same machine or on a GPU server within the facility's network, the data flow is:
Source code (on the local filesystem) → Claude Code CLI (local process) → Local model server (local GPU) → Review findings (written to local filesystem)
In a correctly configured deployment, technical data does not need to cross your authorization boundary. The source code is read from the local filesystem, processed by a model running on authorized hardware, and findings are written back to local storage. The key requirement is that your environment has no unintended paths to external networks and that the model endpoint and storage are inside your boundary.
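The "no unintended paths to external networks" requirement can be made checkable rather than aspirational. One approach is a pre-flight guard that refuses to start a review unless the configured model endpoint resolves to a host you have verified to be inside the boundary. A minimal sketch, assuming you maintain the allowlist yourself (the hostnames below are placeholders; 11434 is Ollama's default port):

```python
from urllib.parse import urlparse

# Hosts verified to sit inside the authorization boundary.
# These entries are placeholders -- curate this list deliberately.
ALLOWED_HOSTS = {"localhost", "127.0.0.1", "gpu-01.internal.example"}

def endpoint_in_boundary(endpoint_url: str, allowed=ALLOWED_HOSTS) -> bool:
    """Return True only if the model endpoint's host is on the allowlist."""
    host = urlparse(endpoint_url).hostname
    return host is not None and host.lower() in allowed

# Refuse to run a review against an external API by mistake.
assert endpoint_in_boundary("http://localhost:11434/api/generate")
assert not endpoint_in_boundary("https://api.openai.com/v1/chat/completions")
```

A check like this belongs at the very start of the review workflow, so a misconfigured environment variable fails loudly instead of silently exporting code.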
This is not a legal shortcut. It is an architectural approach designed to reduce the need to send controlled data to external AI providers. Final compliance depends on program requirements, boundary definition, access controls, and documentation.
For model selection, defense contractors should evaluate open-weight models that can be inspected and hosted inside their environment. Pick models that fit your available VRAM at an acceptable quantization level, and validate on representative repositories before standardizing.
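Whether a model "fits your available VRAM" can be estimated up front with back-of-envelope arithmetic: weights occupy roughly one byte per parameter at 8-bit quantization, half that at 4-bit, plus overhead for the KV cache and activations. A planning heuristic, not a guarantee (the 20% overhead factor is an assumption; real usage depends on context length and the serving runtime):

```python
def vram_estimate_gb(params_billion: float, bits_per_weight: int,
                     overhead_frac: float = 0.2) -> float:
    """Rough VRAM needed for the weights plus serving overhead.
    1B parameters at 8-bit is about 1 GB of weights."""
    weight_gb = params_billion * bits_per_weight / 8
    return weight_gb * (1 + overhead_frac)

# A 70B model at 4-bit: ~35 GB of weights, ~42 GB with overhead.
print(round(vram_estimate_gb(70, 4)))  # prints 42
```

Run the numbers for each candidate model before provisioning hardware, then confirm with a real load test on a representative repository.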
AWS GovCloud for ITAR workloads
Not every defense software team has GPU hardware on-premises. AWS GovCloud (us-gov-west-1 and us-gov-east-1) is designed for certain regulated workloads and is commonly used for export-controlled programs. Whether a specific GovCloud deployment meets your ITAR program requirements depends on your contract, boundary, access controls, and documentation. GPU instances are available for AI inference.
Instance selection. P5 instances with NVIDIA H100 GPUs are available in some regulated cloud environments. An H100-class GPU provides enough memory for larger coding models. For budget-conscious teams, L4/A10-class instances can be sufficient for smaller models at more aggressive quantization.
Network isolation. The critical configuration is a private VPC with no internet gateway. The instance has no route to the public internet. Model weights are loaded from S3 via a VPC endpoint – an internal AWS network path that never traverses the public internet. Instance management is performed via AWS Systems Manager (SSM), which also operates within the GovCloud boundary. Source code is transferred to the instance via SSM session or a VPN connection to the VPC.
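The "no internet gateway" property is also auditable programmatically. A sketch that inspects route-table entries (the `Routes` list returned by `aws ec2 describe-route-tables` or the equivalent boto3 call; field names follow the EC2 API) and flags any route targeting an internet or NAT gateway:

```python
def vpc_routes_isolated(routes: list[dict]) -> bool:
    """True if no route targets an internet gateway or NAT gateway.
    `routes` is the Routes list from EC2 DescribeRouteTables."""
    for route in routes:
        if route.get("GatewayId", "").startswith("igw-"):
            return False  # route to an internet gateway
        if route.get("NatGatewayId"):
            return False  # route to a NAT gateway
    return True

isolated = [{"DestinationCidrBlock": "10.0.0.0/16", "GatewayId": "local"}]
leaky = isolated + [{"DestinationCidrBlock": "0.0.0.0/0", "GatewayId": "igw-0abc1234"}]
assert vpc_routes_isolated(isolated)
assert not vpc_routes_isolated(leaky)
```

Running a check like this on a schedule turns the isolation claim into evidence your FSO can point to.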
Operational flow. The workflow is: (1) provision a GPU instance in the private VPC, (2) load model weights from S3 via VPC endpoint, (3) transfer source code to the instance via SSM or VPN, (4) run the review using VibeRails or Claude Code CLI directly on the instance, (5) extract findings via SSM or VPN, (6) terminate the instance. The goal is to keep processing and storage within your defined boundary and eliminate paths to the public internet; validate this with your security and compliance teams.
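The lifecycle above can be scripted against the EC2 API. As a hedged sketch, here is a pure function that builds the `RunInstances` parameters for step (1); the IDs are placeholders, and the actual call (`boto3.client("ec2", region_name="us-gov-west-1").run_instances(**params)`) plus steps (2)-(6) are left to your own tooling:

```python
def launch_params(ami_id: str, subnet_id: str, sg_id: str,
                  instance_type: str = "p5.48xlarge") -> dict:
    """Parameters for EC2 RunInstances: a GPU instance in a private subnet
    with no public IP, reachable only via SSM or VPN."""
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "NetworkInterfaces": [{
            "DeviceIndex": 0,
            "SubnetId": subnet_id,              # private subnet, no IGW route
            "Groups": [sg_id],
            "AssociatePublicIpAddress": False,  # never expose a public address
        }],
    }

params = launch_params("ami-PLACEHOLDER", "subnet-PLACEHOLDER", "sg-PLACEHOLDER")
assert params["NetworkInterfaces"][0]["AssociatePublicIpAddress"] is False
```

Keeping the parameters in a reviewed function, rather than hand-typed console clicks, makes step (6)'s teardown and the no-public-IP invariant repeatable.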
Cost. GovCloud GPU instances are priced differently than commercial regions and vary by instance type and region. For intermittent reviews, the per-session spend can be materially lower than purchasing dedicated hardware, but you should validate current pricing for your region and model size.
Documenting the data flow for compliance
Your compliance team and your Facility Security Officer (FSO) will need documentation that demonstrates ITAR compliance. The local-model architecture makes this documentation straightforward because the data flow is simple and verifiable.
Your documentation should establish five facts:
1. No technical data leaves the authorization boundary. Document the network configuration. For on-premises deployments, the model server has no external network access. For GovCloud deployments, the VPC has no internet gateway. In both cases, source code is processed on systems within the boundary and findings remain within the boundary.
2. The AI model runs on authorized systems. Document the systems that perform inference. For on-premises, identify the specific workstation or server, its physical location, and its access controls. For GovCloud, document the VPC configuration, instance type, and the GovCloud region's ITAR accreditation.
3. Access to the processing infrastructure is restricted appropriately. For on-premises systems, this is controlled by your facility's physical access controls and your IT access management. For cloud environments, rely on provider documentation and your contractual terms, and document your IAM policies to show that only authorized personnel can access the resources in scope.
4. Review findings are stored within the authorization boundary. VibeRails stores findings locally in JSON files on the machine where it runs. If findings are exported as reports, document where those reports are stored and who has access. Findings derived from ITAR-controlled source code may themselves contain technical data (code snippets, architectural descriptions) and must be handled accordingly.
5. Model provenance. Document the model you deploy: license, source, and any available information about training data and evaluation. Even if the model weights are open, your inputs and outputs can still be controlled technical data and must be handled accordingly.
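The five facts above lend themselves to a machine-readable record generated per review session and stored alongside the findings. A sketch of one possible format (the fields are illustrative, not a mandated schema; your compliance team defines what evidence is actually required):

```python
import json
from datetime import datetime, timezone

def data_flow_record(model_name: str, inference_host: str,
                     findings_path: str) -> str:
    """Emit a JSON record of the five compliance facts for one review."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "boundary": {
            "technical_data_left_boundary": False,              # fact 1
            "inference_host": inference_host,                   # fact 2
            "access_control": "facility physical controls + IAM",  # fact 3
            "findings_storage": findings_path,                  # fact 4
        },
        "model_provenance": {"name": model_name},               # fact 5
    }
    return json.dumps(record, indent=2)

print(data_flow_record("open-weight-model", "gpu-01.internal.example",
                       "/srv/reviews/findings.json"))
```

A per-session record like this gives auditors a paper trail without anyone reconstructing the architecture from memory.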
VibeRails's architecture makes this documentation exercise straightforward. The desktop application runs locally. The AI model runs locally (or in GovCloud). The data flow is a straight line from source code to findings, with no intermediate hops, no third-party services, and no external network communication. Your compliance officer can verify the architecture in an afternoon.
Getting started
If you have avoided AI code review because export control requirements make external AI APIs unacceptable, local models running on controlled infrastructure can reduce the risk surface by keeping inference within your boundary.
For the full technical setup – environment variables, model selection, GPU requirements, and step-by-step configuration – see our Local AI Code Review technical guide. For a broader overview of local AI code review beyond ITAR use cases, see the complete guide to local AI code review. To evaluate VibeRails on your own (non-controlled) codebase first, download the free tier – it includes 5 issues at no cost and requires no account creation.
Limits and tradeoffs
- AI review can miss context. Treat findings as prompts for investigation, not verdicts.
- False positives happen. Plan a quick triage pass before you schedule work.
- Privacy depends on your model setup. If you use a cloud model, relevant code is sent to that provider; local models can keep inference on your own hardware.
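The triage pass mentioned above does not need dedicated tooling; a short script that buckets findings by severity before anyone schedules work is enough. A sketch, assuming findings are dicts with a `severity` field (the format here is illustrative, not VibeRails's actual schema):

```python
from collections import defaultdict

def triage(findings: list[dict]) -> dict:
    """Group findings by severity so the critical ones get eyes first."""
    buckets = defaultdict(list)
    for finding in findings:
        buckets[finding.get("severity", "unknown")].append(finding)
    return dict(buckets)

findings = [
    {"id": 1, "severity": "critical", "title": "SQL built from user input"},
    {"id": 2, "severity": "low", "title": "Unused import"},
]
by_severity = triage(findings)
assert [f["id"] for f in by_severity["critical"]] == [1]
```

A five-minute pass over the `critical` bucket usually separates real issues from false positives before they reach the sprint board.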