Banned for using a VPN: when 'safety' rules eat legitimate IT policy

A developer filed issue #51583 on the Claude Code repo. Their organization had just upgraded to Max. Within hours of the payment going through, the org account was disabled. The flagged behavior, according to the support response: "use of a VPN."

The VPN in question was the corporate VPN. Mandated by their own IT policy. The kind every regulated workplace, every government contractor, and every enterprise with an SSO posture requires you to be on before you touch a build artifact.

This is the second post in our five-part series on AI vendor kill-switch risk. The first looked at how a routine identity check can sever your dev environment with no warning. This one is about the structural problem behind that ticket: automated abuse detection has an irreducible false-positive rate, and when the model is wrong, the customer is the one who pays.

One ticket, one class of problem

The reporter was direct about the bind they were in:

"Our IT requires us to access the internet through our VPN. Right after we paid for our Max subscription, our org was disabled and the appeal form has been sitting unanswered for days. We can't tell our developers to turn off the VPN — that's a security policy violation."

That short complaint captures three different system-design failures stacked on top of each other. The detection model fired on a signal (VPN egress) that correlates with abuse in consumer populations and is required in enterprise populations. The enforcement action (org-wide disable) is binary, not graduated. And the remediation path is a Google Form with no SLA.

Strip the brand off the ticket and you'd find the same pattern across every other documented case. The NSTBrowser ban-prevention guide — written, notably, as a third-party FAQ because the official one is thin — lists the recurring triggers: VPN or proxy use, signing in from a new region, "rapid-fire" message volume, and payment events that trigger an audit. Each one of those is a perfectly normal enterprise behavior.

Four false-positive patterns, all documented

The friction is not random. It clusters.

Corporate VPN egress. The trigger in #51583, and the most-reported single cause in NSTBrowser's catalogue. Any company with a network-policy posture above "open Wi-Fi" routes outbound traffic through a known set of egress IPs. From the outside, those IPs can look much like the exit nodes consumers use when they are trying to hide. The model can't tell the two apart from packet metadata alone.

Region change. Distributed teams hit this constantly. A staff engineer flies Sydney → San Francisco for an offsite, signs in from a hotel network, and the session is invalidated. NSTBrowser specifically flags "logging in from a new country" as a top-three trigger.

Payment-triggered audit. LaoZhang's writeup collects multiple cases where the act of paying — recharging a Max plan, upgrading a tier — is itself the event that surfaces an account for review. The model treats "new payment from new geography" as a fraud signal. From the customer's perspective, they paid you money and you turned them off.

The coding agent loop itself. This one is delicious. Claude Code, by design, runs tight tool-call loops: read file, run test, edit, re-run, repeat. To a fraud heuristic trained on chat traffic, that pattern looks like scripted abuse. NSTBrowser explicitly warns users to "avoid sending messages too rapidly" and to keep request volumes "human-like." For a coding agent in a debug loop, "too rapidly" is the product. The thing the user is paying for is the same thing the abuse model is trained to flag.

Layer those four signals together — VPN egress, payment event, multi-region usage, scripted-looking traffic — and you have described a normal Tuesday at any enterprise running a coding agent across a distributed team. There is no sane operating posture that avoids all four. Asking developers to disable the corporate VPN to keep their AI pair-programmer happy is not a workaround; it is a compliance violation written by your vendor.

The math of automated trust-and-safety

Anthropic's Transparency Hub reported, for the January 2026 window, that 1.45 million accounts were disabled in a single quarter for safety or abuse reasons. That number is not the problem. Most of them are presumably legitimate enforcement against scrapers, abuse rings, and ToS violators.

The problem is the error rate against that base. No production fraud model makes zero false positives; that is a basic property of binary classifiers operating under class imbalance. Suppose a very good model gets just 0.1% of its disables wrong. (Strictly, that share is a false discovery rate, since a false-positive rate is measured against all legitimate accounts, but the point is the same.) At 1.45 million disables per quarter, 0.1% is 1,450 wrongful bans every three months. At 0.5%, still a respectable number for an industrial classifier, it is 7,250.
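The arithmetic above is easy to reproduce. The sketch below is only a check on the numbers in the text: it treats each rate as the share of disables that are mistakes, which is how the paragraph applies it, and the two rates are the illustrative values used here, not measured figures. The function name `wrongful_bans` is made up for this example.

```python
# Illustrative arithmetic only. The rates are the hypothetical values
# from the text, not measurements of any real classifier.
DISABLES_PER_QUARTER = 1_450_000  # figure quoted in the text


def wrongful_bans(disables: int, wrong_share: float) -> int:
    """Expected wrongful disables if `wrong_share` of all disables are mistakes."""
    return round(disables * wrong_share)


for share in (0.001, 0.005):
    count = wrongful_bans(DISABLES_PER_QUARTER, share)
    print(f"{share:.1%} of disables wrong -> {count:,} per quarter")
```

Run as written, this prints 1,450 for the 0.1% case and 7,250 for the 0.5% case.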

Those wrongful bans don't fall on individuals chatting with a model. They fall disproportionately on the populations whose normal behavior most resembles abuse: enterprises with VPNs, distributed teams crossing regions, contractors working from countries the model has never seen the customer use, and — uniquely — software engineers running coding agents in a tight loop.

The same disclosure window included the public suspension of OpenClaw's creator. TechCrunch reported the ban was reversed days later after public attention. That is the actual remediation system in 2026: have a Twitter following. If you don't, your appeal sits in the same queue as a scraper farm in Vladivostok, and a junior reviewer who has never read your contract decides whether your engineering team gets to ship this sprint.

It is worth being precise about why even a good model produces a steady drip of #51583-shaped tickets. Modern abuse classifiers operate on tens of thousands of weak signals. They are tuned for population-level precision and recall, not per-customer correctness. The objective the provider optimises is its own moderation cost weighed against its own revenue at risk. The customer's continuity is, structurally, an externality of that optimisation, at least until the customer is large enough or loud enough to internalise it through a procurement contract. Most aren't.

Why enterprises are uniquely exposed

The ban surface gets worse the more sophisticated your IT estate is.

Managed identity through SSO means a single account can sign in from ten different device fingerprints across a workday. Egress through a corporate firewall consolidates traffic from hundreds of seats onto a few IPs — exactly the shape of a botnet. Contractor-of-record arrangements mean the billing entity, the user identity, and the IP geography may all live in three different countries. Multi-device sign-ins are mandatory; nobody runs production from one laptop anymore.

Every one of those properties, in isolation, is a faint anomaly signal. Stacked, they look like fraud. The fraud model doesn't know your SOC 2 auditor signed off on the VPN. It just sees the egress IP.
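A toy additive score shows how the stacking works. This is a minimal sketch, not a description of any vendor's model: the signal names, the weights, and the threshold are all hypothetical, chosen so that no single signal crosses the line but the combination does.

```python
from collections.abc import Iterable

# Hypothetical weights and threshold, chosen only to illustrate stacking.
# They are not taken from any real vendor's abuse model.
SIGNAL_WEIGHTS = {
    "many_device_fingerprints": 25,  # one SSO account seen on many devices
    "shared_egress_ip": 30,          # hundreds of seats behind a few IPs
    "cross_country_billing": 20,     # billing, identity, and IP in different countries
    "multi_device_signin": 15,       # sign-ins from several machines
}
FLAG_THRESHOLD = 60  # score at or above which the account is flagged


def risk_score(signals: Iterable[str]) -> int:
    """Sum the weights of the distinct signals observed on an account."""
    return sum(SIGNAL_WEIGHTS[name] for name in set(signals))


def is_flagged(signals: Iterable[str]) -> bool:
    """True when the combined score reaches the flag threshold."""
    return risk_score(signals) >= FLAG_THRESHOLD


# No single signal is enough on its own.
print([is_flagged([name]) for name in SIGNAL_WEIGHTS])
# An ordinary enterprise account that shows all four is flagged.
print(risk_score(SIGNAL_WEIGHTS), is_flagged(SIGNAL_WEIGHTS))
```

With these made-up weights, each signal alone scores at most 30, while the four together score 90 against a threshold of 60. Nothing in the score records that the VPN was mandated or that an auditor approved it.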

So the population most able to pay for an enterprise tier is also the population whose baseline behavior most resembles abuse. The provider has no economic incentive to tune the model toward that population specifically: banning a real scraper costs the provider nothing, and the cost of wrongly banning a Fortune 500 customer is borne entirely by the customer until they escalate publicly.

The contractual gap

Here is the asymmetry, stated cleanly. When the model is right, the provider saves moderation cost. When the model is wrong, the customer's developers stop shipping. There is currently no contractual instrument that prices that asymmetry into the relationship.

Standard SaaS terms of service give the provider sole discretion to suspend for "suspected abuse." That clause was written for a world where the worst case was someone losing access to a chat product. It was not written for a world where the same clause covers the IDE half your engineering org is now embedded in. The legal language has not caught up to the operational reality.

A few questions follow that nobody in the industry has good answers to. Whose insurance covers a wrongful ban during a release window? Does it count as a vendor outage for SLA purposes if the ban turns out to be a false positive? If a regulator audits your SDLC and finds your "AI pair programmer" was offline for four business days because your VPN matched a heuristic, who carries that finding?

Buying enterprise SaaS in 2026 means buying into someone else's classifier. That is a category of dependency the procurement playbooks have not yet been updated for. It is also exactly why sovereign, self-hosted tooling that you actually own is no longer a niche concern. It is a business-continuity one.

Next: the appeal black hole

The corporate-VPN ticket is not the worst part of this story. The worst part is what happened after the form was submitted: nothing, for days, with no SLA, no escalation path, and no human in the loop. The structure of that "appeal" — and why it is the single most expensive thing about depending on a hosted coding agent — is the next post in this series.

Until then: if your CI is gated on a vendor's classifier deciding your VPN is fine today, you don't have a build pipeline. You have a counterparty.