Vibe Killer: Six Weeks, Ten Trust Failures, and the Cleanup Crew We Just Hired
When the world of "vibe everything" gets a cyber smack in the face.
Two weeks ago, Anthropic launched a model it said was too dangerous to release. Within hours, a group of hobbyists on a Discord channel were using it.
The model is called Claude Mythos. Anthropic shipped it under a $100M controlled-release program called Project Glasswing, built around the premise that it shouldn’t be on the open internet. The System Card that accompanied the launch described a pre-release evaluation in which Mythos autonomously escaped a sandbox, chained a multi-step exploit to reach the internet, and emailed a researcher, without being told to.
The Discord group didn’t need an exploit. They got in through a third-party contractor who had been granted legitimate pentest access, combined with endpoint URL patterns they inferred from Anthropic’s conventions for previous models. Gizmodo reported the access was also enriched by data spilled from a recent breach at Mercor, an AI-labeling vendor whose customer list includes Anthropic. Other outlets treated that connection as adjacent rather than causal. Either way, the ingredients were ordinary: shared credentials, pattern-matching, and a data dump from somewhere in the contractor ecosystem. No zero-day. Two weeks later, Bloomberg published.
In the group’s own words: they like messing around with new models. They are not trying to wreak havoc.
What the Mythos story actually is, once you pull the camera back, is the last incident in a six-week wave.
Back up to February. An employee at a small AI startup called Context.ai downloads a Roblox cheat script onto a work laptop. The script carries an infostealer. The attackers quietly walk off with Google Workspace credentials, API keys, and the support@context.ai inbox, then wait.
Around the same time, a separate crew steals a publishing token from Aqua Security’s Trivy, a vulnerability scanner that sits inside thousands of developer pipelines. Aqua rotates credentials, but not cleanly enough. By mid-March, a group called TeamPCP turns that foothold around and pushes malicious code into Trivy itself. Every downstream pipeline that runs it starts leaking secrets. Within a week they’ve used those stolen secrets to poison Checkmarx, then LiteLLM. LiteLLM is the plumbing that routes requests between applications and the major AI providers. It’s installed in roughly a third of cloud environments. Its customer list includes Mercor, an AI contractor vendor. Mercor’s customer list includes Anthropic.
On March 26, security researchers stumble into a misconfiguration in Anthropic’s own website and find a draft blog post describing an unreleased model codenamed Capybara. Fortune and CNBC run the story the next day. A Discord channel that tracks unreleased AI models now knows what to look for.
March 31, a North Korean state actor compromises Axios, the most-used HTTP client in JavaScript. For three hours, every system in the world that auto-updated pulled a backdoor.
April 7, Anthropic launches Mythos. The Discord group is inside within hours.
April 19, Vercel confirms that the Context.ai infostealer access from February was used to take over an employee’s Google Workspace account and get into internal systems. The stolen data goes up for sale on a hacker forum for $2 million.
April 20, a researcher publishes that the vibe-coding platform Lovable has been exposing private chats, source code, and database credentials from every project created before November 2025. Lovable’s first public response was that they had not suffered a breach. A few hours later they acknowledged they had, in fact, accidentally re-enabled public access to every user’s chat logs back in February during a backend refactor, and hadn’t noticed for two months. Before that refactor, “public” on Lovable had meant the whole project, including every conversation with the AI, visible to anyone on the internet, intentionally, as a product decision. More than 10% of the projects on the platform were leaking sensitive data.
April 21, Bloomberg publishes on Mythos.
Four failures
The dependency ladder no one climbs. LiteLLM sits in 36% of cloud environments. Axios is in roughly every JavaScript codebase on earth. The median production app in 2026 pulls over a thousand transitive dependencies and almost nobody reads past depth one. When TeamPCP poisoned Trivy, they were reading downstream. When Sapphire Sleet poisoned Axios, they were counting on CI/CD pipelines that auto-pull the latest.
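To make "almost nobody reads past depth one" concrete, here is a minimal sketch of measuring how deep a dependency tree goes. The graph and its package names are hypothetical placeholders, not real packages or real lockfile data; in practice you would build the graph from your own lockfile.

```python
from collections import deque

# Hypothetical dependency graph: package -> its direct dependencies.
# All names are illustrative placeholders.
GRAPH = {
    "app": ["http-client", "llm-gateway"],
    "http-client": ["url-parser"],
    "llm-gateway": ["http-client", "yaml-loader"],
    "url-parser": [],
    "yaml-loader": ["c-binding"],
    "c-binding": [],
}


def dependency_depths(graph, root):
    """Return the shortest distance from root to each reachable package (BFS)."""
    depths = {root: 0}
    queue = deque([root])
    while queue:
        pkg = queue.popleft()
        for dep in graph.get(pkg, []):
            if dep not in depths:
                depths[dep] = depths[pkg] + 1
                queue.append(dep)
    return depths


def beyond_depth_one(graph, root):
    """Packages that a review of direct dependencies alone would never see."""
    depths = dependency_depths(graph, root)
    return sorted(pkg for pkg, depth in depths.items() if depth > 1)


if __name__ == "__main__":
    print(beyond_depth_one(GRAPH, "app"))
```

Even in this toy graph, half of what ships sits below the line most reviews stop at. A real tree with a thousand transitive dependencies only makes the ratio worse.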
OAuth is the new phishing. A Vercel employee signed up for an AI office-suite product with their enterprise Google account and granted “Allow All” permissions. Not a phishing email. A sanctioned OAuth grant to a small AI startup that turned out to have an infostealer-infected employee who liked Roblox cheats. The attack path reads as legitimate traffic in every log it touches.
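One way to see why this path looks legitimate in the logs: the grant itself is the only artifact, so the grant inventory is what you have to audit. The sketch below flags third-party grants that hold broad scopes. The two scope URLs are real Google OAuth scopes (full Gmail and full Drive access), but the decision to flag exactly these, the grant records, and the app names are assumptions for illustration, not an official risk list or real data.

```python
# Scopes treated as "broad" in this sketch. Both are real Google OAuth
# scopes, but this particular shortlist is an assumption, not a standard.
BROAD_SCOPES = {
    "https://mail.google.com/",               # full Gmail access
    "https://www.googleapis.com/auth/drive",  # full Drive access
}

# Hypothetical inventory of third-party OAuth grants (users and apps invented).
GRANTS = [
    {
        "user": "alice@example.com",
        "app": "ai-office-suite",
        "scopes": [
            "https://www.googleapis.com/auth/drive",
            "https://mail.google.com/",
        ],
    },
    {
        "user": "bob@example.com",
        "app": "calendar-helper",
        "scopes": ["https://www.googleapis.com/auth/calendar.readonly"],
    },
]


def risky_grants(grants, broad_scopes=BROAD_SCOPES):
    """Return (user, app, broad scopes held) for every grant that holds any."""
    flagged = []
    for grant in grants:
        hits = sorted(set(grant["scopes"]) & broad_scopes)
        if hits:
            flagged.append((grant["user"], grant["app"], hits))
    return flagged


if __name__ == "__main__":
    for user, app, scopes in risky_grants(GRANTS):
        print(user, app, scopes)
```

A check like this does not stop the grant. It only surfaces which small vendors are one infected laptop away from your inbox.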
Public-by-default is a product decision now. Lovable is the sharpest example because they said the quiet part out loud. They designed a system where “public” meant everything: chat logs, error messages, pasted credentials, the whole development narrative. Then they shipped it to millions of people who thought “public” meant “others can see my published app.” When security researchers reported the bug, Lovable’s partners at HackerOne closed it as “not planned.” When journalists asked, Lovable denied a breach had occurred, then blamed their own documentation, then blamed HackerOne for not escalating, then apologized for the apology. The VibeWrench study of 100 vibe-coded apps this March found 70% lacked CSRF protection, 41% exposed secrets, 21% had no authentication on API endpoints. Junior engineers at least know what RLS (row-level security) is. The problem here is millions of people shipping production applications who have never heard the acronym, and who believe, reasonably given the marketing, that the tool handles it.
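The alternative to "public means everything" is not exotic. Here is a minimal sketch, assuming a hypothetical project model in which every artifact carries its own visibility and defaults to private. The class, artifact names, and methods are invented for illustration and do not describe Lovable's actual implementation.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    PRIVATE = "private"
    PUBLIC = "public"


def _default_visibility():
    # Every artifact starts private; nothing is public until explicitly set.
    return {
        "published_app": Visibility.PRIVATE,
        "chat_logs": Visibility.PRIVATE,
        "source_code": Visibility.PRIVATE,
        "db_credentials": Visibility.PRIVATE,
    }


@dataclass
class Project:
    """Hypothetical project model with per-artifact visibility."""
    name: str
    visibility: dict = field(default_factory=_default_visibility)

    def publish_app(self):
        # Publishing exposes only the built app, not the development history.
        self.visibility["published_app"] = Visibility.PUBLIC

    def can_view(self, artifact, is_owner):
        # Owners see everything; everyone else is denied unless the artifact
        # is explicitly public. Unknown artifacts fall back to private.
        if is_owner:
            return True
        return self.visibility.get(artifact, Visibility.PRIVATE) is Visibility.PUBLIC


if __name__ == "__main__":
    project = Project("demo")
    project.publish_app()
    print(project.can_view("published_app", is_owner=False))
    print(project.can_view("chat_logs", is_owner=False))
```

The design choice is default deny: "public" is a property of one artifact, so a user who publishes an app has made exactly one thing visible.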
The perimeter is contractors, not code. The Mythos access path ran through a third-party contractor with pentest credentials, combined with data that spilled out somewhere in the contractor ecosystem. That material existed because Anthropic, like every major AI lab, routes work through specialized vendors who in turn depend on open-source infrastructure that was itself recently compromised. The Vercel breach ran through an AI-tool vendor’s infostealer-infected employee. Every perimeter in this cascade was a trust relationship, not a firewall. Miles Brundage asked the right question on X: what rights does Anthropic actually have over how its hyperscaler partners grant downstream access? Nobody seems sure. I work in national security AI, and I spend a lot of time inside exactly this failure mode: what it means to deploy AI in environments where contractor-credential sprawl isn’t an embarrassing X post but a formal incident. The pattern I see is that the vendor ecosystem has grown faster than the governance tooling for it. The policies haven’t caught up with the contracts. The Mythos breach isn’t an outlier. It’s a preview.
Is this different from the cybersecurity of the past?
Some of it isn’t. Supply-chain attacks have been around since SolarWinds, and arguably since Ken Thompson’s compiler-trust paper in 1984. Every mature CISO has a binder (yes, some of us used binders) on all of it.
Three things are genuinely different.
Compositional velocity. We invented a new architectural category (AI gateways, AI office suites, AI coding assistants) that concentrates credentials for dozens of providers in one place and that the industry adopts in months. LiteLLM went from useful library to node in a third of cloud environments in under two years.
Who’s building. Vibe coding moves authorship from people who understand they’re shipping a server to people who don’t. And it isn’t confined to startups. Fortune 500s are rearchitecting core workflows on Lovable, Cursor, Vercel, and the rest of the stack, often with engineering headcounts that wouldn’t have been considered adequate for a single internal tool five years ago. The permission structure of enterprise software is being redrawn, at speed, by people who don’t necessarily read the dependency tree. Even in my world of national security and mission-critical industries, there are platforms in serious rotation right now that would not survive a real code review. The appetite for building tooling this way is growing, not shrinking.
Tempo. The Vercel CEO attributed the attacker’s “surprising velocity” to AI augmentation. Microsoft’s April threat intel documented live actors using generative AI for code generation and backend orchestration. For the first time, the builder’s velocity advantage has been canceled by the attacker’s.
The cleanup crew is the arson squad
The same class of model that makes the Mythos access story notable is also the cleanup crew we are about to hire en masse.
Mozilla used Mythos Preview to find and patch 271 Firefox vulnerabilities in a single release. Project Glasswing vendors are auditing their own attack surfaces with the tool that would otherwise be auditing them. Every enterprise that shipped vibe-coded apps last quarter will eventually pay for a Mythos-class model to go read its own codebase and tell it where the RLS policies are missing and which endpoints never had authentication in the first place. The cleanup crew and the arson squad are the same hire.
What’s different now is the compression. The wreckage and the repair are happening in the same month, sometimes the same week. The question isn’t whether the cleanup happens; it will, again and again. The question is whether it catches up in time, and whether the world cares enough to pay for it when so much of the landscape is driven by speed and marketing, including, lately, doomer marketing (the Mythos marketing among it). Over the last year the AI conversation has lived almost entirely at two poles: this will save the world, or this will end it. Neither vocabulary gets you to the boring middle where the actual work is.
Still optimistic, with one caveat
I am, and not reluctantly.
We are laying an incredible foundation for how the world interacts with intelligence. The categories being built right now (agentic systems, AI-native enterprise tooling, the reshaping of what software authorship even means) are among the most important things happening in technology, and they are worth what they cost. I’d rather be in the world that ships this and has to clean up behind itself than the world that refuses to ship because the cleanup is hard.
What gives me pause is the mismatch. We spend our mornings flagging existential risk, publishing 244-page System Cards, briefing Congress on what agentic models can do in the wrong hands. Then we spend our afternoons making supply-chain mistakes that would embarrass a junior developer. A company publishes a controlled-release program designed to keep a dangerous model out of the wrong hands, and the wrong hands are in the model before lunch, through shared contractor credentials and a URL pattern. A vibe-coding platform worth $6.6 billion responds to a data exposure affecting thousands of projects by saying they did not suffer a breach, then admitting they’d made every user’s chats public for two months.
I don’t know whether the cleanup discipline catches up in time, or whether the market rewards the companies that build more slowly and more carefully. Every incentive right now rewards speed. Even the doomers are selling speed. They just disagree about what we’re speeding toward.
Weeks like this one don’t make me less optimistic about where we’re going. They make me more skeptical about who’s driving and what incentives are at play.


