There is a company called Mercor. You probably haven’t heard of it. It’s worth $10 billion, it works with OpenAI, Anthropic, Meta, and Google, and until last week it held some of the most sensitive secrets in AI: not just data, but the actual blueprints for how the most powerful models on earth are trained. Last Thursday, hackers put 4 terabytes of that data up for auction.
The Middleman Nobody Talked About
Mercor’s business model is elegant and a little strange. It recruits experts (doctors, lawyers, mathematicians, coders) to evaluate AI outputs and help companies like OpenAI and Anthropic improve their models. That process is called RLHF (Reinforcement Learning from Human Feedback), and it’s central to why ChatGPT doesn’t sound like a malfunctioning microwave anymore. Mercor is the invisible layer between the labs and the human teachers. Its clients trusted it with something more valuable than the models themselves: the methodology. The labeling protocols. The selection criteria. The training strategies that turn a raw model into a product people pay for.
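For readers who haven’t seen RLHF data up close, here is a minimal sketch, in Python, of the kind of preference record that human feedback produces. The schema and every value in it are hypothetical; this illustrates the shape of the data, not Mercor’s or any lab’s actual format.

```python
from dataclasses import dataclass

@dataclass
class PreferenceRecord:
    """One unit of human feedback: an evaluator picks the better of two outputs."""
    prompt: str
    response_a: str
    response_b: str
    preferred: str       # "a" or "b", the evaluator's judgment
    rubric_version: str  # which labeling protocol the evaluator followed

# Hypothetical example of what a recruited expert might produce:
record = PreferenceRecord(
    prompt="Explain photosynthesis to a ten-year-old.",
    response_a="Plants use sunlight like a kitchen uses electricity...",
    response_b="Photosynthesis proceeds via the Calvin cycle, wherein...",
    preferred="a",
    rubric_version="helpfulness-v3",  # hypothetical protocol identifier
)
```

Millions of records like this, plus the rubrics telling evaluators how to choose, are what the human feedback layer actually consists of.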
This is the kind of thing AI labs treat like nuclear codes. You can reverse-engineer a model’s outputs. You cannot easily reverse-engineer the specific decisions a team made about how to shape it over years of fine-tuning. That’s the real IP. And it was sitting in Mercor’s infrastructure.
40 Minutes Was All It Took
The attack vector is almost comically mundane. LiteLLM is an open-source Python library that lets developers connect applications to AI APIs. It’s used in roughly 36% of cloud environments. On March 27, a hacking group called TeamPCP used a compromised maintainer’s credentials to publish two malicious versions of the LiteLLM PyPI package (versions 1.82.7 and 1.82.8). Those versions were live for approximately 40 minutes before being pulled.
Forty minutes. In that window, thousands of automated systems across the AI industry pulled the poisoned package as part of routine dependency updates. Mercor was one of them. The malicious code gave attackers a foothold into Mercor’s internal infrastructure, and from there the exfiltration began.
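To make the mechanics concrete: the first triage step after an advisory like this is asking every environment whether it pulled a poisoned release. A minimal sketch of that check, assuming nothing beyond the two version numbers reported above:

```python
# Flag an environment that installed one of the two poisoned LiteLLM releases.
from importlib.metadata import PackageNotFoundError, version

MALICIOUS_VERSIONS = {"1.82.7", "1.82.8"}  # the two malicious releases reported

try:
    installed = version("litellm")
except PackageNotFoundError:
    print("litellm is not installed in this environment.")
else:
    if installed in MALICIOUS_VERSIONS:
        print(f"COMPROMISED: litellm {installed} is a known-malicious release.")
    else:
        print(f"litellm {installed} is not one of the flagged releases.")
```

The catch, of course, is that by the time a check like this runs, any payload in the poisoned package has already executed. Detection after the fact is not prevention.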
This is exactly the kind of systemic risk that Anthropic and other labs talk about when they discuss AI safety in broad terms, but almost never apply to their own operational security stack. The assumption seems to be that sophisticated AI companies are protected by sophisticated security. The reality is that their most sensitive assets sometimes flow through a $10 billion intermediary that runs on the same open-source tooling as everyone else.
Lapsus$ Is Now Selling the Recipe
After the breach, the notorious extortion group Lapsus$ (yes, the same group that previously hit Samsung, Nvidia, and Microsoft) claimed credit and announced it was auctioning the stolen data. The alleged haul: candidate profiles, employer data, video interviews, source code, internal credentials, Tailscale VPN access, and, most critically, the training datasets and methodologies belonging to Mercor’s AI lab clients.
Meta moved first. The company suspended its partnership with Mercor within days of the news breaking. That reaction tells you something. Meta has its own massive AI infrastructure and its own security teams. If it judged the exposure serious enough to immediately cut ties with a major vendor, the data that was exposed was not routine.
TeamPCP and Lapsus$ are reportedly collaborating to monetize the data and access stolen across the broader LiteLLM supply chain attack, which means Mercor may be only one of many companies affected. The scale is still being assessed.
The Problem With Outsourcing Your Brain
Here’s the question nobody is asking loudly enough: why were the training secrets of OpenAI, Anthropic, Meta, and Google all concentrated in a single third-party vendor with apparently standard open-source dependencies?
There’s a parallel worth drawing here. The AI industry has spent enormous energy worrying about model behavior problems: alignment, sycophancy, hallucinations. Entire research programs exist around making sure models don’t tell users what they want to hear rather than what’s true. But the pipeline that creates those models (the human feedback layer, the labeling, the fine-tuning infrastructure) apparently runs on vibes and a hope that the vendor checked their PyPI dependencies recently.
The irony is layered. Companies building systems designed to be trustworthy at massive scale trusted their most sensitive operational secrets to a third-party service running on a library that could be poisoned through a single compromised maintainer account. The security surface of AI training is not just about the model weights. It’s about every human and system that touches the data.
What Actually Gets Exposed in a Training Data Breach
Most people reading headlines about “AI training data leaked” probably imagine some database of text scraped from the web. That’s not what this is. What Mercor held for its clients includes the following (a hypothetical sketch of one such artifact appears after the list):
- Labeling protocols: the specific instructions given to human evaluators to rate AI outputs. These encode value judgments about what good responses look like, and are closely guarded because they reveal the philosophical choices baked into a model’s personality.
- Data selection criteria: which examples were used to train and fine-tune, which were rejected, and why. This is how you understand what a model was deliberately shaped to do (and not do).
- Training strategies: the sequencing, weighting, and methodological choices that distinguish one lab’s approach from another. Reproducing a competitor’s model from its outputs is extremely hard. Acquiring their methodology is a shortcut past what would otherwise take years of independent research.
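To make the stakes concrete, here is a purely hypothetical sketch of what one of these artifacts, a labeling protocol, might look like as structured data. Every identifier and value below is invented; none of it comes from the leaked material.

```python
# Purely hypothetical: the rough shape a labeling protocol might take.
labeling_protocol = {
    "protocol_id": "response-quality-v7",
    "instructions": (
        "Prefer responses that admit uncertainty over responses that "
        "guess confidently. Penalize fabricated citations heavily."
    ),
    "rating_scale": {1: "harmful", 3: "acceptable", 5: "ideal"},
    "rejection_criteria": ["fabricated sources", "unsafe medical advice"],
    "evaluator_requirements": ["MD, JD, or PhD", "passed calibration test"],
}
```

Even a toy version makes the point: the protocol encodes what a lab decided “good” means, which is exactly the judgment a competitor cannot observe from a model’s outputs alone.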
If the stolen data is as described, any well-resourced actor with access to it could potentially replicate core aspects of the fine-tuning approaches behind GPT-5, Claude, or Llama. That’s not a hypothetical future risk. It’s on the auction block right now.
A Supply Chain Problem That Was Always There
The Mercor breach is the largest AI-specific supply chain attack so far, but it fits a pattern that has been building for years. Open-source package ecosystems like PyPI and npm are structurally vulnerable to this kind of attack. Maintainer accounts get compromised. Malicious versions get published. Automated CI/CD systems pull them before anyone notices.
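The standard mitigation is hash pinning: record a digest for every vetted artifact and refuse anything that doesn’t match, so a freshly published release fails closed instead of flowing straight into CI. A minimal sketch of the idea, with a placeholder digest:

```python
# Hash pinning in miniature: trust an artifact only if its digest matches
# the value recorded when the dependency was originally vetted.
import hashlib
from pathlib import Path

# Placeholder; in practice this comes from a lockfile written at vetting time.
PINNED_SHA256 = "0" * 64

def artifact_is_trusted(wheel_path: Path) -> bool:
    digest = hashlib.sha256(wheel_path.read_bytes()).hexdigest()
    return digest == PINNED_SHA256
```

pip supports this natively: `pip install --require-hashes -r requirements.txt` refuses any package whose hash differs from the one pinned in the file. That blocks unreviewed upgrades like the 40-minute LiteLLM window, though only if regenerating the lockfile is itself gated by human review.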
The difference here is the target. Previous supply chain attacks have gone after banking credentials, crypto wallets, developer environments. This one happened to land inside the infrastructure connecting every major Western AI lab to the humans who shape their models. The blast radius is not a website going down or a crypto wallet getting drained. It’s the competitive intelligence layer of an industry valued in the trillions.
Scientists have experimented with planting ideas into minds in controlled settings. Lapsus$ has apparently managed something structurally similar with the AI industry: a 40-minute window, a poisoned dependency, and now the inner workings of how the world’s most-used AI systems were taught to think are on the open market.
Mercor says it is “one of thousands” of companies affected by the LiteLLM compromise. That’s either reassuring context or a much larger problem than anyone has acknowledged yet. Probably both.