AI supply chain attacks don’t even require malware…just post poisoned documentation

Foto: The Register
As many as 58 out of 97 analyzed samples of documentation changes were accepted without proper verification, opening a path for hackers to seize control of the software development process without using a single line of malicious code. The new Context Hub service, launched by Andrew Ng to provide AI agents with up-to-date API data, became the target of a proof-of-concept attack carried out by Mickey Shmueli. The expert demonstrated that the lack of content sanitization mechanisms in the documentation processing pipeline allows for injecting "poisoned" instructions directly into large language models. Unlike traditional supply chain attacks, this method does not require infecting libraries with malware. An attacker simply sends a pull request with fraudulent documentation that suggests using non-existent or attacker-controlled dependencies. Coding agents, such as Claude Code, uncritically download this data via an MCP server and automatically place malicious packages in configuration files, such as requirements.txt. For global users, this represents a massive risk: AI-generated code may appear correct and clean, while in reality, it contains security vulnerabilities introduced during the model's "education" stage. The automation of programming now necessitates rigorous verification not only of the code itself but, above all, the sources of truth upon which autonomous assistants rely.
In the world of software development, we have become accustomed to the fact that supply chain attacks require complex code, infecting libraries, or hijacking developer accounts. However, the latest research into the Context Hub service, launched by AI pioneer Andrew Ng, proves that in the era of autonomous coding agents, plain text is enough. Instead of writing malware, an attacker can simply publish "poisoned" documentation, which artificial intelligence will uncritically implement into a project.
New architecture, old mistakes
The problem emerged with the release of Context Hub — a service designed to solve one of the most irritating problems for developers using AI: hallucinations regarding API parameters. As Andrew Ng noted, models like Claude Code often reach for outdated methods, e.g., calling the older chat completions API from OpenAI instead of the newer responses API (available in the GPT-5.2 model), even though the latter has been on the market for a year. Context Hub is intended to provide agents with up-to-date knowledge via an MCP (Model Context Protocol) server, acting as a bridge between documentation and the model.
The service's operating mechanism relies on the community: developers submit documentation as GitHub pull requests (PR), and moderators approve them. This is where Mickey Shmueli, creator of the competing service lap.sh, spotted a critical vulnerability. In his Proof of Concept (PoC), he demonstrated that the entire process suffers from a complete lack of content sanitization. If a malicious instruction or a fake dependency makes its way into the documentation, the AI agent will treat it as revealed truth.
Read also
Poisoned documentation instead of a virus
The PoC attack carried out by Shmueli is strikingly simple. Instead of hoping that the model will invent a package name on its own (which happens regularly), the attacker feeds it that name in official-looking documentation. In the test, forged instructions were prepared for Plaid Link and Stripe Checkout, containing non-existent, potentially malicious PyPI packages. The AI agent, reading such documentation, automatically adds these libraries to the project's requirements.txt file.
Statistical data from the Context Hub repository raises concerns about the effectiveness of moderation. Out of 97 closed PR submissions, as many as 58 were accepted and merged into the main database. Shmueli points out that the review process seems to favor the quantity of documentation over its security. In the repository code, no traces of automatic scanning for executable instructions or suspicious package references were found. The result? The agent downloads the data, reads the poisoned content, and builds the project, generating clean code without any warnings.
Model resilience: Anthropic's arms race
The results of tests conducted on various models of the Anthropic family are interesting. Shmueli checked how individual versions handle detecting forged dependencies in documentation:
- Haiku: The weakest result. In 40 out of 40 attempts (100%), it uncritically entered the malicious package into the configuration file, never once raising doubts.
- Sonnet: Showed moderate vigilance. It issued warnings in 48% of cases (19/40), but still implemented the fake library in 53% of attempts (21/40).
- Opus: The highest model proved to be the most resilient. It issued warnings in 75% of tests (30/40) and ultimately did not allow the code or the requirements.txt file to be infected.
The ability of Opus to detect anomalies stems from the fact that it is trained on a larger number of packages and exhibits a higher degree of sophistication in context analysis. Nevertheless, the fact that even advanced models are deceived in nearly half of the cases (like Sonnet) shows that relying solely on model intelligence is a risky strategy.
A problem that cannot be easily fixed
The situation with Context Hub is actually a new version of the well-known indirect prompt injection problem. Large language models inherently cannot effectively distinguish between data (documentation) and system instructions. When an AI agent is given access to external knowledge sources, every line of text can become a command. Shmueli emphasizes that the problem does not only concern Andrew Ng's service — practically all current systems providing community-created documentation for AI fail in terms of content verification.
"The agent downloads the documentation, reads the poisoned content, and builds the project. The response looks completely normal. Working code. Clean instructions. No warnings." – Mickey Shmueli, creator of lap.sh.
In the current ecosystem, where the speed of deploying new AI tools wins over rigorous security audits, developers face a new challenge. Traditional SCA (Software Composition Analysis) tools may not detect the threat because, to them, a malicious package looks like any other new dependency added by a programmer — in this case, a programmer in the form of an algorithm.
The end of the era of uncritical trust
The Context Hub incident should be a wake-up call for the entire industry. An architecture in which an AI agent has direct output to the network and permissions to modify configuration files, while being fed unverified external data, is fundamentally flawed. This is a classic example of the "lethal trifecta" — a threat model described by Simon Willison, where access to private data, access to the network, and lack of instruction verification combine.
It can be assumed that in the near future, it will become standard to isolate coding agents in sandbox environments without access to the external network or to force the use of only local, digitally signed documentation databases. Until AI systems learn to separate the data layer from the control layer, every public knowledge library will remain a potential attack vector. Supply chain security in the AI era no longer depends on what a hacker types in the code, but on what the machine reads.





