
The Pentagon is planning for AI companies to train on classified data, defense official says

Redakcja Pixelift

Photo: MIT Tech Review

The Pentagon plans to let artificial intelligence companies access classified data to train their AI models, a high-ranking Department of Defense official has revealed. The initiative aims to strengthen military capabilities through advanced AI, and it marks a sharp departure from past practice: classified information has until now been walled off from external entities, and the Pentagon is now considering changing that policy to enable cooperation with commercial leaders in the AI industry. For technology companies, this means potentially significant government contracts and access to unique datasets; it also raises questions about the security of that data and the risk of leaks of sensitive information. The decision reflects a growing conviction in military circles that technological competition with China and Russia requires private-sector engagement. Before a full program can launch, however, the Pentagon must resolve issues of access control and protection of state secrets.

The Pentagon is preparing a radical change in how it cooperates with artificial intelligence companies. According to MIT Technology Review, the Department of Defense plans to create secure environments where makers of generative models, companies like Anthropic, OpenAI, or other industry players, would be able to train military versions of their systems directly on classified data. This is not a simple evolution; it is a fundamental shift in how national security and sensitive information are managed.

The initiative responds to growing military interest in AI applications and to the fact that the U.S. military is already using such tools in operational conditions. Models such as Anthropic's Claude are reportedly already being used for target analysis in Iran and other sensitive intelligence tasks. However, the current arrangement, in which existing models merely receive access to classified information during deployment, has serious limitations. The Pentagon wants to go further: allow these models to learn from secret data, which could produce specialized AI systems tailored to military needs.

This move raises a series of questions about security, control, and the future of relations between the technology industry and government institutions. If the Pentagon truly opens access to its most sensitive data to private technology companies, the consequences could be both promising and dangerous.

How AI currently functions in the military

To understand the significance of this plan, one must first know how the Pentagon uses AI today. Models like Claude are already being deployed in environments containing classified data, but in a very specific way. Systems operate in isolated networks, so-called air-gapped environments, where they have access to sensitive information but cannot transmit anything outside these secure zones.
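The isolation described above can be made concrete with a toy sketch. This is not the Pentagon's actual setup; it is a minimal illustration of the principle that a model-serving process inside such a zone should hard-fail any attempt to reach a network, with all names hypothetical:

```python
import socket

class NetworkDisabledError(RuntimeError):
    """Raised when code inside the isolated zone tries to open a socket."""

def _blocked(*args, **kwargs):
    raise NetworkDisabledError("outbound networking is disabled in this enclave")

def enforce_air_gap():
    # Replace socket creation so any network attempt fails loudly,
    # instead of silently leaking data out of the secure zone.
    socket.socket = _blocked

def run_inference(prompt: str) -> str:
    # Placeholder for a locally loaded model; no external calls are possible.
    return f"analysis of: {prompt}"

enforce_air_gap()
result = run_inference("satellite image batch 42")
```

In a real deployment the guarantee would come from physical network separation, not application code; the point of the sketch is only that "air-gapped" means inference can proceed while any outbound channel is structurally absent.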

Practical applications are already surprisingly advanced. The military uses AI to analyze satellite imagery, process large amounts of intelligence data, and support tactical decisions. In the Iranian context — as mentioned in reports — models help identify and analyze potential targets. These are not speculations or future scenarios; this is happening now.

The problem with the current model is that general models — trained on public data — are not optimally adapted to specific military needs. Claude or GPT-4 are universal tools that had to learn from publicly available data. When the Pentagon wants to use them to analyze classified documents, the system must deal with data far more specialized than anything it saw during training. This limits its effectiveness.

Secure environments for training on classified data

The Pentagon's new plan changes this dynamic. Instead of just using existing models in secure networks, the Department of Defense wants to allow companies like Anthropic to actually train their models on classified data. This means that Claude or other systems could learn directly from military databases, classified documents, and intelligence information.

To make this secure, the Pentagon plans to create dedicated, completely isolated computing infrastructures. Technology companies would not have access to this data outside strictly controlled environments. The process would look as follows: the Pentagon itself would provide the hardware, network connections, and data access, while companies like Anthropic would provide the algorithms and model-training expertise. Training would take place entirely in a controlled security zone, with no possibility of data exfiltration.
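That division of labor can be sketched as a toy pipeline, with every name hypothetical: the government side supplies the storage root and the documents, the vendor side supplies training code that runs in place, and only a checkpoint digest would ever need to be reported outside the zone. Real training would involve a neural model; simple token counting stands in for it here:

```python
import hashlib
import json
import tempfile
from pathlib import Path

def load_corpus(root: Path) -> list[str]:
    # Government-controlled mount: documents are read in place, never copied out.
    return [p.read_text() for p in sorted(root.glob("*.txt"))]

def train_adapter(corpus: list[str]) -> dict:
    # Stand-in for vendor training code; here, just token frequencies.
    counts: dict = {}
    for doc in corpus:
        for tok in doc.split():
            counts[tok] = counts.get(tok, 0) + 1
    return counts

def save_checkpoint(weights: dict, root: Path) -> str:
    # The checkpoint stays inside the enclave; only its hash is reportable outside.
    blob = json.dumps(weights, sort_keys=True).encode()
    (root / "adapter.json").write_bytes(blob)
    return hashlib.sha256(blob).hexdigest()

# Demo: a throwaway directory stands in for the secure mount.
with tempfile.TemporaryDirectory() as d:
    root = Path(d)
    (root / "report1.txt").write_text("alpha bravo alpha")
    weights = train_adapter(load_corpus(root))
    digest = save_checkpoint(weights, root)
```

The design choice the sketch captures is that data and trained artifacts share one boundary: nothing the vendor code produces lands outside the directory the government controls.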

This approach has theoretical advantages. A model trained on actual military data would be far more effective at analyzing similar data in the future. It could better understand military context, specialized terminology, military organizational structures, and patterns in intelligence data. For the Pentagon, this would mean AI that truly understands its problems, rather than a general tool requiring constant adaptation.

Why the Pentagon doesn't build this itself

The question that naturally arises is: why doesn't the Department of Defense develop its own AI models from scratch instead of cooperating with private companies? The answer is prosaic but important. The Pentagon simply does not have such internal capability.

Building and training state-of-the-art generative models requires not only astronomical computing resources but also teams of scientists, engineers, and experts that the Pentagon does not possess in sufficient numbers. The technology industry attracts these talents with better salaries, more attractive work environments, and the opportunity to publish research. Anthropic, OpenAI, or Google have hundreds of specialists engaged in AI development; the Pentagon has only a handful.

Additionally, technology companies already possess advanced, functioning models. Rather than investing billions of dollars in building something from scratch over years, the Pentagon can use existing solutions and adapt them to its needs. This is a pragmatic approach, though it raises serious security and control challenges.

Cooperation with private companies also means that the Pentagon can benefit from advances in AI research without having to finance the entire infrastructure itself. However, the price for this is high: the need to share the most sensitive data with the private sector.

Security threats and potential leaks

This is where things become truly interesting — and concerning. Allowing technology companies access to Pentagon classified data involves inherent security risks. Even if all training takes place in an isolated environment, simply employing workers with access to such data creates an attack vector.

History shows that the largest classified information leaks often come from people with legitimate access. Edward Snowden had access to NSA data; Chelsea Manning had access to military reports. If an engineer at Anthropic has access to Pentagon classified data — even only for the purpose of training a model — they become a potential target for foreign intelligence services, hackers, or may act on their own ideological motivations.

The Pentagon will have to introduce enhanced security procedures — thorough security clearances for employees of technology companies, monitoring of their activities, restrictions on where they can work and who they can communicate with. This will be complicated and costly, but absolutely necessary.

There is also a more subtle risk: what will happen to the knowledge gained during training? If Anthropic trains Claude on Pentagon classified data, will that knowledge remain contained in the model? Could it be extracted or reproduced? If Claude learns to recognize patterns in Iranian communication networks or military structures, could that knowledge be used in a commercial model available to the public? The Pentagon will have to introduce mechanisms that completely separate military versions of models from commercial versions.
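One published way to probe the question above is a "canary" test for memorization (the idea behind the "secret sharer" line of research): plant a unique marker string in the training data, then scan a model's completions for it. The sketch below is a toy version with stand-in "models", not a real extraction attack:

```python
# Toy canary probe: a unique marker is planted in training data, and
# completions are later scanned for it to detect memorization.
CANARY = "ZX-7781-KILO"  # hypothetical marker, never real classified content

def leaks_canary(generate, prompts) -> bool:
    """Return True if any completion reproduces the canary."""
    return any(CANARY in generate(p) for p in prompts)

# Stand-in "models": one memorized the marker during training, one did not.
def memorized_model(prompt: str) -> str:
    return f"{prompt} ... {CANARY}"

def clean_model(prompt: str) -> str:
    return f"{prompt} ... nothing relevant"

leaky_result = leaks_canary(memorized_model, ["status report?"])
clean_result = leaks_canary(clean_model, ["status report?"])
```

A separation regime of the kind the article describes would presumably need audits like this on the commercial model line, to confirm that nothing learned in the classified enclave surfaces in public products.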

Implications for the technology industry

From the perspective of technology companies — particularly Anthropic — this Pentagon proposal is both an opportunity and a threat. On one hand, access to Pentagon classified data, if used properly, could provide invaluable information for training models. Military and intelligence data is an extremely rich source of information about real problems and challenges.

However, engaging in such a project involves serious consequences for public image. Already, technology companies face criticism regarding the militarization of AI. Google's project with the Pentagon — Maven — met with resistance from employees and activists. If Anthropic formally engages in training models on classified data for military purposes, it could spark similar protests among its employees and in the broader technology community.

There is also the question of precedent. If the Pentagon gains access to training models on classified data, will other government agencies — CIA, NSA, FBI — want the same? Will other countries insist on similar agreements? This could create a world where AI models are trained on secret data by various governments, completely separated from the public AI research ecosystem.

Consequences for transparency and control

One of the fundamental challenges with this plan is the issue of transparency and external oversight. Currently, independent researchers can analyze public AI models, test them, search for errors and biases. This is crucial for security and accountability. A model trained on Pentagon classified data would be completely inaccessible to such analysis.

This means the Pentagon would have to rely solely on internal security tests conducted by Anthropic and its own teams. No one outside could check whether the model tends to recommend actions that could violate international humanitarian law, or whether it contains errors that could lead to unintended consequences. The result would essentially be a new paradigm, in which AI systems used in military decisions are completely opaque to public oversight.

For Polish readers of Pixelift, it is worth noting that this trend has global implications. If the United States establishes a precedent for training AI on classified data, other countries — including potential NATO allies — will want to do the same. Poland, as a NATO member, could find itself in a situation where there is pressure to cooperate with technology companies in a similar model. This raises questions about who will have access to Polish military and intelligence data and how it will be regulated.

Precedents and current practices

The Pentagon is not new to cooperation with private companies on sensitive projects. The history of DARPA (Defense Advanced Research Projects Agency) shows that the Department of Defense has experience working with universities and companies on classified projects. There are already legal and procedural mechanisms that allow for such cooperation.

However, AI is a different beast. Generative models are black boxes — even their creators do not always understand exactly how they make decisions. This makes them more unpredictable than traditional software. Additionally, training a model on Pentagon classified data would be the most ambitious project of its kind — far larger than previous initiatives.

It is also worth remembering that the Pentagon is already experimenting with AI in real operational conditions. Autonomous drones, data analysis systems, decision support systems — all use elements of AI. The transition to models trained on classified data would be a natural consequence of this evolution, but a far more radical leap.

Realistic implementation scenarios

If the Pentagon truly proceeds with implementing this plan, what could it look like in practice? Most likely, it would start with pilot projects with one or two companies — probably Anthropic, given its reputation in AI safety. Initial training could involve relatively less sensitive data — for example, analysis of open military reports or historical data.

If the pilot were successful, the Pentagon could gradually expand access to more sensitive data. The process would be lengthy — likely stretched over several years — due to the importance of security. Each phase would require approval from higher authorities and review of security procedures.

In a long-term scenario, the Pentagon could possess an entire range of specialized AI models, each trained on classified data for a specific application. It could have models for intelligence analysis, models for tactical planning, models for logistics analysis. Each would be adapted to specific needs and completely isolated from the public AI ecosystem.

Long-term perspective

If this Pentagon plan comes to fruition, we will witness a fundamental change in the AI ecosystem. Until now, the technology industry — both in the United States and globally — has developed largely independently of governments. Companies like OpenAI or Anthropic built models on public data and sold access to them to governments and businesses.

If the Pentagon gains access to training models on classified data, it would mean that the government becomes an integral part of the training process, rather than just an end user. This could create a precedent for other governments and agencies. Ultimately, we could see a world where there are two separate paths for AI development: public, available to everyone, and governmental, available only to select security agencies.

This would have profound implications for security, competition, and democracy. On one hand, it could give governments tools to better respond to security threats. On the other hand, it could create an asymmetry in access to advanced AI, where governments have access to systems far more advanced than public ones. This could also create pressure on technology companies to cooperate with governments, reducing their independence.

For Poland and Europe, this is particularly important. If the United States establishes this model, NATO will be under pressure to adopt a similar approach. European technology companies may find themselves in a situation where they are forced to cooperate with European governments in a similar way to remain competitive. This could have a significant impact on the European AI ecosystem and the independence of European technology companies.
