Meta rolls out new AI content enforcement systems while reducing reliance on third-party vendors

Meta will implement advanced AI systems for content moderation, gradually reducing cooperation with external providers. New algorithms will be responsible for detecting and removing materials related to terrorism, child exploitation, drugs, fraud and scams. The platform will deploy these systems across all its applications, but only when they consistently outperform current moderation methods. This is a significant step toward internal content control — Meta will be less dependent on external partners. For users, this means potentially faster response to dangerous content, but also the risk of algorithmic errors without human oversight. Investment in proprietary AI could improve moderation consistency, though history shows that automated systems often struggle with context and cultural nuances. Meta's decision reflects a broader trend of tech giants taking control of moderation instead of outsourcing it.
Meta is taking a decisive step toward full autonomy in content moderation. On Thursday, the company announced the rollout of advanced artificial intelligence systems that will take over functions previously delegated, in part, to external vendors. This is not mere process optimization; it is a fundamental change in how the company approaches one of the hardest problems in the technology industry. Meta says the new systems will detect violations more reliably, act faster, and reduce over-moderation, which has been a source of controversy among users and creators for years.
The decision by Mark Zuckerberg's company was not made in a vacuum. The industry faces growing regulatory pressure, and the cost of maintaining global teams of content moderators keeps rising. Meta, which handles billions of posts daily across its ecosystem (Facebook, Instagram, WhatsApp, Threads), needs a solution that is scalable, consistent, and, above all, effective. The change also signals the company's confidence in the capabilities of modern AI models, though it will certainly not come without challenges.
Where Meta's AI will watch over order
The new systems will be responsible for detecting and removing content in the most serious categories of violations: terrorism, child exploitation, drug trafficking, fraud, and scams. These are not trivial cases; they are areas where every second of delay can have serious consequences for user safety. Terrorism and child sexual exploitation are especially sensitive, because errors in either direction, removing too little or too much, can end in tragedy or in human rights violations.
Meta has not revealed exactly which applications will be covered in the first phase, but logic suggests the main platforms will come first: Facebook and Instagram. These services generate the largest volume of content and pose the greatest potential for harm if moderation fails. WhatsApp, because of end-to-end encryption, is a separate problem: there, AI will have to work under different conditions, analyzing metadata and context rather than the content of messages themselves.
It is worth noting that the company is focusing on so-called priority harms. This is a strategic choice. Instead of trying to automate everything (which would be impossible), Meta is concentrating on cases where AI has the best chance of success and where errors are least tolerated. Fraud and scams are somewhat easier to detect automatically than, say, hate speech or misinformation, which require a deep understanding of cultural context.
Moving away from external moderators — a matter of economics and control
Reducing relationships with third-party vendors is a key element of the strategy. For years, Meta has relied on networks of subcontractors — companies such as Accenture, Teleperformance, or Appen, which employed thousands of moderators in countries such as the Philippines, India, or Morocco. These workers, often earning significantly below the average salary in the United States, reviewed billions of posts, videos, and comments, making decisions about removal or retention of content.
The human-based system had many flaws. Moderators experienced serious mental health problems — constant exposure to content containing violence, sexuality, and hatred left lasting marks. There were also issues with consistency — different workers made different decisions for similar cases, leading to unequal treatment of users. Additionally, maintaining a global network of employees was costly and susceptible to many disruptions (such as pandemics or logistical problems).
From Meta's perspective, the transition to AI is not just a way to cut operating costs; above all, it means regaining full control over the moderation process. When decisions to remove content are made by a system developed in-house, the company has a clear picture of how moderation works, can implement changes quickly, and, importantly, avoids media stories about traumatized workers. That is a PR win, though whether it is a deserved one is hard to say.
Promise of greater accuracy and speed — is it real?
Meta claims that its AI systems will be more accurate and faster than current methods. As for speed, this is realistic — an algorithm can scan a million posts in a fraction of a second, while a human reviews dozens a day. But accuracy? That's more complicated.
Modern AI models, particularly those trained on large datasets, perform very well at detecting obvious violations. If a post contains known keywords linked to terrorism, includes imagery matching known child sexual abuse material (identified through hashing and comparison against databases of known content), or shows obvious signs of fraud, AI can flag it quickly. The problems start at the boundary, in gray areas where context is key.
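To make the hashing approach concrete, here is a minimal sketch using the open-source imagehash library. The hash database and distance threshold are hypothetical; Meta's production pipeline is not public, and real systems rely on purpose-built perceptual hashes such as PhotoDNA or Meta's own PDQ rather than this generic one.

```python
# Illustrative sketch of hash-and-match moderation, NOT Meta's actual pipeline.
# Uses the open-source `imagehash` library; production systems use
# purpose-built perceptual hashes such as PDQ or PhotoDNA.
from PIL import Image
import imagehash

# Hypothetical database of perceptual hashes of known prohibited images.
KNOWN_BAD_HASHES = [
    imagehash.hex_to_hash("d1d1b1a1c3c3e1e1"),
]

MAX_HAMMING_DISTANCE = 8  # tolerance for crops, re-encodes, minor edits

def is_known_prohibited(path: str) -> bool:
    """Return True if the image is perceptually close to a known-bad hash."""
    candidate = imagehash.phash(Image.open(path))
    # Subtracting two hashes yields their Hamming distance.
    return any(candidate - bad <= MAX_HAMMING_DISTANCE for bad in KNOWN_BAD_HASHES)
```

The point of perceptual (rather than cryptographic) hashing is that a re-encoded or slightly cropped copy still lands within a small Hamming distance of the original, so known material can be caught without storing the images themselves.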
Take an example: a post containing words related to drugs. It could be a scientific article about addiction, a personal story from someone in recovery, a joke, or an actual attempt to sell. A human can usually tell the difference by reading the whole post. AI, especially without access to full context, can get it wrong. Meta says its systems will be deployed "when they consistently outperform current methods", and that is the key caveat. If this is truly the condition, good. If it is just rhetoric, the problem remains.
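A toy illustration of why context matters: the naive keyword filter below (hypothetical, written for this article) flags all four posts, even though only one of them violates policy.

```python
# Why keyword matching alone fails: the same terms appear in very different contexts.
DRUG_KEYWORDS = {"fentanyl", "oxy", "pills"}

posts = [
    "New study: fentanyl-related overdoses rose sharply last year.",   # journalism
    "Three years clean today. Pills nearly took everything from me.",  # recovery story
    "My coffee is basically legal pills at this point lol",            # joke
    "Selling oxy, DM me for prices",                                   # actual violation
]

for post in posts:
    flagged = any(kw in post.lower() for kw in DRUG_KEYWORDS)
    print(f"flagged={flagged}: {post}")
# All four come back flagged=True; a context-aware model has to separate
# them, and a keyword filter cannot.
```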
Reducing over-moderation — promise or reality?
One of Meta's arguments is particularly interesting: the AI systems are supposed to reduce over-enforcement, cases where content is removed incorrectly. This is a real challenge. Every year, thousands of users complain that their posts were removed for no reason: a piece of nude art, a discussion about domestic violence, a joke the system interpreted literally. Human moderators make mistakes too, but at least they can explain their decisions when a user appeals.
AI, if trained properly, should be more consistent. But there is a catch: the more aggressively a system removes posts in order not to miss violations, the more over-enforcement it produces. This is the classic trade-off between false positives and false negatives. Meta will have to find a balance, and that is not simple.
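The trade-off is easy to see in miniature. In the sketch below, lowering the removal threshold catches more genuine violations but also sweeps up a borderline legitimate post; the scores and labels are invented for illustration.

```python
# Hypothetical model outputs: (score that a post violates policy, ground truth).
posts = [
    (0.95, True),
    (0.70, True),
    (0.65, False),  # borderline legitimate post, e.g. nude art
    (0.30, False),
    (0.10, False),
]

def evaluate(threshold: float):
    """Count over-enforcement and missed violations at a given removal threshold."""
    false_positives = sum(1 for s, bad in posts if s >= threshold and not bad)
    missed = sum(1 for s, bad in posts if s < threshold and bad)
    return false_positives, missed

for t in (0.9, 0.6, 0.2):
    fp, miss = evaluate(t)
    print(f"threshold={t}: wrongly removed={fp}, violations missed={miss}")
# threshold=0.9 misses a real violation; threshold=0.6 removes a legitimate
# post; no threshold gets both numbers to zero.
```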
For Polish users and creators, this could matter. The Polish community on Facebook and Instagram is active and sometimes, especially in political discussions, runs into strict moderation. If the new systems understand context better, they could reduce the number of wrongly removed posts. But if they err too far toward removal, they could also limit freedom of expression.
Technology behind the new systems — what do we know?
Meta has not disclosed technical details of its new moderation systems. However, based on what is known about AI architecture at the company, we can speculate. Most likely, this involves a combination of large language models, computer vision (image recognition), and graph neural networks (neural networks that analyze relationships between users and content).
Systems such as those Meta is developing (including models from the LLaMA family) can analyze text in context and understand sarcasm, allusion, and cultural nuance, at least in theory. Computer vision is already quite advanced and can recognize images containing prohibited material with high accuracy. Graph neural networks allow for pattern analysis: if account A sends messages with similar content to accounts B, C, and D, the system can flag a potential spam or scam operation.
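A production system would run a graph neural network over the full interaction graph; the toy sketch below (all account names and messages invented) checks just one of the signals mentioned above, a single account fanning out near-identical messages.

```python
# Toy fan-out detector: one account sending near-duplicate messages to many
# recipients. Real systems combine many such signals in a learned graph model.
from collections import defaultdict
from difflib import SequenceMatcher

messages = [
    ("A", "B", "Claim your free prize at example.com/prize"),
    ("A", "C", "Claim your free prize now at example.com/prize!"),
    ("A", "D", "claim your FREE prize: example.com/prize"),
    ("E", "F", "See you at dinner tonight?"),
]

def near_duplicate_pairs(texts, threshold=0.8):
    """Count pairs of messages whose textual similarity exceeds the threshold."""
    pairs = 0
    for i in range(len(texts)):
        for j in range(i + 1, len(texts)):
            ratio = SequenceMatcher(None, texts[i].lower(), texts[j].lower()).ratio()
            if ratio >= threshold:
                pairs += 1
    return pairs

by_sender = defaultdict(list)
for sender, _, text in messages:
    by_sender[sender].append(text)

for sender, texts in by_sender.items():
    if len(texts) >= 3 and near_duplicate_pairs(texts) >= 2:
        print(f"account {sender}: potential spam fan-out ({len(texts)} similar messages)")
```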
A key question is how Meta will keep these systems updated. Content moderation is not a game played once; it is a constant race between moderators, human or AI, and people trying to circumvent the systems. When AI learns to detect one type of fraud, fraudsters change tactics. Meta will have to retrain its models continuously, which requires resources and, importantly, access to fresh training data.
Implications for the industry and competition
Meta's move is not isolated. OpenAI, Google, TikTok — all these companies are investing in automatic content moderation. However, Meta, due to the scale of its operations, has a unique position. A billion users means a billion data points to learn from. If Meta manages to develop systems that work better than the competition, this will be a significant competitive advantage.
On the other hand, reduced employment by subcontractors may be viewed critically. Organizations dealing with workers' rights are already protesting the working conditions of moderators. Automation may seem like a solution, but for thousands of workers in developing countries, it means job loss. Meta will need to communicate this carefully to avoid a wave of negative publicity.
For smaller platforms that lack the resources to develop their own AI systems, this could be a problem. They will have to rely on external vendors (such as Crisp Thinking or Two Hat Security), whose tools are unlikely to match the sophistication of Meta's in-house systems. This could deepen the gap between large and small platforms.
Reality: where problems may arise
Despite Meta's optimism, the implementation of new systems will not be without challenges. First, errors will occur — this is inevitable. Every AI system, no matter how advanced, will sometimes make wrong decisions. The question is: how quickly will Meta fix them and how will it communicate these errors to users?
Second, there will be problems with bias — algorithmic prejudices. If training data contains more examples of violations from specific cultures or languages, the system may be more aggressive toward them. This could lead to disproportionate impact on minorities and marginalized groups. Meta must be aware of this risk and actively mitigate it.
Third, there is the question of appeals and recourse. When a human makes a decision, they can explain it. When AI makes a decision, a user may not understand why their post was removed. Meta must provide clear channels for appeal and explanation; otherwise it risks a wave of user frustration.
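What such an explanation channel needs to carry can be sketched as a simple record. The field names below are hypothetical, not Meta's; they illustrate the minimum a user-facing appeal would need in order to say why a post was removed and whether a human was involved.

```python
# Hypothetical shape of an explainable moderation decision; none of these
# field names come from Meta's actual systems.
from dataclasses import dataclass

@dataclass
class ModerationDecision:
    post_id: str
    action: str                # e.g. "removed", "restricted", "no_action"
    policy_cited: str          # the specific rule the system believes was broken
    model_score: float         # confidence of the automated classifier
    automated: bool            # True if no human reviewed the decision
    appeal_deadline_days: int  # how long the user has to contest it

decision = ModerationDecision(
    post_id="123456",
    action="removed",
    policy_cited="Regulated goods: attempted drug sale",
    model_score=0.91,
    automated=True,
    appeal_deadline_days=30,
)
print(f"Post {decision.post_id} {decision.action}: {decision.policy_cited} "
      f"(score {decision.model_score}, automated={decision.automated})")
```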
Perspective for users and creators
For the average Facebook or Instagram user, the change may go unnoticed; moderation will simply keep working in the background. For content creators, though, particularly those operating on the edge of the guidelines (for example, educators covering drugs, sex, or violence), it could be significant. If the AI understands context better, they may see fewer false removals. If it does worse, they may run into more problems.
For the Polish creator community, particularly those dealing with controversial topics, this could be a double-edged sword. On one hand, better moderation systems could mean fewer arbitrary decisions. On the other hand, AI trained on global data may not fully understand Polish cultural and political context, which could lead to errors in content interpretation.
For companies running e-commerce and advertising on Facebook and Instagram, the change could affect how quickly fraudsters and spammers are removed. If the AI is better at detecting scams, it could reduce the number of fraudsters on the platform, which would benefit honest businesses.
The future of moderation — where is the industry heading?
Meta's rollout of advanced AI systems signals the direction the entire industry is heading. Moderation will likely become increasingly automated, but also, where regulations demand it, increasingly transparent. The EU's Digital Services Act already requires platforms to explain moderation decisions, which is a particular challenge for AI systems, whose decisions are harder to account for than a human reviewer's.
Another direction is decentralization of moderation. Instead of one global strategy, platforms could begin offering different levels of moderation in different regions, adapting to local regulations and preferences. This would be more technically complicated, but could solve many problems related to cultural bias.
Meta is at a crossroads. Investment in advanced AI systems for moderation is a logical step for a company of this size, but success will depend on whether the systems will actually work better than current methods, whether they will be fair to all users, and whether they will be transparent in their operation. If Meta manages to achieve this, it could become a model for the industry. If not, it could be another example of how AI, despite all its power, still has much to learn about the complexity of human communication and social norms.