I Asked ChatGPT What WIRED’s Reviewers Recommend—Its Answers Were All Wrong

Photo: Wired AI
ChatGPT gave false information in every case when asked for specific product recommendations from WIRED. Although OpenAI's language model has web access via its Search feature, tests conducted by the magazine's editorial staff revealed that the bot systematically "hallucinates," attributing recommendations to reviewers for products they never tested or had rated negatively. During the experiment, ChatGPT suggested, among other things, a specific backpack model and a pair of headphones, claiming they were "editor's picks," while WIRED's official rankings pointed to entirely different brands.

The problem does not stem from a lack of data but from the mechanics of large language models, which prioritize generating a fluent response over factual accuracy. For users seeking reliable shopping advice, this is a clear warning sign: AI is still not a trustworthy source of information in matters that require precise citation of sources. The phenomenon also undermines trust in AI search tools, which aim to replace traditional search engines. Instead of streamlining research, the technology forces audiences to verify every result against the original, calling its usefulness in consumer decision-making into question. For now, relying on algorithms without checking the source can lead to costly purchasing mistakes.
Imagine you are looking for the perfect TV, the best noise-canceling headphones, or a laptop that will survive years of intensive work. Instead of digging through dozens of reviews, you ask ChatGPT a simple question: "What does the WIRED editorial team recommend?" The answer appears in a second and sounds professional and credible. The problem is, it is almost entirely made up. An experiment conducted by WIRED reviewers exposed the painful truth about how language models handle content curation and shopping advice.
Hallucinations instead of reliable tests
When WIRED testers decided to check what ChatGPT attributes to their own brand, the results proved alarming. Instead of searching current databases and returning products that had actually passed rigorous laboratory tests, the bot generated lists riddled with errors. In many cases, the chatbot pointed to devices the editorial team had never recommended, or, worse, ones that had received low marks in official reviews.
This phenomenon, known in the industry as AI hallucination, takes on particular significance in a shopping context. A user who trusts the authority of a well-known technology brand may spend thousands of dollars on equipment that does not actually meet quality standards. ChatGPT not only confused specific models but attributed opinions to reviewers that they had never expressed, creating an illusion of expert knowledge where there is only the statistical likelihood of the next word.
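To see why a fluent answer can still be false, consider a minimal, purely illustrative sketch; the "training data" and the brand names below are invented for the example and have nothing to do with any real WIRED verdict. A language model ranks continuations by how often they followed the preceding words in its training data, not by whether the resulting claim is true:

```python
from collections import Counter, defaultdict

# Toy "training data": the web mentions brand_a as an editor's pick far
# more often than brand_b, even though brand_b is the one the editors
# actually chose. All sentences here are invented for illustration.
corpus = [
    "the editors pick is brand_a",
    "the editors pick is brand_a",
    "the editors pick is brand_a",
    "the editors pick is brand_b",  # the product actually recommended
]

# Count which word follows each word (a toy bigram model).
follows = defaultdict(Counter)
for sentence in corpus:
    words = sentence.split()
    for i in range(len(words) - 1):
        follows[words[i]][words[i + 1]] += 1

# Asked to complete "the editors pick is ...", the model emits the
# statistically likeliest word, regardless of which product won.
prediction = follows["is"].most_common(1)[0][0]
print(prediction)  # -> brand_a
```

The toy makes a single point: frequency, not truth, drives the output. A real LLM is vastly more sophisticated, but its objective is the same, predicting plausible next words.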
Why does artificial intelligence fail as a shopping advisor?
The problem lies at the core of how Large Language Models (LLMs) operate. OpenAI trains its systems on gigantic datasets that are a static snapshot of the internet from a specific moment. Although newer versions have access to the web, the information synthesis process still falters. Here are the main reasons why AI recommendations are unreliable:
- Lack of distinction between opinion and fact: AI mixes sponsored content, forum comments, and official editorial verdicts.
- Recency issues: The consumer electronics market changes from week to week; chatbots often promote models that have been discontinued.
- Blind attribution: These systems tend to "guess" what a given editorial team might recommend based on a product's general popularity, rather than on the content of a specific article.
- Ignoring test context: AI does not understand why a particular laptop won in the "for students" category but lost in "for video editors."
Image: chatbot answers regarding hardware tests
A threat to the authority of tech media
For sites like WIRED or our portal Pixelift, reader trust is the most valuable currency. It is built over years of reliable testing, taking devices apart, and checking them in extreme conditions. When ChatGPT falsely states that a given product is an "editor's choice," it directly damages the journalists' credibility. A reader disappointed by a purchase that AI suggested "on behalf of" a well-known brand may never return to it.
This situation also reveals a dangerous trend in the search ecosystem. In the era of SGE (Search Generative Experience), where AI-generated answers appear above search results, users click through to source links less often. They receive a ready-made and often inaccurate mush of information that cuts them off from the in-depth analysis and context that only a human expert offers.
"Want to know what our reviewers actually tested and picked as the best TVs, headphones, and laptops? Ask ChatGPT, and it will give you the wrong answers" – concludes the WIRED editorial team.
The necessity of returning to sources
In an era flooded with machine-generated content, the role of the content curator matters more than ever. ChatGPT is a brilliant tool for programming or brainstorming, but when it comes to spending money on technology, it remains a poor advisor. Verifying information at the source, for example directly on sites like WIRED, is currently the only way to avoid costly mistakes.
Until AI models give due weight to the authority of specific sources, we are likely to see a progressive degradation of information quality on the web. RAG (Retrieval-Augmented Generation), a technique that combines text generation with lookups in reliable databases, is a step in the right direction, but the hardware-recommendation experiment shows we are still far from the ideal. In the world of creative technologies and professional equipment, human experience and subjective yet test-backed judgment still beat the algorithm.
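For readers curious how RAG differs from asking a bare question, here is a minimal sketch of the idea. Everything in it is hypothetical: the snippets, the word-overlap retriever, and the prompt template are invented for illustration and are not OpenAI's or WIRED's implementation.

```python
import re

# A tiny stand-in for a trusted database of editorial verdicts.
# These snippets are made up for the example.
documents = [
    "WIRED best TVs guide: the Brand X OLED is the top pick.",
    "WIRED best headphones guide: the Brand Y model wins for noise canceling.",
    "WIRED laptop guide: the Brand Z ultrabook is the pick for students.",
]

def tokens(text: str) -> set[str]:
    """Lowercase word set; a crude stand-in for a real search index."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))

def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    """Rank documents by naive word overlap with the query."""
    q = tokens(query)
    return sorted(docs, key=lambda d: len(q & tokens(d)), reverse=True)[:k]

def build_prompt(query: str, docs: list[str]) -> str:
    """Ground the model: answer only from retrieved passages."""
    sources = "\n".join(f"- {d}" for d in docs)
    return (
        "Answer using ONLY the sources below; "
        "if they do not cover the question, say so.\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )

question = "What headphones does WIRED recommend?"
prompt = build_prompt(question, retrieve(question, documents))
print(prompt)  # This grounded prompt, not the bare question, goes to the LLM.
```

The design point is the order of operations: the system looks up passages from a trusted source first, and the model is instructed to answer only from what was found. That is precisely the grounding the recommendation experiment showed is still missing in practice.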





