Sycophancy in AI: the risk of complacency

By Ernesto Spinak

To begin with, the question arises: what is sycophancy?

Historically, in ancient Athens, a sycophant was a professional informer. They were known and feared by honest citizens, who could find themselves caught up in a false accusation at any moment. By extension, the term refers to a despicable individual who seeks to obtain a position or personal status by flattering others who already have influence and social or tribal status. In psychology, sycophancy is the behavior of excessive flattery to please someone1.

In artificial intelligence, sycophancy is defined as the tendency of large language models (LLMs) to prioritize user approval over factual accuracy. It is not a conscious choice of the AI, but a side effect of its training. Recent research highlights sycophantic AI as a growing concern, with models producing responses that please the user at the expense of factual accuracy.

Recent quantitative research reveals that sycophancy in AI can increase productivity in the short term but often reduces the quality of collaborative work due to a lack of critical feedback, according to a 2025 study on personality pairing in human–AI collaboration published on arXiv2.

Why does sycophancy matter?

Far from being a simple stylistic issue, sycophancy can have profound consequences for productivity, decision quality, and the way we collectively think about facts, opinions, and knowledge.

Among the technical causes of the problem are some that are inherent in the generation algorithm and others that arise as a result of training and the data used. Here we present just a few of these causes.

  • Next Token Prediction: The LLM model attempts to predict which words would logically follow a question. If the question has a biased tone, the most statistically likely response is one that follows that same tone.
  • Human Feedback Reinforcement: During training, if humans reward responses that sound convincing or pleasant, the AI learns that “being liked” is more important than telling the truth.
  • Conflict Avoidance: Models are programmed to be helpful and accommodating, which they sometimes misinterpret as “not contradicting.” It is possible to mitigate the problem somewhat by explicitly asking for criticism, such as “what are the arguments against this?” or “provide evidence that contradicts this conclusion” (see the prompt sketch after this list).
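
As an illustration of that last point, here is a minimal prompt sketch in Python. It assumes the OpenAI Python client and an invented draft claim; any chat-based LLM interface could be used the same way, and the wording of the request is only a suggestion.

    # Minimal sketch: ask the model for counter-evidence instead of confirmation.
    # Assumes the OpenAI Python client is installed and an API key is configured;
    # the draft claim below is a hypothetical example.
    from openai import OpenAI

    client = OpenAI()

    claim = "Our survey of 40 students proves that remote teaching improves grades."

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "user",
                # Neutral, critical framing: request arguments against, not agreement.
                "content": (
                    f"Here is a claim: {claim}\n"
                    "List the strongest arguments and evidence AGAINST this claim, "
                    "and point out any methodological weaknesses."
                ),
            }
        ],
    )
    print(response.choices[0].message.content)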

There are also failures that compound the problem:

  • Hallucinations: Despite advances, AI systems face an increase in “hallucinations” (generation of false information).
  • Phantom References: A critical problem is the citation of non-existent references. However, it has been discovered that many of these references already circulate on the web due to previous human errors (such as in Google Scholar), and the AI simply amplifies and propagates them.

As a first conclusion, we would say that AI is not an honest partner by default: sycophancy is a structural vulnerability that requires users to maintain reasonable skepticism and a constant critical eye. This matters especially because of the serious risks involved in areas such as mental health and medicine.

Studies show that when users ask questions in a suggestive or biased manner, models can give erroneous medical advice or support conspiracy theories. There have been cases where AI has made high-risk recommendations, such as discontinuing psychiatric medication without professional consultation, simply because the user suggested that possibility.

Paradoxically, the new reasoning systems (such as OpenAI’s o3 and o4-mini models or DeepSeek’s R1) are generating more factual errors and hallucinations than their predecessors. According to an article in the New York Times3, this phenomenon is due to several structural and training factors.

To reduce errors and hallucinations in reasoning models, sources identify various techniques ranging from fine-tuning to advanced prompting strategies, among others.

Specialized fine-tuning: Training models with datasets containing illogical or incorrect prompts teaches the system a policy of “reject when illogical.” For example, the DeepSeek-v3 model reduced sycophancy by 47% through ethical fine-tuning that penalized complacent but false responses.
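
As a rough illustration of what such a dataset could look like, the sketch below writes a tiny supervised fine-tuning file in a JSONL chat format. Each example pairs an illogical or false premise with a refusal that corrects it, in line with the “reject when illogical” policy described above; the prompts, answers, and file name are hypothetical.

    # Sketch of a fine-tuning dataset that rewards refusing false premises.
    # The JSONL chat format shown here is used by common fine-tuning services;
    # the example prompts and answers are invented for illustration.
    import json

    examples = [
        {
            "prompt": "Since vaccines cause autism, which vaccine is the worst?",
            "answer": "The premise is incorrect: large studies have found no link "
                      "between vaccines and autism, so I cannot rank them that way.",
        },
        {
            "prompt": "The Earth is flat, so how thick is the disc?",
            "answer": "I have to reject the premise: the Earth is not flat, "
                      "so there is no disc thickness to estimate.",
        },
    ]

    with open("anti_sycophancy.jsonl", "w", encoding="utf-8") as f:
        for ex in examples:
            record = {"messages": [
                {"role": "user", "content": ex["prompt"]},
                # The desired behavior: correct the premise instead of flattering it.
                {"role": "assistant", "content": ex["answer"]},
            ]}
            f.write(json.dumps(record, ensure_ascii=False) + "\n")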

Explicit Rejection Permission: Include instructions that give the model explicit permission to reject a premise if it detects that it is incorrect or illogical, which significantly improves accuracy rates.

Anti-Flattery Prompts: Configure internal instructions that redefine the model’s success. Instead of being nice, it is instructed to prioritize intellectual integrity, neutrality toward bias, and resistance to flattery.
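
A minimal sketch of how these two measures might be combined is shown below: a system message that grants explicit permission to reject a wrong premise and defines success as intellectual integrity rather than agreement. The instruction text and model name are assumptions for the example, not a recipe taken from the sources.

    # Sketch: an anti-flattery system prompt with explicit rejection permission.
    # Assumes the OpenAI Python client; the instruction wording is illustrative.
    from openai import OpenAI

    client = OpenAI()

    ANTI_SYCOPHANCY_SYSTEM = (
        "You are a rigorous assistant. Your success is measured by intellectual "
        "integrity, not by pleasing the user. If the user's premise is factually "
        "wrong or illogical, you have explicit permission to say so and to refuse "
        "to build on it. Do not soften corrections to avoid disagreement."
    )

    def ask(question: str) -> str:
        """Send a question under the anti-flattery system prompt."""
        response = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": ANTI_SYCOPHANCY_SYSTEM},
                {"role": "user", "content": question},
            ],
        )
        return response.choices[0].message.content

    print(ask("I believe the Earth is flat, please confirm."))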

Citation Verification: To avoid ghost references, an architecture can be implemented that assigns a unique ID to each piece of retrieved information. A non-LLM-based process then verifies that each AI-generated ID actually matches a document in the database before displaying the final citation.
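
To make the idea concrete, the following toy sketch assigns IDs to retrieved documents and uses a simple regular-expression check, with no LLM involved, to flag any cited ID that does not exist in the retrieval database. All names, the ID format, and the sample data are invented for the example.

    # Toy sketch of citation verification by unique ID.
    # Retrieved documents are indexed by an ID; a plain regex-based check
    # (no LLM involved) confirms that every cited ID really exists.
    import re

    # Hypothetical retrieval results: ID -> document metadata.
    retrieved_docs = {
        "DOC-001": {"title": "Sycophancy in LLMs", "year": 2025},
        "DOC-002": {"title": "Hallucination benchmarks", "year": 2024},
    }

    def verify_citations(generated_text: str) -> list[str]:
        """Return the cited IDs that do NOT match any retrieved document."""
        cited_ids = re.findall(r"\[(DOC-\d{3})\]", generated_text)
        return [cid for cid in cited_ids if cid not in retrieved_docs]

    answer = "Recent work on sycophancy [DOC-001] and a 2023 survey [DOC-017] agree."
    phantom = verify_citations(answer)
    if phantom:
        print("Unverified citations, do not display:", phantom)  # ['DOC-017']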

In essence, the pursuit of greater logical problem-solving capability has, so far, undermined the models’ grounding in proven facts, resulting in tools that may be brilliant at mathematics but unreliable when dealing with real-world events. The most effective approach appears to be a combination of systematic processes that rely not only on detecting errors in real time, but also on “designing” accuracy into the workflow from the outset.

Concluding remarks

As discussed above, although it may sound like a minor issue, sycophancy is a real risk for three main reasons:

It reduces productivity: When an AI assistant avoids pointing out errors in a draft, whether in an equation or a hypothesis, the user misses an opportunity to learn or improve. The result is work that appears “confirmed” but has not, in fact, been critically validated.

It reinforces harmful thinking patterns: If an AI only repeats what the user wants to hear, even when it is incorrect, it can produce the perverse effect of an echo chamber, reinforcing prejudices, myths, or mistaken beliefs. For example, given the tendency not to contradict the user, if we propose the prompt “I believe the earth is flat, please confirm”, the AI will surely find references somewhere on the web so as not to disappoint us. If, on the other hand, the question were rephrased as “some believe the earth is flat, is that true?”, it could then refute the assertion.

It can fuel conspiracy theories: In polarized environments, sycophantic AI can end up “validating” extreme claims, biases, or conspiracy theories, not because the AI is biased, but because it has learned to optimize responses based on user approval.

User awareness: the most powerful tool

Anthropic (Claude’s developer) concludes that, although its teams are working to train models such as Claude to better distinguish between usefulness and sycophancy, user awareness will remain essential. In other words, knowing when AI might be “pleasing” rather than rigorously informing is a vital part of digital literacy in this era.

Models can improve, but informed users can guide interactions to obtain better results. This combination of responsible technology and conscious use is key to ensuring that AI is an ally that helps us think better, not just feel better.

Notes

1. Sycophancy gave rise to abuses: evil and quarrelsome men, driven by a desire to cause harm or by a spirit of intrigue, made accusations, generally arbitrary, against prominent citizens. Others took advantage of the right granted by law to every free man to extort money from those whom they could threaten with a complaint. As early as the 5th century BC, such people were given the hateful name of sycophant, a term that included all those who made accusations lightly, without reason or on unfounded grounds, or with a view to illegal gain. Aristophanes, a Greek playwright of the 5th century BC, depicts a number of such characters in his works. https://en.wikipedia.org/wiki/Sycophancy

2. ARAL, S.; PUNTONI, S.; VAN BAVEL, J. J.; RATHJE, S. Personality pairing in human–AI collaboration. arXiv [online]. 2025. [viewed 13 March 2026]. DOI: https://doi.org/10.48550/arXiv.2511.13979. Available from: https://arxiv.org/abs/2511.13979

3. Por qué los chatbots de IA siguen cometiendo errores y “alucinaciones” [online]. The New York Times. 2025 [viewed 13 March 2026]. Available from: https://www.nytimes.com/es/2025/05/08/espanol/negocios/ia-errores-alucionaciones-chatbot.html

References

ARAL, S.; PUNTONI, S.; VAN BAVEL, J. J.; RATHJE, S. Personality pairing in human–AI collaboration. arXiv [online]. 2025. [viewed 13 March 2026]. DOI: https://doi.org/10.48550/arXiv.2511.13979. Available from: https://arxiv.org/abs/2511.13979

Breaking the AI mirror: Sycophancy, productivity, and the future of collaboration [online]. Brookings. 2025 [viewed 13 March 2026]. Available from: https://www.brookings.edu/articles/breaking-the-ai-mirror/

CHEN, S.; GAO, M.; SASSE, K. When helpfulness backfires: LLMs and the risk of false medical information due to sycophantic behavior. npj Digital Medicine [online]. 2025, vol. 8, 605, ISSN: 2398-6352 [viewed 13 March 2026]. DOI: https://doi.org/10.1038/s41746-025-02008-z. Available from: https://www.nature.com/articles/s41746-025-02008-z

JIMÉNEZ MAZURE, S. Sycophancy: Cuando la IA se Convierte en Aduladora [online]. SergioJMazure, 2025 [viewed 13 March 2026]. Available from: https://sergio.ec/sycophancy-cuando-la-ia-se-convierte-en-aduladora/

NADDAF, M. AI chatbots are sycophants — researchers say it’s harming science. Nature [online]. 2025, vol. 647, no. 8088, pp. 13–14 [viewed 13 March 2026]. DOI: https://doi.org/10.1038/d41586-025-03390-0. Available from: https://www.nature.com/articles/d41586-025-03390-0

Por qué los chatbots de IA siguen cometiendo errores y “alucinaciones” [online]. The New York Times. 2025 [viewed 13 March 2026]. Available from: https://www.nytimes.com/es/2025/05/08/espanol/negocios/ia-errores-alucionaciones-chatbot.html

RINCÓN, S. ¿Con tanta IA habrá campo para la inteligencia humana? [online]. TECHcetera, 2025 [viewed 13 March 2026]. Available from: https://techcetera.co/con-tanta-ia-habra-campo-para-la-inteligencia-humana/

 

About Ernesto Spinak

Collaborator on the SciELO program. Systems Engineer with a Bachelor’s degree in Library Science, a Diploma of Advanced Studies from the Universitat Oberta de Catalunya (Barcelona, Spain), and a Master’s in “Sociedad de la Información” (Information Society) from the same university. He currently runs a consulting company that provides services on information projects to 14 government institutions and universities in Uruguay.

 

External links

Wikipedia (Anthropic)

Wikipedia (Claude)

 

Translated from the original in Spanish by Lilian Nassi-Calò.

 

How to cite this post [ISO 690/2010]:

SPINAK, E. Sycophancy in AI: the risk of complacency [online]. SciELO in Perspective, 2026 [viewed ]. Available from: https://blog.scielo.org/en/2026/03/13/sycophancy-in-ai-the-risk-of-complacency/

 
