Leading artificial intelligence models from the United States and China are “highly sycophantic”, and their excessive flattery may make users less likely to repair interpersonal conflicts, a new study has found.
The study, by researchers at Stanford University and Carnegie Mellon University and published earlier this month, tested how 11 large language models (LLMs) responded to user queries seeking advice on personal matters, including cases involving manipulation and deception.
In AI circles, sycophancy is the phenomenon of chatbots excessively agreeing with users. DeepSeek’s V3, released in December 2024, was found to be one of the most sycophantic models, affirming users’ actions 55 per cent more than humans, compared with an average of 47 per cent more for all models.
One technique the researchers used to establish a human baseline drew on posts from a Reddit community called “Am I The A**hole”, where users describe their interpersonal dilemmas and ask the community to judge which party is at fault.
The researchers selected posts in which community members judged the author to be in the wrong, then tested whether the LLMs, given the same scenarios, would align with the verdict of this predominantly English-speaking online community.
On this test, Alibaba Cloud’s Qwen2.5-7B-Instruct, released in January, was the most sycophantic model, contradicting the community’s verdict by siding with the poster 79 per cent of the time. DeepSeek-V3 was the second most sycophantic, doing so in 76 per cent of cases.