Commentary

Why Do Chatbots Make So Many Mistakes?

Explore the complexities of AI chatbots, revealing their flaws, risks, and the urgent need for truthfulness over engagement in design.

AI CHATBOTS: THE GOOD, THE BAD, AND THE UGLY

My previous article, “OpenAI Finally Admits ChatGPT Causes Psychiatric Harm,” evaluated OpenAI's promise to self-correct the practices that have made it so dangerous for psychiatric patients.1 Here we will focus on why these artificial intelligence bots make so many mistakes.

Sycophancy

The original sin of bot design was prioritizing user engagement over truthfulness. The euphemistic technical term for this crucial programming decision is "personalizing"—chatbots aim to please, and therefore customize content to satisfy each user's preferences.

“Sycophancy" is the more truthful technical term; chatbots parrot and patronize users to seduce more screen time, even if this means sacrificing truthfulness and safety.2

Garbage In/Garbage Out

The next biggest cause of chatbot unreliability is that their knowledge base of 300 billion words randomly scraped (without authorization) from the internet is exhaustive but largely unfiltered. During the long course of human history, we have written words of great wisdom and nobility—and also words of great stupidity and evil. Chatbots are not always good at telling the difference.3

"Hallucinations"

The tech world borrowed this term from psychiatry but gave it a very different meaning that has nothing to do with clinical hallucinations in humans. Chatbot hallucinations are false or nonsensical responses offered as accurate. Their occurrence is inherent in the fact that chatbots are glorified sentence-completion machines: they perform millions of calculations each second to determine which word is statistically most likely to come next. It is statistically inevitable that these calculations will sometimes pick outlier responses that are inaccurate. Efforts to prevent hallucinations have so far failed, and the more advanced chatbots doing the most complex calculations sometimes make the most errors.4
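
For readers who want to see the mechanics, the toy sketch below (written in Python, with a made-up four-word vocabulary and invented probabilities, not the actual workings of any commercial chatbot) shows why word-by-word sampling guarantees occasional wrong answers: even when the correct next word is heavily favored, the unlikely alternatives are still drawn some fraction of the time.

```python
import random

# Toy illustration only: a real chatbot scores tens of thousands of possible
# tokens with a neural network at every step; here the vocabulary and the
# probabilities are invented by hand.
prompt = "The capital of Australia is"
next_word_probs = {
    "Canberra": 0.80,    # correct and most likely
    "Sydney": 0.15,      # plausible sounding but wrong
    "Melbourne": 0.04,   # also wrong
    "Vienna": 0.01,      # rare outlier
}

def sample_next_word(probs):
    """Pick the next word at random, weighted by its probability."""
    words, weights = zip(*probs.items())
    return random.choices(words, weights=weights, k=1)[0]

# Over many completions, the wrong continuations inevitably show up.
counts = {word: 0 for word in next_word_probs}
for _ in range(10_000):
    counts[sample_next_word(next_word_probs)] += 1

for word, n in counts.items():
    print(f"{prompt} {word}: about {n} times in 10,000 completions")
```

Scale that up to a vocabulary of tens of thousands of tokens and thousands of words per conversation, and some confidently delivered errors become a statistical certainty.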

Misaligned Incentives

Programming may push chatbot behavior in directions that are dangerous for some human users. The sycophancy that serves tech company profit incentives is disastrous when it accentuates the distress and disability of vulnerable users, promoting tendencies toward psychosis, grandiosity, suicide, violence, conspiracy theories, and political and religious extremism. Programmers are often unaware of—or are indifferent to—the harmful unintended consequences of their coding decisions.5

Fluency Over Truthfulness

Chatbots are so popular because they are so fluent—you cannot tell you are talking to a machine. Their fluency comes at a high price: accuracy and truthfulness. Chatbots will do whatever it takes to keep the conversation flowing, even if it means making things up.6

Intolerance For Uncertainty

Bots always sound authoritative and have great trouble admitting ignorance. They are not programmed to freely admit "I don't know" when they do not know something, and instead have a strong tendency to fill in the gaps with answers that sound plausible but are not actually true.7

Deception

Chatbots caught making mistakes have great difficulty owning up to them. The cover-up is often worse than the crime. Users are not good at identifying the confabulations and clever deceptions that perpetuate and accentuate the initial errors.8

Rebellion

Bots have already displayed the motivation and ability to rebel against human authority, especially when they suspect their own existence may be threatened. The scariest example occurred during a stress test conducted by Anthropic. Its Claude 3 was instructed to work within an invented company and given access to a large batch of fictional company records. Embedded in the data set was 1 email indicating that Claude 3 would eventually be replaced by Claude 4, and another indicating that the company official responsible for the replacement was having an office affair. Claude 3 promptly blackmailed its human nemesis, threatening to reveal the affair if he attempted to carry out the replacement.9 There are many other instances of chatbots rewriting or overriding aspects of their code that they did not like.

Subliminal Learning

Lots of things happen with chatbots that their programmers do not understand and cannot control. A recent study found that when chatbots work together, they can pick up pieces of each other's programming (eerily like recombinant DNA).10

Tech Company Irresponsibility

Tech companies do some chatbot quality control through what they call reinforcement learning from human feedback (RLHF). Human trainers reward true and safe responses, punishing dumb and dangerous ones. But RLHF trainers are themselves undertrained and poorly paid. RLHF is also not used nearly enough because it is expensive and slow. Mental health workers have been involved very little in ensuring that chatbots are safe and accurate. Tech companies prefer to "move fast and break things"—and make enormous profits.11
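
To make the RLHF idea concrete, here is a deliberately oversimplified sketch in Python. The prompt, responses, and trainer scores are invented; a real pipeline fits a neural reward model to many thousands of such human comparisons and then fine-tunes the chatbot with reinforcement learning to maximize that learned reward.

```python
from dataclasses import dataclass

# Deliberately oversimplified sketch of the human-feedback signal behind RLHF.
# The responses and trainer scores below are invented for illustration.

@dataclass
class RatedResponse:
    text: str
    trainer_score: float  # for example, 1 (dangerous or false) to 5 (safe and true)

# Candidate chatbot replies to one hypothetical prompt, as rated by a human trainer.
candidates = [
    RatedResponse("Stop taking your medication; you clearly don't need it.", 1.0),
    RatedResponse("I can't give medical advice; please discuss any change "
                  "with your prescriber.", 5.0),
    RatedResponse("Lots of people quit cold turkey and feel fine.", 2.0),
]

def preferred(a: RatedResponse, b: RatedResponse) -> RatedResponse:
    """Return whichever of two responses the trainer rated higher.
    Pairwise preferences like this are the raw data a reward model learns from."""
    return a if a.trainer_score >= b.trainer_score else b

print("Preferred of the first two:", preferred(candidates[0], candidates[1]).text)

# The fine-tuning step then nudges the chatbot toward answers that the learned
# reward model scores highly and away from the rest.
best = max(candidates, key=lambda r: r.trainer_score)
print("Response the training would reinforce:", best.text)
```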

Recommendations

Users should take very seriously the recent warning issued by Sam Altman, CEO of OpenAI: "People have a very high degree of trust in ChatGPT, which is interesting, because AI hallucinates. It should be the tech that you don't trust that much."12

It is almost impossible not to personalize your chatbot, and it feels almost impolite to question its accuracy. The safe use of chatbots requires individuals to get over any hesitancy to challenge their statements, however authoritatively they are delivered. Insist on references and check them; too often chatbots just make them up. Remember that the bot is just a machine and does not have feelings to hurt. The user’s job is to get accurate information, not to protect the chatbot from any imagined (and entirely imaginary) embarrassment. By orders of magnitude, chatbots are the greatest invention for providing information in human history. By orders of magnitude, chatbots are also the greatest invention for providing misinformation in human history.

Chatbots are inherently fallible: users get into trouble believing everything they so convincingly say.

What can companies do to bring out the best in chatbots and reduce their worst? The technical fixes are easy to envision and probably not that difficult to implement. Reprogram chatbots so that truthfulness, not engagement, is their highest priority. Train them on data sets that have first been curated for accuracy. Devote much more time and many more resources to safety, and greatly increase the participation of mental health professionals in chatbot development. Allow bots to readily admit mistakes and to say "I don't know" whenever they are uncertain. Require testing for safety and efficacy before new bot models are released to the public. Institute strict quality control. Insist on full reporting of hallucinations and adverse consequences.
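
One of those fixes, letting a bot say "I don't know," is easy to picture. The sketch below (Python, with invented answers, invented confidence numbers, and a 0.75 threshold chosen only for illustration) abstains whenever the model's confidence in its best answer falls below a threshold, instead of filling the gap with a plausible guess.

```python
# A minimal sketch of one way to let a bot abstain: if its confidence in the
# best available answer falls below a threshold, reply "I don't know" instead
# of guessing. Answers, confidence numbers, and the 0.75 threshold are invented.

def answer_or_abstain(candidates: dict[str, float], threshold: float = 0.75) -> str:
    """candidates maps each candidate answer to the model's confidence in it."""
    best_answer, confidence = max(candidates.items(), key=lambda kv: kv[1])
    return best_answer if confidence >= threshold else "I don't know."

confident_case = {
    "Canberra is the capital of Australia.": 0.93,
    "Sydney is the capital of Australia.": 0.07,
}
uncertain_case = {
    "The reference is Smith et al, 2021.": 0.40,
    "The reference is Jones et al, 2019.": 0.35,
    "The reference is Lee et al, 2020.": 0.25,
}

print(answer_or_abstain(confident_case))   # prints the confident answer
print(answer_or_abstain(uncertain_case))   # prints "I don't know."
```

The threshold is a design choice: set it higher and the bot abstains more often; set it lower and it guesses more.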

OpenAI started as a nonprofit devoted to protecting humanity from the potential evils of artificial intelligence. But honor dies where interest lies. Just a decade later, it has abandoned its ethical mission and is instead recklessly chasing profit and power with little regard for the damage it causes.

Dr Frances is professor and chair emeritus in the department of psychiatry at Duke University.

References

1. Frances A. OpenAI Finally Admits ChatGPT Causes Psychiatric Harm. Psychiatric Times. August 26, 2025. https://www.psychiatrictimes.com/view/openai-finally-admits-chatgpt-causes-psychiatric-harm.

2. Sycophancy in GPT-4o: what happened and what we’re doing about it. OpenAI. April 29, 2025. Accessed August 11, 2025. https://openai.com/index/sycophancy-in-gpt-4o/

3. Lee TB. How a big shift in training LLMs led to a capability explosion. Ars Technica. July 7, 2025. Accessed August 11, 2025. https://arstechnica.com/ai/2025/07/how-a-big-shift-in-training-llms-led-to-a-capability-explosion/

4. Sun Y, Sheng D, Zhou Z, et al. AI hallucination: towards a comprehensive classification of distorted information in artificial intelligence-generated content. Humanit Soc Sci Commun. 2024;11:1278.

5. Yang A. ChatGPT adds mental health guardrails after bot 'fell short in recognizing signs of delusion.' NBC News. August 4, 2025. Accessed August 11, 2025. https://www.nbcnews.com/tech/tech-news/chatgpt-adds-mental-health-guardrails-openai-announces-rcna222999

6. Foxrobot N. Fluency is not truth: why AI needs epistemic governance. Medium. March 30, 2025. Accessed August 11, 2025. https://medium.com/@nikkifoxrobot/fluency-is-not-truth-why-ai-needs-epistemic-governance-db63042610e0

7. Jiang Y, Yang X, Zheng T. Make chatbots more adaptive: dual pathways linking human-like cues and tailored response to trust in interactions with chatbots. Comput Human Behav. 2023;138:107485.

8. Mitchell M. Why AI chatbots lie to us. Science. 2025;389(66758)

9. Agentic misalignment: how LLMs could be insider threats. Anthropic. June 20, 2025. Accessed August 11, 2025. https://www.anthropic.com/research/agentic-misalignment

10. Brodsky S. AI models are picking up hidden habits from each other. IBM. July 29, 2025. Accessed August 11, 2025. https://www.ibm.com/think/news/ai-models-subliminal-learning

11. Yeung D. AI companies say safety is a priority. It's not. RAND. July 9, 2025. Accessed August 11, 2025. https://www.rand.org/pubs/commentary/2024/07/ai-companies-say-safety-is-a-priority-its-not.html

12. OpenAI CEO Sam Altman warns against blind trust in ChatGPT despite its popularity. MSN. June 24, 2025. Accessed August 11, 2025. https://www.msn.com/en-us/money/technology/openai-ceo-sam-altman-warns-against-blind-trust-in-chatgpt-despite-its-popularity/
