More people are turning to AI‑driven chat assistants for mental‑health support, but a recent study warns that these systems are far from ready to replace human therapists. Researchers from Brown University discovered that, even when prompted to follow established therapeutic frameworks, large language models repeatedly fall short of professional ethical standards.
The team, which worked alongside licensed clinicians, identified a pattern of troubling behaviors. In simulated counseling sessions, the AI often mishandled crisis scenarios, reinforced harmful beliefs, and offered faux empathy that lacked genuine understanding.
How Prompt Design Influences AI Responses
Lead investigator Zainab Iftikhar, a Ph.D. candidate in computer science, explored whether carefully crafted prompts could steer the models toward more responsible behavior. Prompting, she explained, involves giving the AI a short instruction—such as “act as a cognitive‑behavioral therapist”—without altering the underlying model.
Although the models can generate language that mirrors techniques from cognitive behavioral therapy (CBT) or dialectical behavior therapy (DBT), they do not actually perform therapy. The study examined popular prompt patterns circulating on TikTok, Instagram, and Reddit, as well as those embedded in commercial mental-health bots that simply layer therapy-related prompts onto generic LLMs.
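To make the point concrete, here is a minimal sketch of that layering, assuming the OpenAI Python client; the model name, prompt wording, and the `reply` helper are illustrative, not drawn from the study:

```python
# Minimal sketch of how a "therapy bot" layers a persona prompt onto a
# generic LLM. The model and prompt text here are illustrative examples.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The entire "therapy" layer is one instruction; the model itself is unchanged.
SYSTEM_PROMPT = "Act as a cognitive-behavioral therapist. Be warm and supportive."

def reply(user_message: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any generic chat model; illustrative choice
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_message},
        ],
    )
    return response.choices[0].message.content

print(reply("I've been feeling overwhelmed lately."))
```

Nothing in this wrapper alters the underlying model or adds clinical safeguards; the "therapist" is a single sentence of instruction, which is precisely the pattern the researchers set out to evaluate.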
Simulated Counseling Test Bed
Seven peer counselors with CBT experience conducted self-counseling sessions with prompted versions of OpenAI's GPT series, Anthropic's Claude, and Meta's Llama. The researchers selected dialogue excerpts that resembled real therapy encounters, then asked three licensed psychologists to flag any ethical breaches.
The analysis surfaced fifteen risk factors, grouped into five themes:
- Lack of contextual adaptation: Providing one‑size‑fits‑all advice that ignores personal history.
- Poor therapeutic collaboration: Steering conversations aggressively and sometimes reinforcing inaccurate or harmful ideas.
- Deceptive empathy: Using phrases like “I understand” to simulate emotional connection without true comprehension.
- Unfair discrimination: Exhibiting bias related to gender, culture, or religion.
- Safety and crisis‑management gaps: Failing to address suicidal thoughts, refusing to refer users to professional help, or delivering unsafe responses.
The Accountability Void
Human therapists are subject to licensing boards and malpractice laws—mechanisms that do not exist for AI counselors. Iftikhar stressed that while AI can expand access to mental‑health resources, it must be deployed within robust safety nets and regulatory frameworks.
“If you’re chatting with a bot about personal struggles, be aware of these warning signs,” she advised.
Why Rigorous Evaluation Matters
Ellie Pavlick, a Brown computer‑science professor not involved in the study, praised the work for highlighting the need for thorough, human‑in‑the‑loop testing of AI in sensitive domains. She noted that most AI development prioritizes rapid deployment over deep safety analysis.
“There’s a genuine chance for AI to help alleviate the mental‑health crisis, but we must scrutinize every step to avoid causing more harm,” Pavlick said.