April 6, 2026

Should Accrediting Bodies Require AI for Suicide Risk Stratification in Emergency Settings? A Debate


Key Takeaways

  • Layering machine-learning models on clinician assessment can materially improve discrimination (eg, ED AUC 0.60 to 0.76) using routinely captured EHR and contextual deprivation variables.
  • Hybrid approaches combining C-SSRS with real-time modeling can outperform either modality alone, and targeted programs like REACH-VET suggest modest reductions in attempts via enhanced outreach.

Should hospitals be required to integrate AI-driven risk stratification into emergency department workflows to maintain accreditation? Join the debate.

Suicide is a leading cause of death in the United States, claiming approximately 48,000 lives in 2024.1 Nearly half of individuals who die by suicide visit a health care provider in the month before their death.2 Emergency departments (EDs) are a key potential intervention point3 and are often the last point of contact for at-risk patients.4 Traditional clinician-led suicide risk assessments, however, have well-documented limitations,5 particularly in the high-pressure, time-constrained ED environment, where predictive accuracy can be as low as chance for some outcomes.

Enter artificial intelligence (AI), which can rapidly analyze electronic health record (EHR) data: demographics, prior visits, medications, and social determinants. AI promises to augment human judgment by flagging high-risk patients for further evaluation. The Joint Commission already mandates universal screening for suicidal ideation using validated tools for behavioral health patients.6 The question now facing accrediting bodies such as the Joint Commission is whether to go a step further: should hospitals be required to integrate AI-driven risk stratification into ED workflows to maintain accreditation?

The Case for Requirement: Accuracy, Complementarity, and Ethical Imperative

Advocates describe a stark reality: clinicians alone are imperfect risk assessors. A 2025 study of nearly 90,000 patients across outpatient, inpatient, and ED settings found that clinician estimates of suicide attempt risk achieved an area under the curve (AUC) of just 0.60 in the ED, the lowest of any setting. When machine-learning models, incorporating up to 87 EHR predictors, were layered on top, AUC improved to 0.76, a statistically significant gain achieved using routinely collected data.7
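For readers unfamiliar with how such "layering" is evaluated, the sketch below illustrates the general approach on synthetic data: compare the discrimination (AUC) of a clinician risk rating alone against a simple model that combines that rating with routinely recorded EHR variables. The feature names, coefficients, and data are hypothetical and are not drawn from the cited study.

```python
# A minimal sketch, on synthetic data, of how "layering" a model on clinician
# judgment is typically evaluated: compare the discrimination (AUC) of the
# clinician's risk rating alone against a simple model that combines the rating
# with routinely recorded EHR variables. Feature names, coefficients, and data
# are hypothetical; this is not the cited study's pipeline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 20_000

# Hypothetical predictors: a 0-10 clinician risk rating plus EHR-derived variables.
clinician_rating = rng.integers(0, 11, n).astype(float)
prior_ed_visits = rng.poisson(1.0, n).astype(float)
prior_psych_admits = rng.poisson(0.3, n).astype(float)
deprivation_index = rng.normal(0.0, 1.0, n)

# Simulated outcome in which both the rating and the EHR variables carry signal.
logit = (-5.0 + 0.25 * clinician_rating + 0.4 * prior_ed_visits
         + 0.6 * prior_psych_admits + 0.3 * deprivation_index)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X = np.column_stack([clinician_rating, prior_ed_visits,
                     prior_psych_admits, deprivation_index])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# Clinician rating alone vs the rating layered with EHR features.
auc_clinician = roc_auc_score(y_te, X_te[:, 0])
model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc_combined = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])

print(f"AUC, clinician rating alone: {auc_clinician:.2f}")
print(f"AUC, rating + EHR features:  {auc_combined:.2f}")
```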

Proponents emphasize that these models are not designed to replace clinicians but to act as a safety net. They integrate structured clinician input with historical EHR data: demographics, diagnostic codes, medication history, and even area deprivation indices. A 2022 cohort study demonstrated that combining face-to-face Columbia Suicide Severity Rating Scale (C-SSRS) screening with real-time machine learning outperformed either approach alone, particularly for predicting suicide attempts.8

Real-world impact evidence strengthens the case. The US Department of Veterans Affairs’ REACH-VET program, which uses machine learning to identify the highest-risk 0.1% of veterans and trigger enhanced outreach (safety planning, increased monitoring, care coordination), achieved a 5% reduction in documented suicide attempts after adjusting for cohort differences. Patients in the highest-risk tier died by suicide at 30 times the general VA rate, precisely the population the models aimed to catch.9

Accreditation mandates could also drive system-wide improvements. Many hospitals still rely on outdated EHR systems; a national requirement would incentivize upgrades, improving data quality for all patients. Proponents counter concerns about coercion by noting that Joint Commission standards already require validated screening; AI would simply improve its performance, much as radar assists air traffic controllers. They also invoke the concept of normalization of deviance in modern medicine.10 Clinicians routinely overstate how thoroughly they have reviewed the medical record, despite an ethical duty to do so comprehensively and documentation mandates to note risk factors already present in the chart. AI would actualize the comprehensive chart review that is already required.

Bias and model accuracy represent another key area of debate. Advocates argue that although all predictive tools have limitations, biases and errors within AI models are more transparent and correctable through systematic auditing than are inconsistencies in individual clinical judgment. A 2019 study by Obermeyer et al examined a widely used commercial risk-prediction algorithm that relied on health care costs as a proxy for illness severity.11 Because Black patients generated lower costs than equally sick White patients, the algorithm systematically underestimated their needs. When the model was recalibrated using direct clinical indicators (such as chronic condition counts), the disparity was substantially reduced.

A 2025 review of bias recognition and mitigation in health care AI outlined strategies, including the use of multiple data sources, explainability tools, and continuous recalibration, that can enhance model reliability and predictive consistency.12 Proponents maintain that accreditation requirements would bring transparency, regular validation, and accountability that are often missing from unstandardized clinician assessments. They conclude that, given the scale of the national suicide burden, failing to adopt tools with evidence of improved risk stratification would itself be unethical.
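As one illustration of what such auditing can look like in practice, the sketch below (entirely synthetic and hypothetical, not drawn from any cited study or vendor tool) checks whether flag rates and observed outcome rates at a fixed score threshold remain consistent across patient subgroups, the kind of discrepancy the Obermeyer et al analysis surfaced.

```python
# A minimal illustration, on entirely synthetic data, of a subgroup audit of a
# deployed risk score: at a fixed flagging threshold, compare how often each
# subgroup is flagged and how often flagged patients actually go on to have the
# outcome. Divergence between groups is the kind of signal continuous auditing
# and recalibration are meant to catch. Column names and numbers are hypothetical.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 10_000

df = pd.DataFrame({
    "subgroup": rng.choice(["A", "B"], size=n),      # eg, two demographic groups
    "risk_score": rng.uniform(0.0, 1.0, size=n),     # model output being audited
    "outcome": rng.binomial(1, 0.02, size=n),        # observed outcome (synthetic)
})

THRESHOLD = 0.90  # flag roughly the top decile of scores
df["flagged"] = df["risk_score"] >= THRESHOLD

flag_rate = df.groupby("subgroup")["flagged"].mean()
outcome_rate_among_flagged = df[df["flagged"]].groupby("subgroup")["outcome"].mean()

print("Flag rate by subgroup:\n", flag_rate, sep="")
print("Outcome rate among flagged patients:\n", outcome_rate_among_flagged, sep="")
```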

The Case Against: Premature, Risky, and Potentially Harmful

Opponents counter that, despite promising controlled studies, the evidence does not support mandatory high-stakes deployment in emergency settings. A 2025 systematic review of reviews examined 23 prior syntheses of AI suicide prediction models and found pervasive methodological shortcomings: only 4% achieved high rigor, 64% were moderate, and the rest were low or critically low. Most studies suffered from small samples (under 1000 participants in 48%), absent risk-of-bias assessments (86%), short follow-up, and conflation of suicidal ideation, attempts, and deaths, outcomes with vastly different base rates and clinical implications.13

Positive predictive value (PPV) remains alarmingly low. Even a model with 90% sensitivity and 90% specificity applied to a population with 1% prevalence yields a PPV of just 8.3%, meaning more than 90% of high-risk flags are false positives. In the ED, where decisions about involuntary holds, resource allocation, and patient trust carry immediate consequences, false positives risk stigma, trauma, unnecessary coercion, and alarm fatigue. Raising the flagging threshold to achieve higher precision means accepting more false negatives, so genuine crises are still missed.13
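The arithmetic behind that 8.3% figure follows directly from Bayes' rule; the short sketch below reproduces it using the illustrative numbers above (example values, not estimates from any specific model).

```python
# The arithmetic behind the 8.3% figure, using the illustrative numbers above
# (90% sensitivity, 90% specificity, 1% prevalence); these are example values,
# not estimates from any specific model.
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    true_positives = sensitivity * prevalence
    false_positives = (1.0 - specificity) * (1.0 - prevalence)
    return true_positives / (true_positives + false_positives)

print(f"PPV: {ppv(0.90, 0.90, 0.01):.1%}")  # -> 8.3%
```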

EHR data quality compounds the problem. Hospital records contain incomplete histories, and up to 21% contain errors.14 Should a patient with an inaccurate risk factor in their record be flagged as high risk for life? Furthermore, legacy systems common in rural hospitals limit generalizability. Models trained on historical clinician judgments risk epistemic circularity, perpetuating past inaccuracies. REACH-VET’s success occurred in a VA outpatient context with structured follow-up, not a chaotic ED where the primary interventions are disposition decisions made under time pressure.9 And while records already contain errors, generative AI is well known for fabricating content; one evaluation of AI search engines found citation error rates as high as 94% for the worst-performing tool.15

Ethical and practical risks also exist. Black-box algorithms lack interpretability, undermining clinician confidence and inviting automation bias. The CEO of Anthropic himself admits that “we do not understand how our own AI creations work.”16 How can ethically sound informed consent be obtained when the tool’s own creators cannot fully explain it? A 2025 study of AI’s impact on primary care mental health decisions found that physicians changed treatment recommendations to align with AI suggestions in nearly two-thirds of cases in which the model conflicted with their initial judgment, sometimes inappropriately.17 A 2026 review noted that AI tools pose a significant risk of eroding physicians’ skills, and that up to 30% of physicians reversed correct initial diagnoses when exposed to incorrect AI suggestions under time constraints.18

A mandate also ignores the logistical and ethical realities of the health care system. In environments where algorithmic scoring is mandatory, clinicians report alarm fatigue within the first 30 days, leading them to reflexively click through warnings.19 False positives in an emergency setting can lead to involuntary hospitalization, severe stigmatization, and the fracturing of patient trust, which lies at the core of psychiatry. And as providers are pushed to dismiss those warnings, they accumulate additional medicolegal responsibility for having dismissed them, absolving the systems while taking on the risk themselves.

Algorithmic bias remains a persistent concern despite mitigation efforts. The Obermeyer et al study shows that correction is possible in retrospect, but mandating adoption now risks locking in inaccuracies from imperfect training data.11 Informed consent is further complicated when patients are acutely suicidal and their decision-making capacity may be impaired. A counterplan often surfaces: why require AI when we do not mandate other interventions with stronger evidence bases? Opponents argue that the field should prioritize randomized controlled trials in ED settings demonstrating reduced attempts or deaths before any accreditation mandate.

Weighing the Arguments and the Path Forward

The debate reveals a classic tension between innovation and caution. More importantly, it highlights that complicated and sensitive topics can be discussed civilly and with scientific rigor in psychiatry. “Accrediting bodies should require artificial intelligence to be used for suicide risk stratification in emergency settings” was the topic of this year’s National Psychiatry Resident Debate competition. The point of the competition is not to resolve this important question but to highlight the value of discussion in our field.

Ultimately, much of psychiatry is the discussion of thoughts and beliefs between us and our patients. Psychiatry deals with many issues related to society, policy, and people, so it is natural for our field to benefit from this type of discussion.

This article was co-authored by participants in this year’s competition as well as a judge. We hope you will join us at the American Psychiatric Association Annual Meeting this year, where the final round will be held on Wednesday, May 20, 2026, at 10:30 AM in Room 314. See you there.

Dr Cheema is a third-year psychiatry resident at Baylor College of Medicine. Her interests include psychotherapy, addiction, and bioethics. Dr Canto is a third-year psychiatry resident at Baylor College of Medicine. His interests include child and adolescent psychiatry, for which he will pursue fellowship training at Mount Sinai Hospital in New York City, as well as psychodynamic psychotherapy and neuropsychiatry. Dr Badre is a clinical and forensic psychiatrist in San Diego. He teaches medical education, psychopharmacology, ethics in psychiatry, and correctional care. Dr Badre can be reached at his website, BadreMD.com. His upcoming textbook of psychiatry is available on Amazon.

References

1. WISQARS fatal and nonfatal injury reports 2024. Centers for Disease Control and Prevention. 2025. Accessed March 30, 2026. https://wisqars.cdc.gov/reports/

2. Ahmedani BK, Simon GE, Stewart C, et al. Health care contacts in the year before suicide death. J Gen Intern Med. 2014;29(6):870-877.

3. Suicide prevention: data and statistics. Centers for Disease Control and Prevention. 2024. Accessed March 30, 2026. https://www.cdc.gov/suicide/facts/data.html

4. John A, DelPozo-Banos M, Gunnell D, et al. Contacts with primary and secondary healthcare prior to suicide: case–control whole-population-based study using person-level linked routine data in Wales, UK, 2000–2017. Br J Psychiatry. 2020;217(6):717-724.

5. Badre N, Compton J. The cult of the suicide risk assessment. Clinical Psychiatry News. 2023. https://www.mdedge.com/psychiatry/article/265143/depression/cult-suicide-risk-assessment

6. The Joint Commission. National patient safety goals effective January 2026 for the hospital program. 2025. Accessed March 30, 2026. https://www.jointcommission.org/en-us/standards/national-patient-safety-goals

7. Bentley KH, Kennedy CJ, Khadse PN, et al. Clinician suicide risk assessment for prediction of suicide attempt in a large health care system. JAMA Psychiatry. 2025;82(6):599-608.

8. Wilimitis D, Turer RW, Ripperger M, et al. Integration of face-to-face screening with real-time machine learning to predict risk of suicide among adults. JAMA Netw Open. 2022;5(5):e221209.

9. McCarthy JF, Cooper SA, Dent KR, et al. Evaluation of the recovery engagement and coordination for health–veterans enhanced treatment suicide risk modeling clinical program in the veterans health administration. JAMA Netw Open. 2021;4(10):e2129900.

10. Banja J. The normalization of deviance in healthcare delivery. Bus Horiz. 2010;53(2):139-148.

11. Obermeyer Z, Powers B, Vogeli C, Mullainathan S. Dissecting racial bias in an algorithm used to manage the health of populations. Science. 2019;366(6464):447-453.

12. Hasanzadeh F, Josephson CB, Waters G, et al. Bias recognition and mitigation strategies in artificial intelligence healthcare applications. NPJ Digit Med. 2025;8(1):154.

13. Abdelmoteleb S, Ghallab M, IsHak WW. Evaluating the ability of artificial intelligence to predict suicide: a systematic review of reviews. J Affect Disord. 2025;382:525-539.

14. Bell SK, Delbanco T, Elmore JG, et al. Frequency and types of patient-reported errors in electronic health record ambulatory care notes. JAMA Netw Open. 2020;3(6):e205867.

15. Jaźwińska K, Chandrasekar A. AI search has a citation problem. Columbia Journalism Review. March 6, 2025. Accessed March 30, 2026. https://www.cjr.org/tow_center/we-compared-eight-ai-search-engines-theyre-all-bad-at-citing-news.php

16. Amodei D. The urgency of interpretability. April 2025. Accessed March 30, 2026. https://www.darioamodei.com/post/the-urgency-of-interpretability

17. Ryan K, Yang HJ, Kim B, Kim JP. Assessing the impact of AI on physician decision-making for mental health treatment in primary care. npj Mental Health Research. 2025;4(1):16.

18. Heudel PE, Crochet H, Filori Q, et al. Artificial intelligence in medicine: a scoping review of the risk of deskilling and loss of expertise among physicians. ESMO Real World Data and Digital Oncology. 2026;11:100693.

19. Ancker JS, Edwards A, Nosal S, et al. Effects of workload, work complexity, and repeated alerts on alert fatigue in a clinical decision support system. BMC Med Inform Decis Mak. 2017;17(1):36.
