Syllable and sound are still the primary means for taking the measure of thoughts and emotions, the building blocks and foundations of psychiatric work.
One of my most memorable patients was a graduate student studying computer science. When I prompted her about her symptoms or medication adverse effects, I received staccato, unelaborated responses. But when I brought up programming, I found myself listening in fascinated confusion for minutes on end as she spoke about computer architecture, machine learning, and her views on cybersecurity—jumping from idea to idea and sometimes seeming to talk about multiple subjects at once. I frequently struggled to follow her thoughts, which were equal parts brilliant and bewildering. What if there were a way to turn this waterfall of words—which I could barely understand—into useful data?
Indeed, the words, phrases, sentences, and dialogues from our patients say so much. So do the breaths in between, the voice and its dynamics, and the cadence and tonality used. These are the building blocks and foundations of our work as psychiatrists, whether we are an analyst in an armchair, dissecting and reconstructing a patient’s narrative, or a biological psychiatrist with pen in hand, translating the patient’s report into scales and delving for correlates in the brain. Although powerful and incisive, the most advanced methods of biological psychiatry (from neuroimaging and magnetoencephalography to induced pluripotent stem cells) are still far from overthrowing the primacy of patient reports. Syllable and sound are still the primary means for taking the measure of thoughts and emotions.
Speaking of Schizophrenia
Speech and language disturbances have been recognized as core components of schizophrenia since the fledgling days of modern psychiatry. In his canonical description of dementia praecox, which is often credited as the first modern characterization of schizophrenia, Emil Kraepelin, MD, described both positive symptoms (eg, incoherence, derailment, stereotypy, neologisms) and negative symptoms (eg, mutism) associated with speech (Figure 1).1
Moreover, it was observed that speech abnormalities in schizophrenia were not limited to content, but extended to prosody and vocal qualities: “The cadence often lacks the risings and fallings, the melodies of speech.”1 Kraepelin and others extrapolated these speech and language disturbances to indicate not only an impairment in communication, but also fundamentally disordered thought—at times derailment or loosening of associations, at times a poverty of thought.1,2
Nancy Andreasen, MD, PhD, was among the first to formalize the assessment and measurement of thought disorder with her 1986 Scale for the Assessment of Thought, Language, and Communication (TLC).3 The TLC standardized definitions and provided anchors for clinician ratings of 18 items of speech disturbance, each focused on the content of speech. Items measure negative thought disorder (eg, poverty of speech, poverty of content of speech) and positive thought disorder (eg, derailment, pressure of speech, incoherence). With the TLC and subsequent scales, Andreasen and other researchers were able to quantify speech disturbance in patients. They found that many features were shared with speech from patients in manic episodes, although mania was associated with greater positive thought disorder and schizophrenia with greater negative thought disorder.
AI’s Role in Measuring Speech
Through advances in machine learning and artificial intelligence, we have new tools for taking the measure of speech and thought disturbance. Methods for extracting information from speech can be roughly divided into 2 areas. First, acoustic analysis extracts and quantifies information on pitch, amplitude, and vocal qualities on a millisecond-by-millisecond scale. Second, lexical analysis focuses on the content of speech, including word choice, grammar, the ideas being represented, and the relationship between words and ideas.4 The term “natural language processing” (NLP) describes computational approaches that use artificial intelligence to extract information from spoken or written language, or to produce naturalistic language.5 The Table summarizes important measures produced by acoustic and lexical analyses and how they relate to clinical observations of speech disturbance in schizophrenia.
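To make the acoustic side concrete, here is a minimal sketch of how pitch and amplitude might be measured on one short frame of audio. It is an illustration only, not the pipeline used in any study cited here: the function names, the 75 to 400 Hz search range, and the synthetic 200 Hz tone that stands in for a voice are all assumptions made for the example.

```python
import math

def rms_amplitude(frame):
    """Root-mean-square amplitude of one frame of audio samples."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def estimate_pitch(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate pitch (Hz) by finding the lag with the strongest autocorrelation.

    fmin and fmax are assumed bounds for the search, not values from the article.
    """
    min_lag = int(sample_rate / fmax)  # shortest period considered
    max_lag = int(sample_rate / fmin)  # longest period considered

    def autocorr(lag):
        # Similarity between the frame and a copy of itself shifted by `lag` samples.
        return sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag

# Synthetic stand-in for a voiced sound: 40 ms of a 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * n / sample_rate) for n in range(320)]

pitch_hz = estimate_pitch(frame, sample_rate)
loudness = rms_amplitude(frame)
print(f"pitch ~ {pitch_hz:.1f} Hz, RMS amplitude ~ {loudness:.3f}")
```

Repeating this over successive frames of a recording would give pitch and amplitude contours over time, from which summary measures such as pitch variability could be derived.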
Elvevåg and colleagues were the first to apply NLP to schizophrenia.6 In a 2007 study, they used latent semantic analysis to assign vector representations, like addresses for words, to speech content from individuals with schizophrenia. Elvevåg et al were able to quantify greater gaps across words in speech from patients with thought disorder, and to compare those with gaps (if any) in individuals without the disorder. Clinically, these gaps can be interpreted as indicative of derailment or loosening of associations. This work represented a leap forward in our field: For the first time, scientists were able to objectively measure the space between thoughts.
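The idea of measuring the gap between consecutive stretches of speech can be sketched in a few lines. The example below is a simplified stand-in, not the method of Elvevåg et al: latent semantic analysis derives its vectors from a large text corpus, whereas this sketch uses plain word-count vectors, so it only detects gaps when sentences share no words. The function names and sample sentences are invented for illustration.

```python
import math
import re
from collections import Counter

def sentence_vector(sentence):
    """Toy stand-in for a semantic vector: counts of each word in the sentence."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))

def cosine(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[w] * v[w] for w in u if w in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def coherence_scores(sentences):
    """Similarity of each sentence to the one that follows it."""
    vectors = [sentence_vector(s) for s in sentences]
    return [cosine(a, b) for a, b in zip(vectors, vectors[1:])]

connected = [
    "I walked the dog in the park.",
    "The dog chased a ball in the park.",
    "Then the dog and I walked home.",
]
derailed = [
    "I walked the dog in the park.",
    "Computers store numbers as bits.",
    "My aunt bakes bread on Sundays.",
]

print("connected:", coherence_scores(connected))
print("derailed: ", coherence_scores(derailed))
```

Lower scores between neighboring sentences correspond to larger leaps in content. With corpus-derived vectors in place of word counts, the same comparison can register a leap even when two sentences share vocabulary, or continuity when they do not.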
Corcoran et al later used latent semantic analysis to show that reduced coherence (ie, greater leaps in content) predicted which youth at clinical risk for psychosis would later develop schizophrenia spectrum disorders and which would not.7 Rezaii et al accomplished the same goal by quantifying the density and richness of ideas conveyed in speech.8 Graph theory, which represents the relationships among words and corresponding ideas, was used by Mota et al to illustrate the loosely connected verbose speech of patients with mania as compared with the impoverished disconnected content of those with schizophrenia.9 Birnbaum et al applied NLP techniques to social media communication and found that word usage on Facebook posts was able to predict subsequent rehospitalization.10
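A speech graph of the kind described above can be illustrated with a small sketch in which each distinct word is a node and each step from one word to the next is a directed edge. The measures computed here (node count, unique edges, repeated edges, density) are an illustrative selection and should not be read as the exact attribute set used by Mota et al; the function name and sample sentence are assumptions.

```python
import re
from collections import Counter

def speech_graph_measures(text):
    """Build a word-to-next-word directed graph and return a few summary measures."""
    words = re.findall(r"[a-z']+", text.lower())
    # Each consecutive word pair is a directed edge; the Counter tracks how often it recurs.
    transitions = Counter(zip(words, words[1:]))
    node_count = len(set(words))
    unique_edges = len(transitions)
    repeated_edges = sum(count - 1 for count in transitions.values())
    # Density: unique edges relative to all possible directed edges (self-loops included).
    density = unique_edges / (node_count ** 2) if node_count else 0.0
    return {
        "nodes": node_count,
        "unique_edges": unique_edges,
        "repeated_edges": repeated_edges,
        "density": density,
    }

sample = "the dog saw the cat and the cat saw the dog"
measures = speech_graph_measures(sample)
print(measures)
```

In this framing, speech that keeps returning to the same few words produces a small graph with many repeated edges, while speech that ranges widely produces more nodes and unique edges for the same number of words.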
Recently, my colleagues and I compared traditional clinical rating scales with NLP methods for differentiating speech in individuals with schizophrenia spectrum disorders from that of comparison participants without schizophrenia.11 The TLC scale was used to rate positive and negative thought disorder symptoms. Then, we used NLP methods to extract lexical information on multiple levels: individual words, parts of speech (eg, nouns, adverbs, adjectives), and sentence-to-sentence coherence. When classifying participants into either the schizophrenia or healthy comparison group, we found machine learning algorithms performed significantly better using NLP-derived features (87% accuracy) than clinical ratings (68% accuracy), suggesting that important information is being captured by NLP. In addition, we found preliminary evidence that individuals with schizophrenia are much more likely to speak partial words (eg, “I went to th- the store” or “I guess we sho-… Let’s go out”), which, to our knowledge, has not been reported previously.
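Partial words are one of the simpler features to count once a transcript exists. The sketch below assumes that the transcriber marks a cut-off word with a trailing hyphen, as in the examples above; that convention, the function name, and the per-100-words rate are assumptions made for illustration, not a description of how the study extracted this feature.

```python
import re

# A run of letters ending in a hyphen that is NOT followed by another letter,
# so "th-" matches but the first half of "well-known" does not.
PARTIAL_WORD = re.compile(r"\b[A-Za-z]+-(?![A-Za-z])")
WORD = re.compile(r"[A-Za-z']+")

def partial_word_rate(transcript):
    """Return the partial words found and their rate per 100 word tokens."""
    partials = PARTIAL_WORD.findall(transcript)
    total_words = len(WORD.findall(transcript))
    rate = 100 * len(partials) / total_words if total_words else 0.0
    return partials, rate

for utterance in [
    "I went to th- the store",
    "I guess we sho-… Let's go out",
    "That is a well-known fact",
]:
    found, rate = partial_word_rate(utterance)
    print(f"{utterance!r}: partial words {found}, {rate:.1f} per 100 words")
```

A count like this could then sit alongside word-level, part-of-speech, and coherence features as one input to a classifier.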
In another study, we combined acoustic and lexical speech features in machine learning models to predict separate items from the TLC.12 Importantly, we noted that speech disturbance in schizophrenia is likely multifaceted, and it should not be treated as a single uniform entity.
The Future of Speech Biomarkers
Based on their ability to measure thoughts in an automated, fast, objective, and inexpensive manner, speech biomarkers could fundamentally alter clinical psychiatric practice. As illustrated in Figure 2, speech can be considered the visible extension of the changes in thought and brain circuitry that are at the core of schizophrenia. By mapping these connections, and leveraging the power of artificial intelligence and human language processing technology, I believe that we will develop a way to use language to scan the brain and offer personalized medicine.
Imagine a patient walking into an intake appointment and describing their experiences and circumstances. The psychiatrist hands the patient a tablet, which leads them through some brief speaking tasks. A spinning wheel appears for a few seconds, followed by numbers and graphs that reflect their symptoms and their related brain circuitry correlates. This brings up a menu that includes recommended digital therapeutics, pharmacology, and psychotherapy, and offers ways to track progress and to warn of relapses before they occur. The scenario would not be very different from checking cancer markers in oncology, autoantibodies in rheumatology, or myriad existing paradigms in other medical fields.
Already, early evidence suggests that speech biomarkers reflect underlying changes in the brain. In work by Palaniyappan et al on the speech of individuals with mania or schizophrenia, decreased cohesiveness between concepts was significantly correlated with measures of resting state brain connectivity and gyrification (folding patterns of the brain).13 Perhaps, with additional research, we will be able to link specific speech markers to changes in specific circuits.
It is important to remember that our mission is the healing and well-being of individuals and families. This is not technology for the sake of novelty, no matter how nifty the gadget. Nor should the availability of brain measures mandate reliance on pharmacology over psychosocial interventions; quite the opposite. Automated language processing can be harnessed to measure changes in thought and brain structure on a personalized level. This layer of technology should not occlude the individual but rather allow clinicians to delve deeper into each unique case.
Dr Tang is an assistant professor of psychiatry at the Feinstein Institutes for Medical Research and the Donald and Barbara Zucker School of Medicine at Hofstra/Northwell. She is the cofounder and chief science officer of North Shore Therapeutics. Disclosures: Dr Tang serves as a consultant for Winterlight Labs.
1. Kraepelin E. Dementia Praecox and Paraphrenia. Livingstone; 1919.
2. Vygotsky L. Thought in schizophrenia. Arch Neurol Psychiatry. 1934;31:1062-1077.
3. Andreasen NC. Scale for the assessment of thought, language, and communication (TLC). Schizophr Bull. 1986;12(3):473-482.
4. Cho S, Nevler N, Shellikeri S, et al. Lexical and acoustic characteristics of young and older healthy adults. J Speech Lang Hear Res. 2021;64(2):302-314.
5. Corcoran CM, Mittal VA, Bearden CE, et al. Language as a biomarker for psychosis: a natural language processing approach. Schizophr Res. 2020;226:158-166.
6. Elvevåg B, Foltz PW, Weinberger DR, Goldberg TE. Quantifying incoherence in speech: an automated methodology and novel application to schizophrenia. Schizophr Res. 2007;93(1-3):304-316.
7. Corcoran CM, Carrillo F, Fernández-Slezak D, et al. Prediction of psychosis across protocols and risk cohorts using automated language analysis. World Psychiatry. 2018;17(1):67-75.
8. Rezaii N, Walker E, Wolff P. A machine learning approach to predicting psychosis using semantic density and latent content analysis. NPJ Schizophr. 2019;5(1):9.
9. Mota NB, Vasconcelos NA, Lemos N, et al. Speech graphs provide a quantitative measure of thought disorder in psychosis. PLoS One. 2012;7(4):e34928.
10. Birnbaum ML, Ernala SK, Rizvi AF, et al. Detecting relapse in youth with psychotic disorders utilizing patient-generated and patient-contributed digital data from Facebook. NPJ Schizophr. 2019;5(1):17.
11. Tang SX, Kriz R, Cho S, et al. Natural language processing methods are sensitive to sub-clinical linguistic differences in schizophrenia spectrum disorders. NPJ Schizophr. 2021;7(1):25.
12. Krell R, Tang W, Hänsel K, et al. Lexical and acoustic correlates of clinical speech disturbance in schizophrenia. Workshop 35th AAAI Conf Intell. 2021:9.
13. Palaniyappan L, Bezerra Mota N, Oowise S, et al. Speech structure links the neural and socio-behavioural correlates of psychotic disorders. Prog Neuropsychopharmacol Biol Psychiatry. 2019;88:112-120.