Computer Speech Recognition in Psychiatry
As her patient leaves the consulting room, Susan Roth, M.D., picks up her computer's microphone and begins dictating. "Wake up. Open template recurrent major depression. Patient identification: Mr. Johnson is a 64-year-old married white male. Chief complaint: difficulty sleeping, loss of appetite and depressed mood with suicidal ideation for the last three weeks."
As Roth speaks, words appear on her portable computer screen to complete the patient's intake evaluation. Adding an electronic signature, she prints out a hard copy of her evaluation and saves the medical record to her hard drive.
Using a voice-activated command to her computer, Roth faxes her findings and a cover letter to the referring physician and sends prescription orders to the patient's pharmacy.
Continuous Speech Recognition
A remarkable technology for personal computers, called continuous speech recognition, is emerging for clinical use. Through continuous speech recognition, dictation can be transcribed directly into a medical report. As the clinician speaks into a microphone, the computer "recognizes" his or her speech and "types" it out immediately.
Early speech recognition systems were based on a technology called discrete speech recognition. The user had to pause after each word for the computer to recognize and translate the sounds into text. As computer processor speed has grown exponentially and speech recognition systems have improved, continuous speech recognition has supplanted discrete speech recognition.
Continuous speech recognition lets the user speak naturally into the computer at a normal pace to dictate patient assessments, follow-up notes, consultation letters, medication management notes and even the patient's prescriptions. One can also dictate memos, notes or other documents directly into Microsoft Word, enter numbers into spreadsheets and execute common computer commands via voice, rather than by keyboard or mouse.
Hardware and Setup
Speech recognition programs require specific computer hardware configurations. The minimum system requirements for good performance are a 200 MHz Pentium computer with 64 MB of RAM and a CD-ROM drive for software installation. The sound card standard is SoundBlaster 16-bit compatible, but 32-bit works just as well. The wrong sound card or hardware configuration can adversely affect the accuracy of speech recognition.
A headset microphone is often supplied with the speech recognition software, but most physicians find headset microphones rather unwieldy. An innovative alternative is the SpeechMike from Philips, a handheld computer microphone with a built-in trackball and speaker. The SpeechMike Professional adds dictation-specific controls such as microphone on/off, play and pause buttons.
For portable computer users, the computer's compatibility with or certification for speech recognition software is essential. I recommend the Micron TransPort configured with a 266 MHz Pentium processor and 64 MB of RAM, a solid and well-built portable computer. Micron Electronics Inc. has endorsed the emerging technology of continuous speech recognition, including Dragon NaturallySpeaking, with all new portables.
Before speech recognition software can be used, the microphone must be plugged into the computer's sound card and the software installed. Although all speech recognition products can be used right after initial setup, the best results come once the system has been trained to recognize the user's own speech patterns, which involves reading a passage of text aloud for about 30 minutes. The program continues to improve its recognition accuracy through repeated dictation.
Speech recognition technology has tremendous potential for medical information systems, facilitating the transition from manual charting on a paper-based medical record to the long-awaited computerized patient record. The National Academy of Sciences Institute of Medicine included speech recognition in its 1991 "gold standard" study on the computer-based patient record (CPR), which listed the following attributes of the ideal CPR:
- The ideal CPR supports direct physician entry by dictation into the patient record.
- The ideal CPR further supports integrated and interfaced voice dictation via icon on the clinical workstation.
- The ideal CPR capitalizes on the repetitive nature of medicine by using templates at a point-of-care facility.
There are currently five suppliers of speech recognition systems: IBM, Philips, Kurzweil, Dragon Systems and Lernout & Hauspie. IBM, Philips and Kurzweil have introduced medical speech recognition systems over the last two years, the first of which was designed for radiology. Only two companies, Voice Input Technologies and Voice Activated Systems Technologies, have developed systems for psychiatric practice; these are based on the Philips and Dragon Systems products.
IBM MedSpeak, released in September 1996, was the first real-time continuous speech package available for physicians. IBM has released two professional modules, MedSpeak/Radiology and MedSpeak/Pathology, but has no current plans for a psychiatry-specific system. IBM also offers two consumer speech recognition packages, ViaVoice and ViaVoice Gold. I do not recommend these consumer products as stand-alone medical dictation modules, although they may help give novices a feel for speech recognition on a personal computer.