Basics of Digital Speech Recognition Technology (SRT)

The basic components of a speech recognition system include the:
  • Microphone
  • Sound card
  • Recognition engine
  • Vocabulary
  • Speaker profile
  • Language model
SRT is a process by which a computer transcribes verbal dictation directly into text, eliminating the need for human transcription. The microphone and sound card convert analog human speech into a digital waveform. A noise-canceling microphone is often preferred because it reduces interference from background noise.
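To make the idea of a digital waveform concrete, the short Python sketch below opens a previously recorded dictation file and prints a few of its sample values. The file name dictation.wav and the 16-bit mono format are assumptions for illustration only, not part of any particular SRT product.

```python
# Minimal sketch: inspect the digital waveform produced once the microphone
# and sound card have digitized speech. Assumes a 16-bit mono WAV file named
# "dictation.wav" exists; the file name is illustrative only.
import wave
import struct

with wave.open("dictation.wav", "rb") as wav_file:
    sample_rate = wav_file.getframerate()      # samples per second, e.g., 16000
    n_frames = wav_file.getnframes()           # total number of samples
    raw_bytes = wav_file.readframes(n_frames)  # the digitized speech

# Each 16-bit sample is one point on the digital waveform.
samples = struct.unpack("<" + "h" * (len(raw_bytes) // 2), raw_bytes)

print(f"Sample rate: {sample_rate} Hz")
print(f"Duration: {n_frames / sample_rate:.2f} seconds")
print(f"First ten waveform values: {samples[:10]}")
```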
A recognition engine is integrated software that contains medical dictionaries and can therefore recognize words specific to the anatomical pathology laboratory. Some engines can recognize up to 300,000 medical terms. This vocabulary, in conjunction with a high-speed processor, allows the user to speak at a natural rate while dictating. Some engines can also understand verbal commands to open applications or edit text.
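The sketch below illustrates, at a conceptual level, two of the engine's jobs described above: checking recognized words against a specialty vocabulary and dispatching simple verbal commands. The vocabulary entries, command phrases, and function names are hypothetical placeholders, not the API of a real recognition engine.

```python
# Conceptual sketch only: a real recognition engine relies on acoustic and
# statistical models, not simple set lookups. All names here are hypothetical.

MEDICAL_VOCABULARY = {"adenocarcinoma", "hematoxylin", "eosin", "margin", "biopsy"}

VERBAL_COMMANDS = {
    "new paragraph": lambda text: text + "\n\n",
    "delete that": lambda text: " ".join(text.split()[:-1]),
}

def process_utterance(recognized_words, transcript):
    """Append recognized words to the transcript, applying verbal commands."""
    phrase = " ".join(recognized_words).lower()
    if phrase in VERBAL_COMMANDS:
        return VERBAL_COMMANDS[phrase](transcript)
    for word in recognized_words:
        if word.lower() not in MEDICAL_VOCABULARY:
            print(f"'{word}' not in the specialty vocabulary; flag for review")
        transcript += word + " "
    return transcript

transcript = ""
transcript = process_utterance(["biopsy", "margin", "adenocarcinoma"], transcript)
transcript = process_utterance(["new", "paragraph"], transcript)
print(repr(transcript))
```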
For the most accurate transcription, a speaker profile needs to be created so that the user's speech can be matched against the system's vocabulary. The purpose of the profile is to account for differing accents and ways of pronouncing certain terms. Creating a speaker profile requires reading a 6,000-word document filled with words commonly used in a pathology laboratory. The user reads that document into the system and then edits any misinterpretations. With a unique speaker profile created, the engine should produce an accurate transcription for that individual.
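As a rough illustration of the enrollment step, the sketch below compares what the engine recognized against the known enrollment passage and collects the mismatches the user would then correct. An actual speaker profile stores acoustic adaptation data rather than word pairs, so treat this as a conceptual simplification; the sentences used here are made up.

```python
# Conceptual sketch of profile enrollment: compare the passage the user read
# aloud with what the engine recognized, and record mismatches so the profile
# can be adapted. A real profile adapts acoustic models; this word-level
# comparison is only an illustration.

def build_correction_list(enrollment_text, recognized_text):
    """Return (spoken_word, recognized_word) pairs where recognition differed."""
    expected = enrollment_text.lower().split()
    heard = recognized_text.lower().split()
    corrections = []
    for spoken, recognized in zip(expected, heard):
        if spoken != recognized:
            corrections.append((spoken, recognized))
    return corrections

enrollment_passage = "the specimen shows invasive ductal carcinoma with clear margins"
engine_output = "the specimen shows invasive ductile carcinoma with clear margins"

for spoken, recognized in build_correction_list(enrollment_passage, engine_output):
    print(f"User said '{spoken}' but the engine heard '{recognized}'")
```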
The recognition engine can also be integrated with language models. Language models rely on cloud-based processing and ongoing data collection projects to continuously improve their ability to recognize and understand a wider variety of words, phrases, specialized medical terminology, accents, and languages.
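To give a sense of what a language model contributes, the sketch below builds a tiny bigram model from a few example report sentences and uses it to prefer the more plausible of two candidate transcriptions. Production language models are vastly larger and often cloud-hosted, so this is only a toy illustration with made-up training sentences.

```python
# Toy bigram language model: count how often word pairs occur in a small
# training corpus and score candidate transcriptions by those counts.
# Real SRT language models are trained on enormous corpora, often in the cloud.
from collections import Counter

training_sentences = [
    "the tumor shows clear surgical margins",
    "the specimen shows invasive carcinoma",
    "the margins are free of tumor",
]

bigram_counts = Counter()
for sentence in training_sentences:
    words = sentence.split()
    for first, second in zip(words, words[1:]):
        bigram_counts[(first, second)] += 1

def score(candidate):
    """Higher score = more of the candidate's word pairs were seen in training."""
    words = candidate.split()
    return sum(bigram_counts[(a, b)] for a, b in zip(words, words[1:]))

candidates = ["the tumor shows clear margins", "the two more shows clear margins"]
best = max(candidates, key=score)
print(f"Language model prefers: '{best}'")
```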

General architecture of Digital Speech Recognition Systems