Careless Whisper: Speech-to-Text Hallucination Harms

June 3–6, 2024 | ALLISON KOENECKE, ANNA SEO GYEONG CHOI, KATELYN X. MEI, HILKE SCHELLMANN, MONA SLOANE
Automated speech-to-text systems aim to transcribe audio accurately and are increasingly used in everyday settings such as voice assistants and customer interactions. This study evaluates OpenAI's Whisper, a state-of-the-art speech-to-text service, for hallucinations: phrases or sentences in the transcript that were never spoken in the audio.

The study analyzed 13,140 audio segments from TalkBank's AphasiaBank, covering 437 participants (390 white, 27 African American, and 20 of other races). Most transcriptions were accurate, but approximately 1% contained hallucinated content; in the segments analyzed, 1.4% of transcriptions contained hallucinations. A thematic analysis found that 38% of hallucinations include harmful content, such as perpetuating violence, making inaccurate associations, or implying false authority. Hallucinations were also more likely in transcriptions of speakers with aphasia (a language disorder that affects speech), possibly because their speech contains longer non-vocal durations. The researchers also tested other speech-to-text services, such as Google's, and found no hallucinations in those systems.

The findings suggest that hallucinations can have serious consequences, including misrepresentation, inaccurate information, and potential harm. The researchers call for improvements to Whisper that reduce hallucinations, for greater awareness of potential biases in downstream applications, and for speech-to-text technology that is fair and accurate, particularly for individuals with speech impairments.
The study calls for further research and action to address these issues and improve the reliability and fairness of speech-to-text systems.