Visual speech speeds up the neural processing of auditory speech

January 25, 2005 | Virginie van Wassenhove, Ken W. Grant, and David Poeppel
Visual speech enhances the neural processing of auditory speech by speeding up cortical responses within 100 ms of signal onset. This study shows that visual speech leads to articulator-specific temporal facilitation and a nonspecific amplitude reduction in auditory event-related potentials (ERPs). The latency facilitation depends on the visual signal's ability to predict the auditory target, suggesting that abstract internal representations constrain subsequent speech processing. This supports an "analysis-by-synthesis" mechanism in auditory-visual (AV) speech perception.

Three experiments combining EEG and behavioral tasks examined the effects of visual speech on the auditory ERP components N1 and P2. Visual speech reduced N1 and P2 amplitudes relative to auditory-only conditions, and this reduction was not simply additive (the AV response differed from the sum of the unimodal responses), indicating genuine multisensory integration. Visual speech also shortened ERP latencies, with the size of the facilitation scaling with the predictability of the visual input: visually identifiable articulatory targets produced the greatest temporal facilitation, whereas no facilitation was observed for McGurk fusion, in which visual and auditory inputs conflict. This pattern suggests that visual attention modulates auditory processing.

The findings underscore the ecological validity of AV speech and the role of predictive coding in multisensory integration. Because visual speech precedes the auditory signal, it provides predictive information that lets the auditory system process speech more efficiently. The results point to a two-stage account of AV speech processing: an early feature-based stage in which visual information predicts the auditory input, followed by a later perceptual stage in which the system operates in a bimodal mode. Integration in AV speech thus does not rest solely on general multisensory principles but follows speech-specific rules and time scales, with visual attention enhancing the biasing effect of weak visual predictors. Overall, the study provides evidence for predictive coding and analysis-by-synthesis in auditory-visual speech perception.
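The additive-model logic behind the amplitude result, comparing the ERP to audiovisual speech against the sum of the unimodal auditory and visual ERPs, can be illustrated with a short simulation. The sketch below is not the authors' analysis code: it uses synthetic waveforms, and all amplitudes, latencies, and peak-picking windows are assumptions chosen only to mimic the qualitative pattern reported (reduced and earlier N1/P2 under AV relative to A + V).

```python
import numpy as np

# Illustrative sketch (synthetic data, not the study's): the additive-model
# test compares the audiovisual ERP (AV) with the sum of the unimodal
# responses (A + V). If AV == A + V, the result is explainable by
# independent unimodal processing; deviations, such as a reduced and
# earlier N1/P2, indicate multisensory integration.

FS = 1000                          # sampling rate in Hz (assumed)
t = np.arange(-0.1, 0.4, 1 / FS)   # epoch: -100 to 400 ms around sound onset

def gauss(t, mu, sigma, amp):
    """A Gaussian bump standing in for an ERP component."""
    return amp * np.exp(-0.5 * ((t - mu) / sigma) ** 2)

# Synthetic grand-average ERPs (microvolts, purely illustrative):
# auditory-only: N1 (~100 ms, negative) and P2 (~200 ms, positive)
erp_a = gauss(t, 0.100, 0.015, -5.0) + gauss(t, 0.200, 0.025, 4.0)
# visual-only: small, slow deflection
erp_v = gauss(t, 0.150, 0.060, -0.8)
# audiovisual: N1/P2 reduced in amplitude and ~15 ms earlier,
# qualitatively matching the reported effects
erp_av = gauss(t, 0.085, 0.015, -3.8) + gauss(t, 0.185, 0.025, 3.1)

def peak(erp, t, window, polarity):
    """Return (latency_s, amplitude_uV) of the extremum inside a window."""
    mask = (t >= window[0]) & (t <= window[1])
    i = np.argmax(erp[mask] * polarity)  # sign flip lets argmax find troughs
    return t[mask][i], erp[mask][i]

model = erp_a + erp_v                    # additive-model prediction

for name, erp in [("A+V model", model), ("AV observed", erp_av)]:
    n1_lat, n1_amp = peak(erp, t, (0.060, 0.140), polarity=-1)
    p2_lat, p2_amp = peak(erp, t, (0.150, 0.250), polarity=+1)
    print(f"{name}: N1 {n1_lat*1e3:.0f} ms / {n1_amp:.1f} uV, "
          f"P2 {p2_lat*1e3:.0f} ms / {p2_amp:.1f} uV")
```

Running the sketch prints earlier and smaller N1/P2 peaks for the simulated AV response than for the A + V prediction, which is the signature of integration the study reports; with real data the same comparison would be run on recorded grand averages rather than synthetic bumps.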