Connected Speech-Based Cognitive Assessment in Chinese and English

18 Jun 2024 | Saturnino Luz, Sofia De La Fuente Garcia, Fasih Haider, Davida Fromm, Brian MacWhinney, Alyssa Lanz, Ya-Ning Chang, Chia-Ju Chou, Yi-Chien Liu
This paper presents a novel benchmark dataset and prediction tasks for assessing cognitive function through analysis of connected speech in Mandarin Chinese and English. The dataset includes speech samples and clinical information for individuals with varying levels of cognitive impairment and for those with normal cognition. The data were matched by age and sex using propensity score analysis to ensure balance and representativeness in model training. The prediction tasks are mild cognitive impairment (MCI) diagnosis and cognitive test score prediction. The framework encourages the development of language-independent approaches to speech-based cognitive assessment. Baseline models using language-agnostic and comparable features achieved an unweighted average recall of 59.2% in diagnosis and a root mean squared error of 2.89 in score prediction.
The dataset serves as a benchmark for speech processing and machine learning tasks related to detecting cognitive decline through connected speech analysis, and it formed the basis of the TAUKADIAL Challenge at Interspeech 2024. It contains speech samples from participants engaged in picture description tasks, along with clinical and neuropsychological test data. It is age- and gender-balanced, with 507 speech samples (261 Chinese and 246 English) totaling 528 minutes, and it is available to the research community via DementiaBank. The study aimed to assess speech as a behavioral marker of cognition in a global health context by investigating its application to modeling cognitive health indicators in Chinese and English. Participants were aged 60-90 years and had memory concerns; each was classified as having either normal cognition (NC) or MCI based on clinical criteria. The data are used for both classification and regression tasks.
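The two metrics reported above, unweighted average recall and root mean squared error, can be computed as in the following minimal sketch. It is not the authors' evaluation code: the function names and the toy labels and scores are invented for illustration, and only the standard definitions of the metrics are assumed.

```python
import math
from collections import defaultdict


def unweighted_average_recall(y_true, y_pred):
    """Mean of the per-class recalls, so each class counts equally
    regardless of how many samples it has."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for truth, pred in zip(y_true, y_pred):
        totals[truth] += 1
        if truth == pred:
            hits[truth] += 1
    return sum(hits[c] / totals[c] for c in totals) / len(totals)


def root_mean_squared_error(y_true, y_pred):
    """Square root of the mean squared difference between targets and predictions."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


# Toy values, invented for illustration only (not taken from the dataset).
labels = ["NC", "NC", "NC", "MCI", "MCI"]
preds = ["NC", "NC", "MCI", "MCI", "NC"]
uar = unweighted_average_recall(labels, preds)  # NC recall 2/3, MCI recall 1/2

scores_true = [28, 25, 30]
scores_pred = [27, 27, 30]
rmse = root_mean_squared_error(scores_true, scores_pred)

print(f"UAR = {uar:.4f}, RMSE = {rmse:.4f}")
```

Because UAR averages the recall of each class rather than pooling all samples, a model that always predicts the majority class scores only 50% on a two-class task, which is the natural reference point for the reported baseline.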
The classification task involves distinguishing NC speech from MCI speech, while the regression task involves predicting Mini-Mental State Examination (MMSE) scores from connected speech data. The study used a combination of acoustic and linguistic features for model training. Acoustic features included eGeMAPS and wav2vec features, while linguistic features included token and type counts, type-to-token ratio, and other linguistic metrics. The models were multi-layer perceptrons (MLPs) trained with the Adam solver and ReLU activation. The baseline model achieved a test-data unweighted average recall (UAR) of 59.18% and a root mean squared error (RMSE) of 2.89 for MMSE prediction. Results were similar across the two languages, with English achieving a UAR of 60.00% and Chinese a UAR of 60.04%. The study highlights the potential of speech-based cognitive assessment as a tool for early detection and monitoring of cognitive impairment. The dataset and models provide a valuable resource for the research community in the field of cross-lingual cognitive assessment.
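The token count, type count, and type-to-token ratio named among the linguistic features can be illustrated with a short sketch. This is an assumption-laden example, not the paper's feature extractor: the function name `lexical_features` is hypothetical, the transcript is invented, and the input is assumed to be already tokenized (for Chinese this would require a word segmenter, which is outside the scope of the sketch).

```python
def lexical_features(tokens):
    """Return token count, type count, and type-to-token ratio
    for one pre-tokenized transcript."""
    n_tokens = len(tokens)
    n_types = len(set(tokens))  # distinct word forms
    ttr = n_types / n_tokens if n_tokens else 0.0  # guard against empty input
    return {"tokens": n_tokens, "types": n_types, "ttr": ttr}


# Invented example transcript, for illustration only.
transcript = "the dog runs and the cat runs after the dog".split()
features = lexical_features(transcript)
print(features)
```

Counts of this kind depend only on a token sequence, not on any language-specific lexicon, which is one reason they are candidates for features that remain comparable across Chinese and English.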