Advancing EEG-Based Gaze Prediction Using Depthwise Separable Convolution and Enhanced Pre-Processing

6 Aug 2024 | Matthew L Key*, Tural Mehtiyev*, and Xiaodong Qu
This study introduces a novel method, EEG Deeper Clustered Vision Transformer (EEG-DCViT), which combines depthwise separable convolutional neural networks (DS-CNNs) with vision transformers to enhance EEG-based gaze prediction. The approach incorporates a pre-processing strategy involving data clustering, leading to a significant improvement in performance. The model achieves a Root Mean Square Error (RMSE) of 51.6 mm, establishing a new benchmark in EEG-based applications. The research addresses two key questions: how depthwise separable convolution affects predictive accuracy in EEG-based gaze prediction models, and how advancements in pre-processing techniques influence model accuracy. The study evaluates the effectiveness of pre-processing techniques and the impact of depthwise separable convolution on EEG-based vision transformers (ViTs) in a pretrained model architecture. The EEGEyeNet dataset, which includes extensive EEG and eye-tracking data, is used to assess the performance of the proposed model.

The hybrid vision transformer (ViT) has shown potential in gaze prediction, challenging conventional convolution-based approaches. The integration of ViTs with EEG-based gaze prediction marks a significant advancement, utilizing deep learning to interpret the complexities of brain data. The research explores the synergy between CNNs and transformers, demonstrating that combining CNNs for local feature extraction with transformers for global dependency modeling can enhance the accuracy and generalization of EEG-based applications. The study also highlights the potential of transformers in EEG signal analysis, extending their application beyond gaze prediction to tasks like epileptic seizure prediction.

The proposed model, EEG-DCViT, integrates data clustering with depthwise separable convolution to improve the model's ability to recognize underlying patterns in EEG data.
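The summary does not specify which clustering algorithm underlies the pre-processing step. As a hedged illustration only, the sketch below uses plain k-means over 2-D target positions (one plausible way to group trials before specialized training); the paper's actual procedure may differ.

```python
import random

def kmeans_2d(points, k, iters=50, seed=0):
    """Minimal k-means over 2-D points, pure Python.
    Illustrative only: groups (x, y) targets into k clusters;
    not the paper's documented pre-processing pipeline."""
    rng = random.Random(seed)
    centers = list(rng.sample(points, k))

    def nearest(p):
        # Index of the closest center (squared Euclidean distance).
        return min(range(k),
                   key=lambda j: (p[0] - centers[j][0]) ** 2
                               + (p[1] - centers[j][1]) ** 2)

    for _ in range(iters):
        labels = [nearest(p) for p in points]
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a cluster empties out
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return labels, centers
```

Once trials are grouped this way, a model can be trained or fine-tuned per cluster of nearby gaze targets, which is one reading of the "specialized training" the study describes.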
The model's performance is evaluated using RMSE, with the best result achieved by combining both techniques, resulting in an RMSE of 51.6 mm. The study also discusses the computational complexity of the model, noting that the addition of depthwise separable convolution does not significantly increase training time or memory usage. The results indicate that specialized training involving data clustering and DS-CNNs can significantly improve the accuracy of deep learning models in estimating absolute positions from EEG data. The study also provides insights into the model's performance, including the identification of challenging eye positions and the potential for future improvements in model accuracy. The findings contribute to the advancement of EEG-based brain-computer interfaces and machine learning, offering new insights into the interpretation of complex neural data.
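The key building block named above, depthwise separable convolution, factors a standard convolution into a per-channel (depthwise) filter followed by a 1x1 (pointwise) channel mix. A minimal pure-Python 1-D sketch, with an illustrative parameter count (128 channels is an assumption for EEG-like input, not a figure from the summary):

```python
def depthwise_separable_conv1d(x, depth_kernels, point_weights):
    """x: list of C channels, each a list of T samples.
    depth_kernels: one length-K kernel per channel (depthwise stage).
    point_weights: F filters, each a list of C mixing weights (pointwise stage).
    Returns F output channels (valid convolution, no padding, no bias)."""
    C, K = len(x), len(depth_kernels[0])
    T_out = len(x[0]) - K + 1
    # Depthwise stage: each channel is filtered independently.
    depth_out = [
        [sum(x[c][t + k] * depth_kernels[c][k] for k in range(K))
         for t in range(T_out)]
        for c in range(C)
    ]
    # Pointwise (1x1) stage: mix channels at every time step.
    return [
        [sum(w[c] * depth_out[c][t] for c in range(C)) for t in range(T_out)]
        for w in point_weights
    ]

# Bias-free parameter counts for C in-channels, F filters, kernel size K:
def standard_params(C, F, K):   return C * F * K
def separable_params(C, F, K):  return C * K + C * F

# Assumed illustrative shapes (e.g. 128 EEG channels, 64 filters, K = 9):
print(standard_params(128, 64, 9), separable_params(128, 64, 9))
# → 73728 9344, roughly an 8x reduction in parameters
```

This parameter reduction is consistent with the study's observation that adding depthwise separable convolution does not significantly increase training time or memory usage.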
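The 51.6 mm figure is an RMSE over predicted gaze positions. One common formulation for 2-D gaze targets (assumed here; the benchmark's exact definition may differ) is the root mean square of Euclidean prediction errors:

```python
from math import sqrt

def gaze_rmse(pred, true):
    """Root-mean-square Euclidean error between predicted and true
    2-D gaze positions, in the same units as the inputs (e.g. mm)."""
    assert len(pred) == len(true) and pred, "need equal, non-empty lists"
    sq_errors = [(p[0] - t[0]) ** 2 + (p[1] - t[1]) ** 2
                 for p, t in zip(pred, true)]
    return sqrt(sum(sq_errors) / len(sq_errors))

# A prediction off by the 3-4-5 triangle and a perfect one:
print(gaze_rmse([(3, 4), (0, 0)], [(0, 0), (0, 0)]))  # → sqrt(12.5) ≈ 3.536
```

Under this formulation, an RMSE of 51.6 mm means predictions deviate from true screen positions by about 5.2 cm on a root-mean-square basis.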