6 Aug 2024 | Matthew L Key*, Tural Mehtiyev*, and Xiaodong Qu
This study evaluates the effectiveness of pre-processing techniques and the impact of depthwise separable convolution on EEG vision transformers (ViTs) in a pre-trained model architecture for gaze prediction. The authors introduce the EEG Deeper Clustered Vision Transformer (EEG-DCViT), which combines depthwise separable convolutional neural networks (DS-CNNs) with vision transformers and enhances pre-processing through data clustering. The new approach demonstrates superior performance, achieving a Root Mean Square Error (RMSE) of 51.6 mm, setting a new benchmark.
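To make the reported metric concrete, the following is a minimal sketch of one common way to compute RMSE over predicted and true 2D gaze positions. The summary does not state the exact formula the authors use, so averaging the squared error over both coordinates is an assumption, and the function name `rmse_2d` and the sample points are purely illustrative.

```python
import math


def rmse_2d(predicted, actual):
    """Root mean square error over all x and y coordinates.

    Assumption: squared errors are averaged over every coordinate
    (2 per point), which is one common convention; the source does
    not confirm which convention the paper uses.
    """
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")

    squared_error = 0.0
    for (px, py), (ax, ay) in zip(predicted, actual):
        squared_error += (px - ax) ** 2 + (py - ay) ** 2

    # Two coordinates contribute per point.
    return math.sqrt(squared_error / (2 * len(predicted)))


# Hypothetical positions in millimetres, not data from the study.
example_predicted = [(0.0, 0.0), (3.0, 4.0)]
example_actual = [(0.0, 0.0), (0.0, 0.0)]
example_error = rmse_2d(example_predicted, example_actual)
```

With positions expressed in millimetres, the result is also in millimetres, which is how a figure such as 51.6 mm would be read.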
The study addresses two key research questions: the influence of depthwise separable convolution on predictive accuracy and the impact of advanced pre-processing techniques on model accuracy. The results highlight the importance of both pre-processing and model refinement in enhancing EEG-based applications, particularly in gaze prediction. The study also discusses computational complexity, test error visualization, and the performance of the EEG-ViT model, providing insights into the model's strengths and areas for improvement. The findings suggest that specialized training involving data clustering and DS-CNNs can significantly improve the accuracy of deep learning models in estimating absolute positions from EEG data.
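As background for the computational complexity discussion, the sketch below compares the weight count of a standard 2D convolution with that of a depthwise separable convolution (a per-channel depthwise filter followed by a 1x1 pointwise convolution). The layer sizes are hypothetical and are not taken from EEG-DCViT; bias terms are ignored for simplicity.

```python
def standard_conv_params(c_in, c_out, k_h, k_w):
    """Weights in a standard convolution: one k_h x k_w filter
    per (input channel, output channel) pair."""
    return k_h * k_w * c_in * c_out


def depthwise_separable_params(c_in, c_out, k_h, k_w):
    """Weights in a depthwise separable convolution:
    one k_h x k_w filter per input channel (depthwise),
    plus a 1x1 convolution mixing channels (pointwise)."""
    depthwise = k_h * k_w * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer: 64 input channels, 128 output channels, 3x3 kernel.
standard = standard_conv_params(64, 128, 3, 3)
separable = depthwise_separable_params(64, 128, 3, 3)
ratio = separable / standard  # equals 1/c_out + 1/(k_h * k_w)
```

For this illustrative layer the separable form needs 8,768 weights against 73,728 for the standard form, roughly 12% as many. Whether a comparable saving applies to the layers in EEG-DCViT depends on the actual architecture, which this summary does not specify.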