
Investigating Hybrid Deep Learning Architectures for Speech Envelope Reconstruction from EEG

Gottipalli, U. S.; Jha, A.; Miyapuram, K. P.

2026-05-27 neuroscience
10.64898/2026.05.24.727471 bioRxiv

Reconstructing speech envelopes from electroencephalography (EEG) signals is a challenging but valuable task for brain-computer interfaces (BCIs), with applications in assistive communication for individuals with speech impairments. While deep learning has improved reconstruction accuracy, most existing approaches are restricted to single-layer architectures such as convolutional neural networks (CNNs). This limits their ability to capture the full complexity of spatio-temporal and structural EEG patterns. In this work, we systematically extend the VLAAI framework by evaluating 26 architectures that integrate CNNs, long short-term memory networks (LSTMs), and graph convolutional networks (GCNs) in both single-layer and hybrid configurations. Experiments on the 64-channel SparrKULee dataset demonstrate that CNNs remain the strongest standalone models, but hybrid designs, particularly CNN-LSTM and CNN-GCN-LSTM, achieve competitive or superior performance. These results highlight the importance of combining spatial, temporal, and graph-based processing, and provide practical guidelines for hybrid architecture design. Our study offers the first large-scale comparative analysis of hybrid models for EEG-based speech envelope reconstruction, advancing robust BCI systems for non-invasive speech decoding.

Matching journals

The top 1 journal accounts for 50% of the predicted probability mass.