A retrospective public external benchmark of healthy-to-stroke lower-limb EEG transport identifies constraints from source construction, adaptation burden, and confound sensitivity
Choi, D.; Choi, A.; Lam, Q.; Park, J.
Background: Lower-limb EEG is a rehabilitation-facing control signal for stroke neurorehabilitation and future non-invasive brain-spine interfaces, but a public external benchmark that jointly audits source construction, minimal adaptation burden, and confound sensitivity is lacking. We therefore tested whether lower-limb effort-versus-rest decoders trained on healthy public EEG transport to a stroke target domain.
Methods: We conducted a retrospective public-data external benchmark using three public EEG datasets harmonised to a common lower-limb effort-versus-rest target. Classical and deep models were compared under zero-shot transport, 10-shot temperature calibration, and 10-shot fine-tuning. For few-shot analyses, each target participant contributed a trial-disjoint subject-internal support set of 10 labelled trials per class and a held-out remainder test set. Prespecified analyses audited source construction, support-resampling sensitivity, and montage controls. Uncertainty was summarised with participant-level bootstrap confidence intervals.
Results: Within this benchmark, healthy-to-stroke zero-shot transport was weak. The best zero-shot result was classical rather than deep, with CSP+LDA reaching area under the receiver operating characteristic curve (AUROC) 0.603, whereas EEGNet remained near chance (AUROC 0.527). Ten-shot calibration improved operating behaviour more than discrimination: for CSP+LDA, expected calibration error fell from 0.267 to 0.035 and specificity increased from 0.180 to 0.485, whereas AUROC remained essentially unchanged (0.603 to 0.604). Ten-shot fine-tuning produced only modest gains; the best overall AUROC was 0.605 for pooled dataset-balanced CSP+LDA, numerically tied with pooled raw CSP+LDA (0.605). MILimbEEG-only source training was consistently weak, exploratory deep domain-generalisation variants did not rescue transport, and frontal and temporal montage controls remained relatively competitive.
Conclusions: Within this public benchmark, source construction and minimal adaptation burden mattered more than model novelty, and retrospective montage controls limited motor-specific interpretation. The results support harmonised prospective validation of lower-limb EEG transport over further retrospective model iteration.
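The few-shot calibration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline, whose exact procedure the abstract does not specify. The sketch assumes a binary decoder that emits one logit per trial, fits a single temperature by grid search on a support set of 10 labelled trials per class, and scores an equal-width 10-bin expected calibration error (ECE) on the held-out remainder. The data are synthetic, and every function name (`fit_temperature`, `expected_calibration_error`, `auroc`) is hypothetical.

```python
import numpy as np

def sigmoid(z):
    # Clip to avoid overflow in exp for extreme scaled logits.
    return 1.0 / (1.0 + np.exp(-np.clip(z, -500.0, 500.0)))

def binary_nll(logits, labels, temperature):
    # Mean negative log-likelihood of binary labels under temperature-scaled logits.
    p = np.clip(sigmoid(logits / temperature), 1e-12, 1.0 - 1e-12)
    return float(-np.mean(labels * np.log(p) + (1 - labels) * np.log(1.0 - p)))

def fit_temperature(logits, labels, grid=None):
    # Assumption: a simple grid search over T in [0.01, 100]; other optimisers work too.
    if grid is None:
        grid = np.logspace(-2, 2, 401)
    losses = [binary_nll(logits, labels, t) for t in grid]
    return float(grid[int(np.argmin(losses))])

def expected_calibration_error(probs, labels, n_bins=10):
    # Binary ECE on the positive-class probability with equal-width bins.
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(labels[mask].mean() - probs[mask].mean())
    return float(ece)

def auroc(scores, labels):
    # Mann-Whitney formulation: P(score_pos > score_neg) with ties counted as 0.5.
    pos, neg = scores[labels == 1], scores[labels == 0]
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / (pos.size * neg.size))

# Synthetic, overconfident logits for one hypothetical target participant.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 200)
logits = 5.0 * (0.3 * (2 * labels - 1) + rng.normal(size=labels.size))

# Trial-disjoint split: 10 support trials per class, the remainder held out.
support_idx = np.concatenate(
    [rng.choice(np.flatnonzero(labels == c), size=10, replace=False) for c in (0, 1)]
)
test_mask = np.ones(labels.size, dtype=bool)
test_mask[support_idx] = False

T = fit_temperature(logits[support_idx], labels[support_idx])
y_test = labels[test_mask]
p_before = sigmoid(logits[test_mask])
p_after = sigmoid(logits[test_mask] / T)

print("fitted temperature:", T)
print("ECE before / after:", expected_calibration_error(p_before, y_test),
      expected_calibration_error(p_after, y_test))
print("AUROC before / after:", auroc(logits[test_mask], y_test),
      auroc(logits[test_mask] / T, y_test))
```

Dividing logits by a positive temperature preserves their ordering, so a ranking metric such as AUROC is unaffected by construction while calibration can change. That is in line with the near-identical AUROC reported before and after calibration, although any additional steps in the authors' procedure (for example, a threshold or bias adjustment) are not stated in the abstract and are not modelled here.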