Neurocomputing — Latest Matching Preprints

1

Equilibrium Propagation with Predictive Learning in Leaky Integrate-and-Fire Spiking Neural Networks

Kubo, Y.

2026-05-21 neuroscience 10.64898/2026.05.19.726261 medRxiv

Top 0.1%

2.8%

Show abstract

Equilibrium propagation (EP) is a biologically plausible alternative to backpropagation that has demonstrated competitive performance across a range of machine learning tasks. Recent work has extended EP to spiking neural networks (SNNs), leveraging leaky integrate-and-fire (LIF) neurons and spike-based plasticity rules to improve biological realism while maintaining strong performance. In this work, we propose an EP-based SNN framework that combines LIF neural dynamics with a predictive learning rule, replacing conventional spike-timing-dependent plasticity (STDP) with a learning rule more directly aligned with predictive coding principles. We evaluate the proposed model on multiple image classification benchmarks, including MNIST, KMNIST, and Fashion-MNIST, and compare its performance with a BP-trained LIF SNN baseline. Our results show that the proposed EP-based LIF model (EP+LIF) achieves competitive accuracy across datasets, with performance approaching that of the BP-trained counterpart (BP+LIF) while relying on a biologically motivated local learning rule. In addition, analysis of hidden-layer spiking activity reveals that EP+LIF produces more persistent hidden-state activity, whereas BP+LIF yields sparser spiking representations. These results demonstrate that predictive learning can support effective EP-based training in LIF spiking networks, while also highlighting differences in activity patterns that motivate future work on activity regulation and sparse spiking dynamics.

2

A brain-inspired framework for memory prioritization in neural networks based on valence

Zbaranska, S.; Rajeev, A.; Josselyn, S.; Laschowski, B.

2026-05-08 neuroscience 10.64898/2026.05.05.723022 medRxiv

Top 0.1%

2.5%

Show abstract

Improving long-term memory in artificial neural networks remains an open challenge. To address this, we developed a novel brain-inspired framework for memory prioritization based on the principle of emotional valence. Our framework includes: (i) a valence-weighted cross-entropy loss that scales the learning signal by the valence magnitude, analogous to neuromodulation; (ii) an amygdala-inspired module that learns high-dimensional valence embeddings; and (iii) a hippocampus-inspired module that integrates valence embeddings into the attention mechanism to modulate information retrieval. We demonstrated the generalization of our framework across spatial, episodic, and language-based memory tasks, consistently improving memory prioritization and long-term retention of high-salience information. In addition to improving long-term memory, we also showed that our framework can help mitigate the "lost-in-the-middle" problem in language modeling. More generally, this research provides further evidence of the potential of brain-inspired algorithms to advance the field of machine learning.

3

Geometric Kinematics of Human Eyes

Turski, J.

2026-05-10 neuroscience 10.64898/2026.04.10.716809 medRxiv

Top 0.1%

2.5%

Show abstract

In previous studies by the author on binocular vision with the asymmetric eye (AE), which models a healthy human eye with misaligned optical components, the results were primarily presented in the Rodrigues vector (RV) framework and supported by simulations and 3D visualizations in GeoGebras dynamic geometry environment. In this paper, the novel geometric kinematics of the human eye, that is, the eye with misaligned optics, and simplified assumptions about the eye rotations (the eyes translational movements are disregarded), are developed within the framework of rigid-body rotations. The originality of the analysis lies in a precise geometric decomposition of a full rotation of the eyes posture into a torsion-free rotation (the geodesic part) and a torsional rotation (the non-geodesic extension of the geodesic part). This decomposition is extended to the corresponding decomposition of the angular velocity. A novel derivation of the eyes angular velocity from the RV formulation of the eye kinematics is proposed.

4

Automatic Bevacizumab Response Prediction in Ovarian Cancer from Digital Pathology Images via Novel AI-based Computational Pipeline

Alsaiari, A.; Turki, T.; Taguchi, Y.-h.

2026-05-04 bioinformatics 10.64898/2026.04.29.721782 medRxiv

Top 0.1%

2.1%

Show abstract

Ovarian cancer is one of the gynecological cancer types, which, if metastasized and not detected early, can cause deaths among women. Therefore, there is a need to accurately predict drug responses to ovarian cancer. A gynecological pathologist inspects abnormality in tissues, followed by providing a report about patients; however, such a diagnostic process is (1) hard; (2) requires experience; and (3) time consuming. Moreover, existing tools are far from perfect. Hence, we present a computational pipeline to improve predicting drug response pertaining to ovarian cancer, derived as follows. First, we download digital pathology images pertaining to ovarian bevacizumab response from the cancer imaging archive repository. We employed histogram of oriented gradients to images, constructing feature vectors, provided to Fisher linear discriminant analysis to change the representation through dimensionality reduction. Then, we provide reduced-dimensionality data for regression analysis through support vector regression coupled with various kernels and calculating the area under the ROC curve (AUC). Experimental results against transformer-based models (ViT and Swin) and other deep learning (DL) models (VGG16, ResNet50, InceptionV3, MobileNetV2, and EfficientNetB6) demonstrate that our approach with radial kernel (named SVRD+R) yielded an AUC performance improvements of 17% against the best-performing transformer-based model (ViT) while obtaining an AUC performance improvements of 14.9% when compared against the best DL-based model (MobileNetV2). These results demonstrate the superiority and feasibility of our AI-based pipeline when tackling prediction problems pertaining to gynecologic cancer studies. MSC92B05; 68T09

5

A continuum of asynchronous states in cerebral cortex networks, and how they determine responsiveness

Bassat, M.; Tesler, F.; Destexhe, A.

2026-05-09 neuroscience 10.64898/2026.05.06.723408 medRxiv

Top 0.3%

1.5%

Show abstract

The awake brain is known to display asynchronous (AS) states during periods of attention and arousal, but the responsiveness properties of such states remain unclear. Here, we investigate this question using computational models of spiking networks of excitatory and inhibitory neurons, mimicking recurrently-connected networks in layer 2/3 of the cerebral cortex. The networks can generate a continuum of AS states, but with different responsiveness characteristics. By using a mean-field model to infer the dynamic properties of the system, we find that there are two families of AS states, which we call "underdamped" (UD) and "overdamped" (OD). Responsiveness is maximised at the transition between OD and UD states, which identifies a "working point" that may present advantageous computational properties.

6

Failure detection in medical image classification under realistic distribution shifts: A large-scale benchmark

Steinmetz, P.; Frouin, F.; Morard, V.; Buvat, I.

2026-05-05 radiology and imaging 10.64898/2026.05.04.26350496 medRxiv

Top 0.3%

1.3%

Show abstract

Medical images (MI) exhibit variability due to different acquisition protocols, devices, and patient populations, making failure detection at inference time essential for reliable deployment of clinical classifiers. As existing evaluations of failure detection methods use different settings, it is difficult to compare results and identify the best strategy, if any. We present a comprehensive benchmark of eight confidence scoring functions and two score-aggregation strategies across eight MI tasks spanning diverse modalities, backbone architectures, training setups, and failure sources. The confidence ranking ability and classification error mitigation are jointly evaluated. While no single method systematically dominated across settings, aggregation of confidence scores consistently matched or approached the best individual method and substantially reduced silent failure rate. The failure detection performance was strongly correlated with classifier accuracy for all tested settings. These findings provide large-scale evidence regarding the strengths and limitations of confidence scoring strategies and offer actionable guidance for mitigating silent failures under realistic distribution shifts in MI.

7

A biologically-grounded cerebellar spiking network model with realistic synaptic transmission captures complex circuit dynamics.

De Grazia, M.; Benozzo, D.; Rodarie, D.; Marchetti, F.; D'Angelo, E.; Casellato, C.

2026-05-14 neuroscience 10.64898/2026.05.12.724100 medRxiv

Top 0.4%

1.0%

Show abstract

Cerebellar neural circuit dynamics rely on a rich repertoire of synaptic and excitability mechanisms, which are thought to determine network computation in physiological and pathological conditions. In this work, we develop and validate a biologically-grounded spiking neural network of the cerebellar cortex, embedding key mechanisms of cellular excitability and synaptic transmission, and assess their impact on signal processing. Neuronal input-output functions, short-term synaptic plasticity, receptor-specific kinetics, and NMDA channel voltage-dependent gating were calibrated against detailed multicompartmental models through automatic tuning procedures. Incorporating these realistic biological properties allowed the network model to simulate key features observed in recordings from acute cerebellar slices. The neuronal discharge and local field potentials elicited by mossy fiber stimulation faithfully reproduced the natural patterns with millisecond precision. Then, selective receptor switch-off revealed the contribution of NMDA, GABA, and AMPA receptors to the frequency-dependent input-output function of the granular layer and Purkinje cells, linking previous empirical findings to specific synaptic mechanisms. This model combines high computational performance with biological realism and offers a computationally efficient framework to investigate neurophysiological phenomena and the neural correlates of behavior in large-scale long-lasting simulations, such as those needed to address the neural underpinnings of learning and of cerebellar pathologies.

8

DISCERN: A Clinical Impact-aware Framework for Radiology Report Comparison

Sharma, R.; Beeche, C.; Dong, J.; Zhuang, R.; Qu, H.; Zhang, R.; Gangaram, V.; Goswami, P.; Xin, J.; Ballard, J.; Goldberg, A.; Sagreiya, H.; Long, Q.; Chen, T.; Witschey, W. R.

2026-05-27 radiology and imaging 10.64898/2026.05.26.26353612 medRxiv

Top 0.4%

1.0%

Show abstract

The surge in medical imaging has spurred the development of vision-language models (VLMs) to alleviate radiologist workloads. However, clinical deployment is hindered by the lack of meaningful evaluation frameworks. Current metrics - ranging from semantic similarity to large language model (LLM) based judges - often fail to distinguish between clinically trivial and critical discrepancies, poorly reflecting real-world clinical judgment. To address this, we introduce DISCERN (Discordance and Significance-aware Entity-level Radiology Report Comparison). DISCERN is a significance-aware framework that weighs report errors based on their potential impact on patient care. Our results demonstrate that DISCERN powered by closed source LLMs aligns more closely with expert radiologist assessments than traditional metrics or current LLM evaluators, providing a more interpretable and clinically relevant benchmark. By modeling radiologist prioritization and entity-level feedback, DISCERN facilitates targeted model refinement and ensures the safer integration of generative AI into clinical workflows.

9

Dual-Stream Compression of High Bit-Depth Medical Images with Application to DNA Storage

Su, H.; Fan, W.; Peng, J.; Zhang, Y.

2026-05-20 bioinformatics 10.64898/2026.05.17.724501 medRxiv

Top 0.4%

1.0%

Show abstract

High bit-depth medical images preserve subtle intensity variations that are important for quantitative analysis and clinical interpretation, but their large dynamic range poses challenges for efficient compression. We propose a bit-plane-aware dual-stream compression framework for 16-bit medical images by separately modeling the most significant bit (MSB) and least significant bit (LSB) components. The MSB structural stream is encoded using JPEG coding with a Duplicate Segment Skipping (DSS) strategy to exploit spatial and segment-level redundancy, while the LSB detail stream is compressed using learned image compression to represent residual variations and fine-grained details. Experiments on four MRI and CT datasets show that the proposed method consistently outperforms representative traditional and learning-based codecs, achieving the lowest bit rate across all datasets. Meanwhile, it preserves high reconstruction fidelity. As a downstream application, we further demonstrate that the compressed bitstreams can be effectively integrated with DNA encoding and converted into sequences with favorable biochemical properties.

10

A Hybrid Quantum-Classical Multiscale LSTM Framework for Subject-Level EEG-Based Depression Detection

E, S.; Wang, C.; Rao, T. D.; Kumar, T. S.

2026-05-20 psychiatry and clinical psychology 10.64898/2026.05.18.26353461 medRxiv

Top 0.4%

0.9%

Show abstract

Major depressive disorder (MDD) is a common psychiatric disorder that requires reliable and objective assessment for early clinical intervention. Electroencephalography (EEG) is widely used for this purpose because it provides a non-invasive and low-cost measure of brain activity with high temporal resolution. However, EEG-based depression detection remains challenging due to the nonlinear nature of EEG signals, inter-subject variability, and the limited availability of subject-independent evaluation. To address these issues, this paper proposes a hybrid quantum-classical multiscale long short-term memory with parameterized quantum circuit branches (MS-LSTM-PQC) framework for subject-level EEG-based depression detection. The proposed model extracts temporal representations at multiple scales using parallel LSTM branches and incorporates eyes-closed (EC) and eyes-open (EO) condition information through condition-aware feature fusion. To further enhance the learned representations, scale-specific LSTM features are processed using PQC-based quantum branches implemented with TensorFlow Quantum (TFQ), providing an additional nonlinear feature transformation before classification. Experiments were conducted on the Mumtaz EEG depression dataset using EC-only, EO-only, and merged EC+EO conditions with 1-s, 2-s, and 3-s EEG windows. To reduce subject-level data leakage, all experiments were evaluated using 5-fold and 10-fold GroupKFold validation. The best overall accuracies across the evaluated settings were 92.05% and 95.08% under 5-fold and 10-fold GroupKFold validation, respectively. The 2-s merged EC+EO setting provided the most stable performance across validation protocols. In addition, Integrated Gradients (IG)-based explainability analysis showed that frontal and fronto-central channels, especially Fz, showed higher contributions to the model decision. These results suggest that multiscale temporal learning with quantum-enhanced feature transformation can support subject-level EEG-based depression detection under leakage-controlled evaluation.

11

Anatomy-Guided 3D Graph Networks for Couinaud Segmentation in Tumor Affected Livers

You, L.; Dang, H.; Wang, H.; Matta, E.; zhou, X.

2026-05-14 bioinformatics 10.64898/2026.05.11.724316 medRxiv

Top 0.4%

0.9%

Show abstract

Image-based liver Couinaud segmentation is designed to automatically provide the locations of suspicious objects in liver CT/MR images. Once achieved, the physicians will be guided to the target slice and area where the suspicious node is located. However, conventional algorithms trained primarily on healthy liver images often fail to generalize to Hepatocellular Carcinoma (HCC) cases due to pathological structural distortions. In this work, we propose a robust two-stage framework that integrates a 3D Unet with a 3D Anatomical Structure-Guided Graph Convolutional Network (3D GCN). This two-stage strategy effectively isolates the liver volume to eliminate structural noise from neighboring organs, such as the spleen, allowing the framework to focus exclusively on the complex 3D anatomical relationships among the eight segments. To ensure the topological consistency required for global spatial reasoning, we implement a standardized preprocessing pipeline that normalizes liver-only volumes to exactly 50 frames along the z-axis. By combining a lightweight 3D UNet backbone with the 3D GCN for refined boundary reasoning, our model demonstrates superior generalization performance on unseen clinical datasets, achieving a mean Dice score of 0.828 in blind testing. By releasing our code and pretrained weights, we aim to provide the first publicly available deep learning resource for robust Couinaud segmentation.

12

A Competitive Framework for Modeling EEG Microstate Durations

GOMEZ, C. M.; Angulo Ruiz, B. Y.

2026-05-22 neuroscience 10.64898/2026.05.20.726605 medRxiv

Top 0.5%

0.8%

Show abstract

BackgroundThis study examines a competition-based model (C-model) designed to capture the temporal dynamics of successive brain microstates derived from electroencephalography (EEG) recordings during eyes-open conditions. The analyzed data were obtained from a public repository comprising microstate sequences from 60 sessions of a single subject [1]. When applied to microstate dynamics, the C-model posits a stochastic competition among neural circuits underlying the expression of individual microstates. MethodsThe model is formulated at a conceptual level (computational level in Marrs framework) and employs a geometric distribution to account for the long right tail of microstate duration distributions, interpreted as the probability of "failure" of the currently active microstate to persist. To account for the short-lived left tail, the model incorporates a transient increase in the stability of the currently active network, or equivalently, a temporary decrease in the activation probability of competing microstates (refractory period). ResultsThe model provides a good fit to the microstate duration distributions across all 60 sessions. One third of sessions showed microstate identity sequential dependency with respect to the previous microstates. DiscussionThese results suggest that the C-model captures key aspects of microstate temporal structure. Moreover, because microstate probabilities can be modulated by psychophysiological conditions--including the influence of previously active networks--the model may serve as a building block for more comprehensive neurobiological frameworks of neural and behavioral dynamics. In such frameworks, microstate sequences could emerge from structured competition and flow among neural networks supporting microstate expression.

13

Sex-related differences in healthy aging: changes in neuroelectric brain activity reconstructed from resting-state MEG

Ustinin, M.; Boyko, A.; Rykunov, S.

2026-05-11 neuroscience 10.64898/2026.05.06.723197 medRxiv

Top 0.5%

0.8%

Show abstract

Sex-related differences in the aging of the human brain were studied using large array of experimental data. The open archive CamCan was used as a source of data: the magnetic encephalograms, co-registered with magnetic resonance images of the head, were obtained for each of 434 subjects (ages 18-87 years, mean age 54.7 {+/-}18.4): 217 females (ages 18-87 years, mean age 54.5 {+/-}18.4) and 217 males (ages 18-84 years, mean age 54.8 {+/-}18.3). Recordings were split in 10-year age cohorts, each cohort consisted of equal number of men and women to calculate average intersex characteristics correctly. By massively solving the inverse problem, functional tomograms were calculated - the spatial distribution of elementary spectral components. Physiological noise was eliminated by joint analysis of MEG-based functional tomogram and magnetic resonance image for each subject. Then multichannel spectra were transformed into time series of the power of elementary current dipoles. Summary electric powers were calculated in six conventional frequency bands (1-4 Hz - delta; 4-8 Hz - theta; 8-13 Hz - alpha; 13-21 Hz - beta1; 21-30 Hz - beta2; 30-48 Hz - gamma), and sex differences in age-related changes were examined. It was found that in the youngest age cohort (18-29 years) the summary electrical power of the brain for males is 1.5 times greater than such power for females. For adults (30-69 years), male and female powers are approximately equal, while in older cohorts (70-87 years), male total brain power is greater. Age dependencies in various frequency bands are generally different for men and women, excluding higher frequencies 21-48 Hz. Basic conclusion can be made that after intersex averaging total electric power of the human brain is invariant through the lifespan from 18 to 87 years. The proposed method of joint MEG and MRI analysis can be used for further study of the sex-related details of brain sources in their connection with age changes.

14

Dual-view Guided Context-aware Network for Automated Bone Lesion Segmentation and Quantification in Whole-body SPECT

chen, w.; Yang, X.; Lu, J.; Miao, M.; Huang, Y.; Zheng, S.; Zhang, C.; Xie, L.; Zhang, Y.

2026-05-12 bioinformatics 10.64898/2026.05.07.723665 medRxiv

Top 0.6%

0.8%

Show abstract

Whole-body SPECT bone scintigraphy reflects skeletal metabolic activity throughout the body and plays an indispensable role in the screening, treatment evaluation, and prognostic assessment of bone metastases in tumors. However, the automatic detection and segmentation of hypermetabolic bone lesions remain challenging due to low contrast, limited spatial resolution, and complex lesion distributions. In this study, we proposed Bone-Segnet, a dual-view guided automatic segmentation network for hypermetabolic bone lesions that integrated multi-scale feature modeling, global context modeling, and view-conditioned modulation. Pixel-level annotated anterior and posterior whole-body bone scintigraphy images were used for model training and prediction. The proposed network enhanced the recognition of low-contrast and small-scale lesions through small-lesion enhancement and multi-scale contextual modeling. A Transformer module was further introduced to strengthen global feature representation, while cross-view collaborative modeling was achieved by incorporating the complementary characteristics of anterior and posterior imaging. Experimental results demonstrated that the proposed method outperformed existing approaches across multiple evaluation metrics, with the Dice score improving from 0.7440 to 0.8750, indicating a substantial improvement in segmentation performance. Further quantitative analysis based on the segmentation results revealed significant differences among disease types in lesion count, pixel burden, and spatial distribution patterns, reflecting the heterogeneity of disease-related skeletal metabolic activity. Overall, the proposed method improved automatic lesion segmentation performance and enabled quantitative analysis of lesion burden and spatial distribution patterns, providing objective data support for the assessment of related diseases. Index Terms--Whole-body SPECT, bone lesion segmentation, dual-view modeling, quantitative analysis.

15

SeGA-GNN: Semantically Gated Augmented Graph Neural Networks for Wearable-Based Emotion Detection

Kurt, F.; Subasi, S. N.; Yakisan, E. S.; Subasi, A.

2026-06-01 health informatics 10.64898/2026.05.29.26354434 medRxiv

Top 0.6%

0.7%

Show abstract

Background: Wearable technologies enable scalable and continuous monitoring of emotional states through passive sensing of physiological and behavioral signals. However, conventional learning approaches often struggle to model the complex temporal, contextual, and relational dependencies underlying human emotions. To address these limitations, we propose a graph-based framework that represents multimodal wearable observations as heterogeneous knowledge graphs enriched with semantic information derived from Large Language Models (LLMs), enabling richer contextual understanding beyond raw sensor measurements. Methods: We constructed a heterogeneous knowledge graph using multimodal Fitbit physiological signals and affective self-report data collected from 45 users. Framing mood prediction and emotion detection was formulated as both binary and ternary node classification tasks. We evaluated five baseline heterogeneous Graph Neural Network (GNN) architectures and compared them with the proposed Semantically Gated Augmented Graph Neural Network (SeGA-GNN) framework, which dynamically integrates LLM-generated semantic embeddings into graph representations through a gated cross-modal fusion mechanism. Results: The baseline GNN models achieved strong performance, with classification accuracies ranging from 0.7525 to 0.9739 for binary classification and 0.6249 to 0.9699 for ternary classification. The proposed SeGA framework consistently improved predictive performance across most architectures. In particular, semantic augmentation transformed the HAN model from moderate baseline performance into near-perfect emotion recognition capability, achieving SeGA-HAN Accuracy = 0.9988 and AUC = 1.0000 for binary classification and Accuracy = 0.9979 and AUC = 1.0000 for ternary classification. Discussion and Conclusion: Integrating LLM-derived semantic contextualization into heterogeneous graph learning enables effective modeling of contextual information that is not directly captured by wearable physiological signals alone. The proposed SeGA-GNN framework demonstrates that adaptive semantic fusion substantially improves the accuracy, robustness, and interpretability of wearable-based emotion detection. These findings establish a promising direction for next-generation wearable affective computing systems and intelligent emotion-aware applications.

16

Multi-Agent AI for Chest Radiography: A Sequential Segmentation and LLM-Driven Consultative Tool for Medical Training

Kurt, F.; Subasi, A.

2026-06-01 health informatics 10.64898/2026.05.29.26354432 medRxiv

Top 0.6%

0.7%

Show abstract

Background: Traditional diagnostic models lack explainability, while multimodal language models prone to hallucination remain unsafe for medical education. An interactive, risk-free artificial intelligence framework is required to serve as a reliable clinical mentor for radiology trainees. Methods: We propose a multi-agent architecture decoupling deterministic image analysis from generative consultation. Specialized computer vision models perform anatomical localization and pathological segmentation. These quantitative outputs are synthesized into a structured payload, which grounds a locally hosted large language model (LLaVA 7B) using strict prompt guardrails and prerequisite protocols. Results: The system effectively eliminates visual hallucinations by intercepting unanchored queries. The artificial intelligence tutor successfully contextualizes spatial anomalies and baseline metrics, generating accurate conversational explanations and formally structured radiology reports while strictly enforcing medical safety disclaimers. Discussion and Conclusion: By anchoring language generation exclusively to verified algorithmic realities, this framework transforms opaque diagnostic models into safe, interactive educational simulators. This establishes a highly reliable paradigm for integrating explainable artificial intelligence into medical training.

17

An Information-Theoretic Analysis of Category Maps and Target Preservation

Dahl, C. D.

2026-05-05 neuroscience 10.64898/2026.05.01.722196 medRxiv

Top 0.6%

0.7%

Show abstract

Categorisation is often treated as a form of compression: a high-dimensional stimulus space is reduced to a smaller set of behaviourally or cognitively useful classes. However, compression alone does not determine whether a category map is useful. The present manuscript develops an information-theoretic framework for evaluating categorisation in terms of both category complexity and target-relevant information preservation. Across a set of synthetic demonstrations, alternative category maps over the same stimulus space are shown to preserve different target variables, including identity, action, nuisance, and hierarchical category structure. The framework is then extended to learned visual representations by analysing layer-derived category maps from a pretrained ResNet-50 network applied to CIFAR-10 images. Two scenarios are compared: a clean-only object run and a pooled nuisance run containing clean, blurred, pixelated, and noise-perturbed images. The results show that category maps can have substantial entropy while preserving information about a variable that is not aligned with the specified target, and that the value of a categorisation depends on the target variable to be preserved. The manuscript argues that categorisation should therefore be evaluated not only by compression or separability, but by the information retained about a specified cognitive, behavioural, or computational target.

18

Classification of Smartphone Interaction Using Multimodal Physiological Signals with a Brain-Body Spatio-Temporal Transformer

Mishra, P.; Kagathara, V.; Gandhi, T. K.

2026-05-07 neuroscience 10.64898/2026.05.03.722573 medRxiv

Top 0.6%

0.7%

Show abstract

Distinct smartphone interaction behaviors, like short-form video scrolling and mobile gaming, elicit qualitatively different cognitive and physiological responses. However, such distinctions is often overlooked by approaches that treat smart-phone use as a monolithic behavior. This paper presents Brain-Body Spatio-Temporal Transformer (BB-STT), a unified deep learning framework for classifying interaction-specific physiological signatures from multimodal signals, including EEG, EDA, PPG, and eye-tracking. BB-STT achieves 83.51% accuracy in distinguishing smartphone from non-smartphone activity and 74.13% accuracy in three-class classification of short-form video, gaming, and baseline viewing. The model demonstrates strong generalization with leave-one-subject-out (LOSO) performance that is also comparable to 5-fold cross-validation accuracy. Cross-modal attention emerges as the key component, improving three-class accuracy by 16.74 points through dynamic integration of multimodal signals. Interpretability analysis indicates a hierarchical organization of physiological responses. Eye-tracking features, particularly gaze depth, enable coarse separation between smartphone and non-smartphone activity. In contrast, finer discrimination between passive video viewing and active gaming on smartphones relies on the joint contribution of bilateral pupil dilation and central EEG features. Together, these results demonstrate the potential of multimodal physiological signals for objective, real-time assessment of digital engagement in naturalistic settings.

19

MurineCyto-Det: A High-Resolution Murine BALF Cytology Dataset for Leukocyte Segmentation and Detection

Le, T. X.; Tran, L.-A. T.; Farabi, D. A.; Wang, S.; Phan, A. T. Q.; Cormier, S. A.; Taada, A.; McGrew, D.; Du, Y.; Vu, L. D.

2026-05-12 bioinformatics 10.64898/2026.05.08.723893 medRxiv

Top 0.7%

0.7%

Show abstract

Automated analysis of murine bronchoalveolar lavage fluid (BALF) cytology is important for preclinical respiratory research, yet progress has been limited by the lack of publicly available, well-annotated mouse BALF image datasets. We present MurineCyto-Det, a high-resolution murine BALF cytology dataset comprising 333 image tiles of size 1024x1024 pixels, annotated across five cytological categories with both pixel-level segmentation masks and one-to-one matched bounding boxes. The dataset contains 14,551 annotated cell instances and supports two complementary analysis tasks: morphology-oriented cell segmentation and object-level cell detection. To establish reproducible benchmark baselines, we evaluated representative segmentation and detection models. The results demonstrate the practical utility of MurineCyto-Det while highlighting realistic challenges arising from class imbalance, small object size, irregular cell morphology, and ambiguous debris-like structures. MurineCyto-Det provides a standardized resource for developing, evaluating, and comparing automated methods for murine BALF cytology analysis. The dataset is publicly available at https://doi.org/10.5281/zenodo.17608677.

20

Geometric brain signatures of Alzheimer's disease progression and subtypes

Tong, B.; Cao, T.; Duong-Tran, D.; Davatzikos, C.; Thompson, P.; Andrew, S. J.; Fornito, A.; Shen, L.

2026-05-18 radiology and imaging 10.64898/2026.05.14.26353211 medRxiv

Top 0.7%

0.6%

Show abstract

Alzheimer's disease (AD) patients suffer from consequential diagnostic delay due to the lack of accessible biomarkers. They also show different responses to treatments due to disease heterogeneity and progression. Here, we developed a novel framework to identify disease progression and subtypes by using geometric brain signatures derived from multiple neuroimaging modalities, including [18F]-Florbetapir (AV45) Positron Emission Tomography (PET), [18F]-Fludeoxyglucose (FDG) PET, and structural Magnetic Resonance Imaging (MRI). These signatures were derived by decomposing corresponding maps of amyloid-beta levels, metabolic activity, and cortical thickness in terms of the fundamental, resonant modes-eigenmodes-of cortical geometry, each tied to a specific spatial resolution scale. Our results showed that geometric eigenmode-based features identified trajectories of disease progression, quantified as pseudotime, in distinct subtypes. The disease progression trajectories and subtypes are identified with high stability and are highly related to biological and cognitive measures. These performances are superior to those obtained using conventional localised features and remain robust across datasets, indicating that geometric signatures of brain structure and function can be used to uncover new markers of AD diagnosis and prognosis that are missed by conventional localisation approaches.