Mathematics

MDPI AG

Preprints posted in the last 30 days, ranked by how well they match Mathematics's content profile, based on 11 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.

1
Noisy periodicity in tropical respiratory disease dynamics

Yang, F.; Hanks, E. M.; Conway, J. M.; Bjornstad, O. N.; Thanh, N. T. L.; Boni, M. F.; Servadio, J. L.

2026-04-13 epidemiology 10.64898/2026.04.10.26350660 medRxiv
Top 0.2%
1.2%

Infectious disease surveillance systems in tropical countries show that respiratory disease incidence generally manifests as year-round activity with weak fluctuations and irregular seasonality. Previously, using a ten-year time series of influenza-like illness (ILI) collected from outpatient clinics in Ho Chi Minh City (HCMC), Vietnam, we found a combination of nonannual and annual signals driving these dynamics, but with unknown mechanisms. In this study, we use seven stochastic dynamical models incorporating humidity, temperature, and school term to investigate plausible mechanisms behind these annual and nonannual incidence trends. We use iterated filtering to fit the models and evaluate the models by comparing how well they replicate the combination of annual and nonannual signals. We find that a model including specific humidity, temperature, and school term best fits our observed data from HCMC and partially reproduces the irregular seasonality. The estimated effects from specific humidity and temperature on transmission are nonlinearly negative but weak. School dismissal is associated with decreased transmission, but also with low magnitude. Under these weak external drivers, we hypothesize that stochasticity makes a strong sub-annual cycle more likely to be observed in ILI disease dynamics. Our study shows a possible mechanism for respiratory disease dynamics in the tropics. When the external drivers are weak, the seasonality of respiratory disease dynamics is prone to the influence of stochasticity.

2
Analysis of biological networks using Krylov subspace trajectories

Frost, H. R.

2026-03-31 bioinformatics 10.64898/2026.03.29.715092 medRxiv
Top 0.2%
1.0%

We describe an approach for analyzing biological networks using rows of the Krylov subspace of the adjacency matrix. Specifically, we explore the scenario where the Krylov subspace matrix is computed via power iteration using a non-random and potentially non-uniform initial vector that captures a specific biological state or perturbation. In this case, the rows of the Krylov subspace matrix (i.e., Krylov trajectories) carry important functional information about the network nodes in the biological context represented by the initial vector. We demonstrate the utility of this approach for community detection and perturbation analysis using the C. elegans neural network.
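The idea of reading node information off the rows of a Krylov matrix can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the function name `krylov_trajectories`, the per-step rescaling to unit length, and the 4-node toy graph are all assumptions made for the example.

```python
def krylov_trajectories(adjacency, start, steps):
    """Return one trajectory per node.

    Row i holds node i's entry in start, A*start, A^2*start, ...,
    with each vector rescaled to unit Euclidean length (an assumed
    normalization, as in ordinary power iteration).
    """
    n = len(adjacency)
    vec = list(start)
    columns = []
    for _ in range(steps):
        norm = sum(x * x for x in vec) ** 0.5
        if norm == 0:
            raise ValueError("vector collapsed to zero")
        vec = [x / norm for x in vec]
        columns.append(vec)
        # One power-iteration step: multiply by the adjacency matrix.
        vec = [sum(adjacency[i][j] * vec[j] for j in range(n)) for i in range(n)]
    # Transpose so that each row is a node's trajectory across iterations.
    return [[col[i] for col in columns] for i in range(n)]


# Hypothetical toy network: path graph 0-1-2-3, perturbation placed on node 0.
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]
traj = krylov_trajectories(A, [1.0, 0.0, 0.0, 0.0], steps=4)
```

In this toy case the trajectory of node 3 stays at zero until the fourth column, reflecting that the perturbation needs three steps to reach it; nodes with similar trajectories could then be grouped by any standard clustering method.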

3
Distributed elasticity: a high-reward, moderate-risk strategy for efficient control modulation in insect flight

Wang, L.; Zhang, C.; Asadimoghaddam, N.; Pons, A.

2026-03-25 systems biology 10.64898/2026.03.23.713675 medRxiv
Top 0.5%
0.7%

The environments inhabited by flying insects demand a balance between flight efficiency and flight manoeuvrability. In structural oscillators such as the insect indirect flight motor, efficiency (arising from resonance) and manoeuvrability (arising from kinematic modulation) are typically quid pro quo, with modulation incurring penalties to efficiency. Band-type resonance is a phenomenon that offers, in theory, a strategy to lessen these penalties via careful navigation through a band of efficient kinematic states. However, identifying this band is challenging: no methods exist to identify the complete band in realistic motor models, involving elasticity distributed across thorax and wing. Nor are the effects of elasticity distribution on the band known. In this work, we address both open topics. We present a suite of numerical methods for identifying the complete resonance band in general systems. Applying them to models of the insect flight motor with distributed elasticity--thoracic and wing flexion--reveals that distributed elasticity is a moderate-risk but high-reward morphological feature. Well-tuned distributions expand the resonance band over fourfold whereas poorly-tuned distributions completely extinguish the resonance band. These results indicate that distributing elasticity across the insect flight motor can have adaptive value, and motivate broader work identifying distributions across species.

4
MOE-ECG: Multi-Objective Ensemble Fusion for Robust Atrial Fibrillation Detection Using Electrocardiograms

Peimankar, A.; Hossein Motlagh, N.; K. Khare, S.; Spicher, N.; Dominguez, H.; Abolghasemi, V.; Fujiwara, K.; Teichmann, D.; Rahmani, R.; Puthusserypady, S.

2026-03-30 health informatics 10.64898/2026.03.28.26349522 medRxiv
Top 0.6%
0.5%

Background: Atrial fibrillation (AFib) is the most common sustained arrhythmia in the world, imposing a heavy clinical and economic burden on global healthcare systems. Early detection of AFib can reduce mortality and morbidity, while helping to alleviate the growing economic burden of cardiovascular diseases. With the increasing availability of digital health technologies, computational solutions have great potential to support the timely diagnosis of cardiac abnormalities. Objectives: With the increasing availability of electrocardiogram (ECG) data from clinical and wearable devices, manual interpretation has become impractical due to its time-consuming and subjective nature. Existing automated approaches often rely on single classifiers or fixed ensembles that primarily optimize predictive accuracy while neglecting model diversity, which leads to limited robustness and generalization across heterogeneous datasets. Therefore, this study aims to develop a robust and diversity-aware framework for automatic AFib detection that simultaneously improves classification performance and model generalizability. To this end, we propose MOE-ECG, a multi-objective ensemble selection and fusion framework that explicitly optimizes both predictive performance and inter-model diversity for reliable AFib detection from ECG recordings. Methods: The proposed multi-objective ensemble (MOE) framework uses ensemble selection as a bi-objective optimization problem and employs multi-objective particle swarm optimization to identify complementary classifiers from a heterogeneous model pool. Unlike conventional ensembles, it explicitly optimizes both predictive performance and diversity and integrates Dempster-Shafer theory for uncertainty-aware decision fusion. After filtering the ECG signals to remove baseline wander and noise, they were segmented into windows of 20, 60, and 120 heartbeats with 50% overlap. 
The proposed approach was evaluated over five independent runs to assess its stability and generalization. Fifteen statistical and nonlinear features were obtained from the RR-intervals of the pre-processed ECG signals, of which eight features were selected with correlation analysis to capture subtle information from the ECG data. We trained and evaluated the performance of the proposed model in three open source databases, namely, the MIT-BIH Atrial Fibrillation Database, Saitama Heart Database Atrial Fibrillation, and Long-Term AF Database. Results: The proposed approach achieved the best overall performance on 60-beat segments, with an average accuracy of 89.85%, precision of 91.14%, recall of 94.19%, an F1-score of 92.64%, and area under the curve (AUC) of around 0.95. Statistical analysis using Holm-adjusted Wilcoxon tests confirmed significant improvements (p<0.05) compared to both the best individual classifier and the unoptimized average ensemble of all classifiers. These findings show that the proposed selection and evaluation methodology, rather than group aggregation alone, is the key driver of performance improvements. Conclusion: The results obtained demonstrate that the MOE-ECG model offers a robust, accurate, and reliable solution for the detection of AFib from short ECG segments. The empirical findings, in general, confirm that multi-objective ensemble fusion enhances diagnostic performance and offers robust predictions that will open up possibilities for real-time AFib detection in clinical and tele-health settings.
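The segmentation step described in the Methods (windows of a fixed number of heartbeats with 50% overlap) can be illustrated as follows. This is a sketch under assumptions: the helper name `overlapping_windows` and the integer beat indices are hypothetical, and the abstract does not say how trailing beats that do not fill a window are handled, so they are simply dropped here.

```python
def overlapping_windows(beats, size, overlap=0.5):
    """Split a beat sequence into windows of `size` beats.

    Consecutive windows share `overlap` (a fraction) of their beats.
    Incomplete trailing windows are dropped (an assumption).
    """
    step = max(1, int(size * (1 - overlap)))
    return [beats[i:i + size] for i in range(0, len(beats) - size + 1, step)]


# Ten hypothetical beat indices, 4-beat windows, 50% overlap.
windows = overlapping_windows(list(range(10)), size=4)
```

With the 20-, 60-, and 120-beat settings named in the abstract, the same call would use `size=20`, `size=60`, or `size=120`, giving steps of 10, 30, and 60 beats.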

5
Fine-grained spatial data-driven ensemble modeling for predicting Sylvatic Yellow Fever environmental suitability in Brazil

Augusto, D. A.; Abdalla, L.; Krempser, E.; de Oliveira Passos, P. H.; Garkauskas Ramos, D.; Pecego Martins Romano, A.; Chame, M.

2026-04-01 epidemiology 10.64898/2026.03.26.26349443 medRxiv
Top 0.7%
0.4%

Sylvatic Yellow Fever (YF) is an infectious mosquito-borne disease with significant epidemiological relevance due to its widespread distribution and high lethality for human and non-human primates, particularly in tropical regions of the planet such as in Brazil. Identifying regions and periods of high environmental suitability for the occurrence of YF is essential for preventing or mitigating its burden, as it enables the efficient allocation of surveillance efforts, prevention, and implementation of control measures. Environmental modeling of YF occurrence has proven to be an effective approach toward this goal; however, its effectiveness strongly depends on the modeling framework's capabilities as well as the spatial and temporal precision of all associated data. We propose a fine-scale geospatial modeling of YF environmental suitability that is based on a generative machine-learning ensemble method built on a large set of high-resolution environmental covariates. First, we take the spatiotemporal statistical description of the environment of each of the 545 YF cases from 2019--2024 up to 30 m/monthly resolution at three buffer scales: 100 m, 500 m, and 1000 m radii. Then, we perform a feature selection and train hundreds of One-Class Support Vector Machine submodels to form a robust ensemble model, whose predictions are projected to a 1x1 km resolution grid of Brazil under several metrics, exceeding seven million ensemble evaluations. The predictions ranked the Southern Brazil region with the highest mean suitability for YF, with a level of 0.64; Southeast comes next with 0.46, followed closely by Central-West region (0.44), North (0.39), and finally Northeast (0.28). The model exhibited high uncertainty for the North region, indicating that data collection efforts are much needed in this region. As for the environmental covariates, a feature analysis pointed out that land use and cover accounts for the largest influence in the model output.

6
Improving Medicare Fraud Detection Accuracy in Deep Learning by Exploring Feature Selection and Data Sampling Techniques.

Ahammed, F.

2026-03-20 health informatics 10.64898/2026.03.18.26348763 medRxiv
Top 0.8%
0.4%

Fraud in the health landscape is an aggravating issue, with far-reaching consequences burdening the financial stability of the health industry and threatening the quality of medical care. It results from vulnerabilities within the current healthcare framework that are exploited by the fraudsters in their favor. In spite of many developed models that aim to detect fraudulent patterns in insurance claims, the accuracy of such models frequently suffers as a result of the imbalance issue of the Medicare dataset and irrelevant features. This study ventures to improve detection performance and accuracy by employing a deep learning model along with data sampling and feature selection techniques. Comparative analysis among different combinations is conducted to determine their efficacy to enhance the accuracy of the fraud detection model. Hence, the suggested model clearly demonstrates that a combination of myriad data sampling and feature selection techniques is helping to improve accuracy and performance. The accuracy was thus 95.4%, with negligible evidence of overfitting detected using both Chi-square and Synthetic Minority Over-sampling (SMOTE) techniques. Ultimately, the study findings underscore the significance of employing combined techniques instead of using only the baseline deep learning model for better performance in detecting Medicare insurance fraud.

7
A formula for the basic reproduction number of an infectious disease in a heterogeneous population with structured mixing

Colman, E.; Chatzilena, A.; Prasse, B.; Danon, L.; Brooks Pollock, E.

2026-03-30 epidemiology 10.64898/2026.03.27.26349419 medRxiv
Top 0.8%
0.4%

The basic reproduction number of an infectious disease is known to depend on the structure of contacts between individuals in a population. This relationship has been explored mathematically through two well-known models: one which depends on a matrix of contact rates between different demographic groups, and another which depends on the variability of contact rates over the population. Here we introduce a model that combines and generalises these two approaches. We derive a formula for the basic reproduction number and validate it through comparisons to simulated outbreaks. Applying this method to contact survey data collected in Belgium between 2020 and 2022, we find that our model produces higher estimates of the basic reproduction number and larger relative changes over periods when social contact behaviour was changing during the COVID-19 pandemic. Our analysis suggests some practical considerations when using contact data in models of infectious disease transmission.
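For orientation, the two well-known approaches the abstract refers to can be written down in their common textbook forms. The sketch below is not the combined formula derived in the preprint; the function names, the per-contact transmission rate `beta`, the recovery rate `gamma`, and the toy inputs are all assumptions chosen for illustration.

```python
def spectral_radius(matrix, iterations=200):
    """Dominant eigenvalue of a non-negative matrix via power iteration."""
    n = len(matrix)
    vec = [1.0] * n
    radius = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        radius = max(abs(x) for x in nxt)
        if radius == 0:
            return 0.0
        vec = [x / radius for x in nxt]
    return radius


def r0_group_matrix(contact_matrix, beta, gamma):
    """Group-structured form: R0 = (beta / gamma) * dominant eigenvalue."""
    return beta / gamma * spectral_radius(contact_matrix)


def r0_heterogeneous(contact_rates, beta, gamma):
    """Variability form: R0 = (beta / gamma) * E[k^2] / E[k]."""
    mean = sum(contact_rates) / len(contact_rates)
    mean_sq = sum(k * k for k in contact_rates) / len(contact_rates)
    return beta / gamma * mean_sq / mean


# Hypothetical inputs: two demographic groups, then four individuals.
r0_matrix = r0_group_matrix([[2, 1], [1, 2]], beta=0.1, gamma=0.2)
r0_even = r0_heterogeneous([2, 2, 2, 2], beta=0.1, gamma=0.2)
r0_skewed = r0_heterogeneous([1, 1, 1, 5], beta=0.1, gamma=0.2)
```

Both contact-rate lists have the same mean of 2 contacts, yet the skewed one gives a larger value, which is the sense in which variability in contact rates raises the basic reproduction number.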

8
A neurocomputational model of observation-based decision making with a focus on trust

Hassanejad Nazir, A.; Hellgren Kotaleski, J.; Liljenström, H.

2026-03-26 neuroscience 10.64898/2026.03.24.713845 medRxiv
Top 1%
0.3%

As social beings, humans make decisions partly based on social interaction. Observing the behavior of others can lead to learning from and about them, potentially increasing trust and prompting trust-based behavioral changes. Observation-based decision making involves different neural structures. The orbitofrontal cortex (OFC) and lateral prefrontal cortex (LPFC) are known as neural structures mainly involved in processing emotional and cognitive decision values, respectively, while the anterior cingulate cortex (ACC) plays a pivotal role as a social hub, integrating the afferent expectancy signals from OFC and LPFC. This paper presents a neurocomputational model of the interplay between observational learning and trust, as well as their role in individual decision-making. Our model elucidates and predicts the emotional and rational behavioral changes of an individual influenced by observing the action-outcome association of an alleged expert. We have modeled the neurodynamics of three cortical structures (OFC, LPFC, and ACC) and their interactions, where the neural oscillatory properties, modeled with Dynamic Bayesian Probability, represent the observer's attitude towards the expert and the decision options. As an example of an everyday behavioral situation related to climate change, we use the choice of transportation between home and work. The EEG-like simulation outputs from our model represent the presumed brain activity of an individual making such a choice, assuming the decision-maker is exposed to social information.

9
Benchmark of biomarker identification and prognostic modeling methods on diverse censored data

Fletcher, W. L.; Sinha, S.

2026-04-01 bioinformatics 10.64898/2026.03.29.715113 medRxiv
Top 1%
0.3%

The practices of identifying biomarkers and developing prognostic models using genomic data have become increasingly prevalent. Such data often features characteristics that make these practices difficult, namely high dimensionality, correlations between predictors, and sparsity. Many modern methods have been developed to address these problematic characteristics while performing feature selection and prognostic modeling, but a large-scale comparison of their performances in these tasks on diverse right-censored time to event data (aka survival time data) is much needed. We have compiled many existing methods, including some machine learning methods, several of which have performed well in previous benchmarks, primarily for comparison with regard to variable selection capability, and secondarily for survival time prediction on many synthetic datasets with varying levels of sparsity, correlation between predictors, and signal strength of informative predictors. For illustration, we have also performed multiple analyses on a publicly available and widely used cancer cohort from The Cancer Genome Atlas using these methods. We evaluated the methods through extensive simulation studies in terms of the false discovery rate, F1-score, concordance index, Brier score, root mean square error, and computation time. Of the methods compared, CoxBoost and the Adaptive LASSO performed well in all metrics, and the LASSO and elastic net excelled when evaluating concordance index and F1-score. The Benjamini-Hochberg and q-value procedures showed volatile performances in controlling the false discovery rate. Some methods' performances were greatly affected by differences in the data characteristics. With our extensive numerical study, we have identified the best performing methods for a plethora of data characteristics using informative metrics. This will help cancer researchers in choosing the best approach for their needs when working with genomic data.
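One of the procedures named in the abstract, the Benjamini-Hochberg step-up rule for controlling the false discovery rate, is short enough to sketch directly. This is the standard textbook procedure, not code from the benchmark; the function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the hypothesis is rejected.

    Step-up rule: find the largest rank k (1-based, p-values sorted
    ascending) with p_(k) <= k * alpha / m, then reject ranks 1..k.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            cutoff = rank
    rejected = [False] * m
    for idx in order[:cutoff]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values for eight candidate biomarkers.
pvals = [0.041, 0.001, 0.205, 0.008, 0.039, 0.06, 0.074, 0.042]
selected = benjamini_hochberg(pvals, alpha=0.05)
```

Here only the second and fourth candidates pass, even though five p-values fall below 0.05, which shows how the rule tightens the threshold as the number of tests grows.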

10
Phase resetting of in-phase synchronized Hodgkin-Huxley dynamics under voltage perturbation reveals reduced null space

Gupta, R.; Karmeshu; Singh, R. K. B.

2026-03-24 neuroscience 10.64898/2026.03.21.713085 medRxiv
Top 1%
0.3%

Voltage perturbations to a repetitively firing Hodgkin-Huxley (HH) model of neuronal spiking in the bistable regime with coexisting limit cycle and stable steady node can either lead to spike phase resetting or to collapse to the stable steady state. The latter describes a non-firing hyperpolarized quiescent state of the neuron despite the presence of constant external current. Using the asymptotic phase response curve (PRC), the impact of voltage perturbations on a repetitively firing HH model is studied here while it is diffusively coupled to another HH model under identical external stimulation. It is observed that the pre-perturbation state of synchronization and the coupling strength critically determine the PRC response of the perturbed HH dynamics. Higher coupling strengths of perfectly in-phase (anti-phase) synchronized HH models shrink (expand) the combinatorial space of perturbation strengths and the oscillation phases causing collapse to the quiescent state. This indicates reduced (enlarged) basin of attraction, viz. the null space, associated with the steady state in the HH phase space. The findings bear important implications to the spiking dynamics of diverse interneurons, as well as special cases of pyramidal neurons, coupled through electrical synapses via gap junctions, and suggest the role of gap junction plasticity in tuning vulnerability to quiescent state in the presence of biological noise and spikelets.

11
Postsynaptic integration of excitatory and inhibitory signals based on an adaptive firing threshold

Gambrell, O.; Singh, A.

2026-03-26 neuroscience 10.64898/2026.03.26.714497 medRxiv
Top 1%
0.3%

A key component of intraneuronal communication is the modulation of postsynaptic firing frequencies by stochastic transmitter release from presynaptic neurons. The time interval between successive postsynaptic firings is called the inter-spike interval (ISI), and understanding its statistics is integral to neural information processing. We start with a model of an excitatory chemical synapse with postsynaptic neuron firing governed as per a classical integrate-and-fire model. Using a first-passage time framework, we derive exact analytical results for the ISI statistical moments, revealing parameter regimes driving precision in postsynaptic action potential timing. Next, we extended this analysis to include both an excitatory and an inhibitory presynaptic connection onto the same postsynaptic neuron. We consider both a fixed postsynaptic-firing threshold and a threshold that adapts based on the postsynaptic membrane potential history. Our analysis shows that the latter adaptive threshold can result in scenarios where increasing the inhibitory input frequency increases the postsynaptic firing frequency. Moreover, we characterize parameter regimes where ISI noise is hypo-exponential or hyperexponential based on its coefficient of variation being less than or higher than one, respectively.
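The link between threshold crossing and ISI noise can be seen in a deliberately stripped-down simulation. This sketch is not the model analyzed in the preprint: it assumes a leak-free integrate-and-fire neuron, Poisson-timed excitatory release events that each raise the membrane potential by one unit, a fixed threshold, and no inhibitory input. All names and parameter values are hypothetical.

```python
import random


def simulate_isis(rate, threshold, n_spikes, seed=0):
    """Simulate inter-spike intervals for a leak-free integrate-and-fire unit.

    Release events arrive as a Poisson process with the given rate; each
    event adds one unit to the membrane potential, and the neuron fires
    and resets when the potential reaches `threshold`.
    """
    rng = random.Random(seed)
    isis = []
    potential = 0
    elapsed = 0.0
    while len(isis) < n_spikes:
        elapsed += rng.expovariate(rate)  # waiting time to the next release
        potential += 1
        if potential >= threshold:
            isis.append(elapsed)
            potential = 0
            elapsed = 0.0
    return isis


def coefficient_of_variation(samples):
    """Sample standard deviation divided by the sample mean."""
    mean = sum(samples) / len(samples)
    var = sum((x - mean) ** 2 for x in samples) / (len(samples) - 1)
    return var ** 0.5 / mean


cv_threshold_4 = coefficient_of_variation(simulate_isis(10.0, 4, 20000))
cv_threshold_1 = coefficient_of_variation(simulate_isis(10.0, 1, 20000))
```

Under these assumptions the ISI is a sum of `threshold` exponential waiting times, so its coefficient of variation is 1/sqrt(threshold): close to 1 when a single event triggers a spike (exponential) and below 1 for higher thresholds (hypo-exponential). Reproducing the hyperexponential regime or the adaptive-threshold effect would require the fuller model described in the abstract.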

12
HybridNet-XR: Efficient Teacher-Free Self-Supervised Learning for Autonomous Medical Diagnostic Systems in Resource-Constrained Environments.

Mayala, S.; Mzurikwao, D.; Suluba, E.

2026-03-19 health informatics 10.64898/2026.03.16.26348570 medRxiv
Top 1%
0.2%

Deep learning model classification on large datasets is often limited in countries with restricted computational resources. While transfer learning can offset these limitations, standard architectures often maintain a high memory footprint. This study introduces HybridNet-XR, a memory-efficient and computationally lightweight hybrid convolutional neural network (CNN) designed to bridge the domain gap in medical radiography using autonomous self-supervised learning protocols. The HybridNet-XR architecture integrates depthwise separable convolutions for parameter reduction, residual connections for gradient stability, and aggressive early downsampling to minimize the video RAM (VRAM) footprint. We evaluated several training paradigms, including teacher-free self-supervised learning (SSL-SimCLR), teacher-led knowledge distillation (KD), and domain-gap (DG) adaptation. Each variant was pre-trained on ImageNet-1k subsets and fine-tuned on the ChestX6 multi-class dataset. Model interpretability was validated through gradient-weighted class activation mapping (Grad-CAM). The performance frontier analysis identified the HybridNet-XR-150-PW (Pre-warmed) as the optimal configuration, achieving a 93.38% average accuracy and 99% AUC while utilizing only 814.80 MB of VRAM. Regarding class-wise accuracy, this variant significantly outperformed standard MobileNetV2 and teacher-led models in critical diagnostic categories, notably Covid-19 (97.98%) and Emphysema (96.80%). Grad-CAM visualizations confirmed that the teacher-free pre-warming phase allows the model to develop sharper, anatomically grounded focus on pathological landmarks compared to distilled models. Specialized pre-warming schedules offer a viable, computationally autonomous alternative to knowledge distillation for medical imaging. 
By eliminating the requirement for high-performance teacher models, HybridNet-XR provides a robust and trustworthy diagnostic foundation suitable for clinical deployment in resource-constrained environments. Author summary: Traditional deep learning models for medical imaging are often too large for the low-power computers available in many global health settings. We developed a new model to bridge this computational gap. We designed HybridNet-XR, a highly efficient AI architecture, and trained it using a "teacher-free" method that doesn't require a massive supercomputer. We found a specific version (H-XR150-PW) that provides high accuracy while using very little memory. Our results show that high-performance diagnostic AI can be deployed on standard, low-cost hardware. Furthermore, using visual heatmaps (Grad-CAM), we proved that the AI correctly identifies medical landmarks like lung opacities, ensuring it is safe and reliable for real-world clinical use.

13
Physicochemical Characterization of Stingless Bees' (Meliponula beccarii L.) Honey from Wonchi District, Southwest Shewa Zone, Ethiopia

Gedefa, S. A.; Landina Lata, D.

2026-04-03 microbiology 10.64898/2026.04.01.715950 medRxiv
Top 1%
0.2%

This study was aimed at characterizing the physicochemical analysis of stingless bees honey (SBH) in the Wonchi district, Southwest Shewa Zone, Ethiopia. In this study, a total of 30 stingless bees honey samples were collected from Damu Dagele, Fite Wato, and Warabu Messe sites from underground soils through an excavation of natural nests. Physicochemical characterization of properties and proximate analysis of the honey were performed. The result showed a total mean of 20.12 ± 1.14% moisture content, 8.62 ± 2.73 meq./kg free acidity, 1.8 ± 0.52 mS/cm electrical conductivity, 3.39 ± 0.32 pH, 40.52 ± 6.61 mg/kg HMF, 0.83 ± 0.33% ash, 0.56 ± 0.25% protein, 0.56 ± 0.24% fat, and 0.59 ± 0.23% WISC for physicochemical properties of stingless bees honey. Among sugar profiles of SBH, fructose constituted the highest proportion at 18.87 g per 100 g (53.87%), while sucrose exhibited the lowest concentration at 5 g per 100 g (14.33%). The result showed that the highest constituted mean of mineral composition was observed with potassium (K) of 16.64 ± 0.257 mg/kg, while magnesium (Mg) showed the lowest concentration of 3.48 ± 0.17 mg/kg. A substantial correlation was observed between K and Mg, with a correlation coefficient of 0.72 and 0.72, and similarly between K and Calcium (Ca); the correlation was highly significant, exhibiting a correlation coefficient of 0.65. Furthermore, the correlation between fatty and other physicochemical and proximate analyses showed very insignificant correlations. In general, this study showed that the SBH produced in the current study area has good physicochemical properties and moisture and contains high-quality honey, which may help its traditional medicinal uses. The findings of the study further suggest the potentiality of the area for quality honey, and to easily locate priority areas for stingless bee conservation, further detailed studies of other stingless species honey medicinal values are recommended.

14
Mechanistic Insights into Skin Sympathetic Nerve Activity Dynamics in Healthy Subjects Through a Two-Layer Signal-Analytical and Closed-Loop Physiological Modeling Framework

Lin, R.; Halfwerk, F. R.; Donker, D. W.; Tertoolen, J.; van der Pas, V. R.; Laverman, G. D.; Wang, Y.

2026-04-13 health informatics 10.64898/2026.04.11.26350680 medRxiv
Top 1%
0.2%

Objective: Skin sympathetic nerve activity (SKNA) has emerged as a promising non-invasive surrogate measure of sympathetic drive, but its relevant physiological characteristics remain ill-defined. This observational study aims to investigate its regulatory patterns during rest and Valsalva maneuver (VM) in healthy participants. Method: Using a two-layer strategy integrating signal analysis and physiological modelling, we analyzed data recorded from 41 subjects performing repeated VMs. The observational layer includes time-domain feature comparisons using linear mixed-effect models, and time-varying spectral coherence analysis. The mechanistic layer proposes a mathematical model to investigate whether baroreflex and respiratory modulation are sufficient to reproduce the observed HR and average SKNA (aSKNA) dynamics. Main Results: Mean integrated SKNA (iSKNA) showed more significant change than HRV for VM induced effects. We also found mean iSKNA increase during VM varies with BMI and sex. The coherence analysis indicated that iSKNA strongly synchronized with EDR under resting conditions. The proposed model successfully reproduced main characteristics of aSKNA dynamics, yielding a high median Pearson correlation coefficient of 0.80 ([Q1, Q3] = [0.60, 0.91]). In contrast, HR dynamics were only partially captured, with a median PCC of 0.37 ([Q1, Q3] = [0.16, 0.55]). These results likely suggest SKNA provides a more direct representation of sympathetic burst dynamics during VM in healthy subjects. Significance: This study provides convergent evidence that SKNA reflects known autonomic regulatory influences in healthy subjects. These findings strengthen the physiological interpretability of SKNA while clarifying its appropriate use as a practical biomarker of sympathetic function.

15
Automated detection of adult autism from vowel acoustics using machine learning

Georgiou, G. P.; Paphiti, M.

2026-04-04 health informatics 10.64898/2026.04.03.26350102 medRxiv
Top 2%
0.2%

Autism spectrum disorder (ASD) is a neurodevelopmental condition for which timely and accurate detection remains a major clinical priority. Early and reliable identification is important because it can facilitate access to assessment, diagnosis, and appropriate support; however, current diagnostic pathways still rely largely on behavioural evaluation and clinical judgement. In this context, machine-learning (ML) approaches have attracted growing interest because they can identify subtle and complex patterns in speech data that may not be easily captured through conventional methods. The current study capitalizes on this potential by developing and evaluating ML models for distinguishing autistic individuals from neurotypical individuals based on speech features. More specifically, acoustic features of vowels, including fundamental frequency (F0), first three formants (F1, F2, F3), duration, jitter, shimmer, harmonics-to-noise ratio (HNR), and intensity, were elicited from 18 autistic adults and 18 neurotypical adults through a controlled production task. Then, four supervised ML models were trained and evaluated on these features: LightGBM, Random Forest, Support Vector Machine, and XGBoost. All models demonstrated good classification performance, with the best-performing model achieving a strong discriminability of 89%. The explainability analysis identified F0 as the most influential predictor by a substantial margin, followed by intensity, F3, and F1, while duration, shimmer, HNR, jitter, and F2 contributed more modestly. These findings demonstrate that vowel acoustics contain clinically relevant information for distinguishing autistic from neurotypical adult speech and highlight the potential of interpretable, speech-based ML as a transparent and scalable aid for ASD screening and assessment.

16
A multi-flow approach for binning circular plasmids from short-reads assembly graphs

Epain, V.; Mane, A.; Della Vedova, G.; Bonizzoni, P.; Chauve, C.

2026-03-26 genomics 10.64898/2026.03.25.714305 medRxiv
Top 2%
0.2%

We address the problem of plasmid binning, that aims to group contigs - from a draft short-read assembly for a bacterial sample - into bins each expected to correspond to a plasmid present in the sequenced bacterial genome. We formulate the plasmid binning problem as a network multi-flow problem in the assembly graph and describe a Mixed-Integer Linear Program to solve it. We compare our new method, PlasBin-HMF, with state-of-the-art methods, MOB-recon, gplasCC, and PlasBin-flow, on a dataset of more than 500 bacterial samples, and show that PlasBin-HMF outperforms the other methods, by preserving the explainability.

17
Understanding patterns of variant emergence and spread in an ongoing epidemic

Nande, A.; Levy, M. Z.; Hill, A. L.

2026-03-30 epidemiology 10.64898/2026.03.27.26349560 medRxiv
Top 2%
0.2%

The COVID-19 pandemic saw successive emergence and global spread of novel viral variants, exhibiting enhanced transmissibility or evasion of immunity. While the genotypic and phenotypic basis of SARS-CoV-2 variants have been extensively characterized, the evolutionary factors governing their patterns of emergence are less well understood. In this study we systematically investigated how the invasion dynamics of viral variants depend on variant phenotype (increased transmissibility or immune evasion), source (local evolution vs importation), the timing of introduction, the distribution of population susceptibility, and the contact network structure. Using a stochastic multi-strain epidemic model, we find that strains with only a transmission advantage are more likely to emerge earlier in the epidemic, and rapidly and predictably dominate the viral population. In contrast, immune-escape variants tend to linger at low prevalence for extended time periods after emergence, avoiding detection, until a critical amount of immunity has built up in the population and they begin to rapidly outcompete existing strains. We find that two common features of realistic human contact networks---heterogeneity in contacts (overdispersion) and clustering---lead to more punctuated evolutionary dynamics. This work provides insight into past dynamics of SARS-CoV-2 variants and can help define planning scenarios for future epidemic modeling efforts.

18
Semantic-Aware Energy-Efficient Operation in Smart Capsule Endoscopy

Zoofaghari, M.; Rahaimifard, A.; Chatterjee, S.; Balasingham, I.

2026-03-19 bioinformatics 10.64898/2026.03.17.712375 medRxiv
Top 2%
0.1%

Goal-oriented semantic communication has recently emerged in wireless sensor-actuator networks, emphasizing the meaning and relevance of information over raw data delivery, thereby enabling resource-efficient telecommunication. This paradigm offers significant benefits for intra-body or implantable sensor-actuator networks, including dramatic reductions in bandwidth requirements, latency, and power consumption. In this paper, we address a patch-based energy-efficient anomaly detection method for smart capsule endoscopy. We propose a deep learning-based algorithm that employs the similarity between features extracted from measured images and a reference (normal) image as the detection metric. The algorithm is evaluated using a clinical dataset of capsule-captured images, combined with a simulated intra-body channel model. The results demonstrate that even with only 60% of the transmission power (relative to a standard link design for QPSK modulation) and 65% of the light intensity, the probability of anomaly detection remains above 85%, and it gradually improves as power and illumination levels increase. This improvement translates into a potential battery life extension of over 43%. The findings highlight the potential of semantic-aware, energy-efficient intra-body devices for more sustainable and effective medical interventions.

19
A comprehensive computational analysis investigating the relationships between phage codon usage, infection style, and number of tRNA genes

Ross, N. D.; Doore, S. M.

2026-03-20 microbiology 10.64898/2026.03.19.712862 medRxiv
Top 2%
0.1%

It has been known for decades that bacteriophages encode tRNA genes, but their function and the factors contributing to their acquisition and retention are unclear. Although tRNAs are found in a variety of phages infecting a variety of bacteria, many large-scale computational studies investigating tRNA acquisition and retention in phages are specific to Mycobacterium phages; however, these findings may not be representative of other phages or bacteria. This work uses a broader sampling of phages and hosts to investigate the relationships between codon usage bias, infection cycle, and tRNA gene numbers in phage genomes. We analyzed 154 phages infecting 7 host genera, including Gram-negative (Escherichia, Shigella, Salmonella) and Gram-positive (Bacillus, Lactobacillus, Staphylococcus, Mycobacterium) bacteria. Phages included temperate and virulent representatives, plus a range of tRNA numbers and morphologies. All phages and hosts were analyzed using four metrics: GC content, Effective Number of Codons, Relative Synonymous Codon Usage, and tRNA Adaptation Index. On a global scale, virulent phages with many tRNA genes show greater differences in codon usage and codon adaptation compared to their respective hosts. Gram-negative bacteria and their phages generally exhibit greater differences in codon usage compared to Gram-positive bacteria and their phages. Phages infecting Gram-negative hosts also tend to encode more tRNA genes. In nearly all genus-level comparisons, Mycobacterium phages were different from any other host and from global patterns. This suggests previous computational studies performed in Mycobacterium phages are likely not applicable on a global scale or to phages infecting other host genera. AUTHOR SUMMARY: Bacteriophages, or phages, are viruses infecting bacteria. They are abundant in all environments, yet how they interact with their bacterial hosts is still not well-understood. Like other viruses, phages must rely on the host translational components to replicate and form new phage particles; and similarly to other parasites, phages have genomes that differ significantly from their hosts in terms of composition. In this work, we explore the relationship between phage lifestyle, number of tRNA genes encoded, and genome differences from the host using a variety of phages and their associated hosts. Phages can be either virulent (do not integrate into the host genome) or temperate (capable of integrating into the host genome), with differences from the host genome more pronounced in virulent phages. There are many phages that also carry tRNA genes, and having higher numbers of tRNAs is associated with larger differences from the host genome. The findings here indicate that virulent phages carrying large numbers of tRNAs diverge the most from host genome composition.

20
Simulation of neurotransmitter release and its imaging by fluorescent sensors

Gretz, J.; Mohr, J. M.; Hill, B. F.; Andreeva, V.; Erpenbeck, L.; Kruss, S.

2026-03-25 neuroscience 10.64898/2026.03.23.707923 medRxiv
Top 2%
0.1%

Cells release signaling molecules such as neurotransmitters that diffuse through the extracellular space and bind to receptors. These signaling molecules can be detected by fluorescent sensors/probes to provide images of the signaling process. Such images are not equivalent to a concentration because diffusion and sensor kinetics affect (convolute) them. Therefore, computational approaches are necessary to disentangle these contributions and allow interpretation of fluorescent sensor-based images. Here, we present a kinetic Monte Carlo framework (FLuorescence Imaging Kinetic Simulation, FLIKS) that simulates signaling molecules undergoing cellular release, stochastic diffusion and reversible binding to sensors in realistic cellular (2D or 3D) geometries. We apply it to model neurotransmitter (dopamine) release in synaptic clefts and paracrine signaling by immune cells. We also show how sensor location, sensor kinetics and release location affect fluorescence images. For example, we show how sensor sensitivity depends on the distance from the synaptic cleft and changes when dopamine transporters (DAT) clear dopamine. The approach also allows comparison of the performance of membrane-bound (genetically encoded) sensors versus artificial sensors such as nanosensors placed outside, under, or around the cells. As an example, we also demonstrate how the images of catecholamine release by immune cells can be modeled and compared to experimental data to better understand the release pattern. This framework provides a quantitative basis for analyzing and interpreting fluorescent sensor imaging data.