Mathematics — Latest Matching Preprints

1

Modeling the Effectiveness of Antibiotic Therapies Against Sepsis Using Continuous-time Hidden Markov Models

Schmiegel, S.; Marchi, H.; Borgstedt, R.; Rehberg, S.; Fuchs, C.; Mews, S.

2026-07-10 health informatics 10.64898/2026.07.03.26357092 medRxiv

Top 0.3%

0.9%

Patients suffering from sepsis need to be treated with an effective antibiotic therapy within the first hour after sepsis onset to decrease their risk of death. Microbiological data that provide information about the suitability of antibiotic therapies, however, are usually available only after 72 hours. Consequently, the treating physicians need to judge a therapy's effectiveness based on the patients' measured health records and their general health condition. This medical assessment is complex and requires years of experience. In our study, we investigate how statistical modeling can contribute to assessing the effectiveness of antibiotic therapies. For that purpose, we describe the effectiveness of antibiotic therapies by modeling sepsis patients' health conditions using a three-state continuous-time hidden Markov model (ctHMM). In the literature, procalcitonin (PCT) and lactate have proven to be helpful for deriving the health condition in this context. The state probabilities obtained by the ctHMM are subsequently used to quantify the effectiveness of antibiotic therapies. To this end, we apply two different approaches, namely (i) averaging of the state probabilities and (ii) a logistic regression model. For (i), we calculate the average of the state probabilities for the state indicating a sepsis-free condition over an antibiotic administration period of 48 hours. For (ii), we use the information from antibiotic susceptibility testing as the dependent variable in the logistic regression model; as independent variables, we calculate the difference between state probabilities at the start of antibiotic administration and 48 hours later. With this work, we are able to better understand the relationship between laboratory values, in particular PCT and lactate, and the patients' health condition. We further provide approaches for quantifying the effectiveness.
Therefore, our work contributes to developing a clinical decision support system that helps physicians assess the effectiveness of antibiotic therapies in patients with sepsis. Supported by such a system, a physician can quickly adjust an ineffective therapy, which helps avoid antibiotic resistance and increases a patient's chance of surviving sepsis.
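
The two quantification steps named in this abstract, (i) averaging the probability of the sepsis-free state over 48 hours and (ii) taking the change in state probabilities between the start of administration and 48 hours later, can be sketched as below. This is a minimal illustration and not the authors' implementation. The function names, the observation times, and all probability values are hypothetical, and the trapezoidal time-weighting is an assumption chosen because observations in a continuous-time model may be irregularly spaced.

```python
def time_weighted_mean(times, probs):
    """Trapezoidal time-average of a probability series over [times[0], times[-1]]."""
    if len(times) != len(probs) or len(times) < 2:
        raise ValueError("need at least two matching (time, probability) pairs")
    area = 0.0
    for i in range(len(times) - 1):
        width = times[i + 1] - times[i]
        area += 0.5 * (probs[i] + probs[i + 1]) * width
    return area / (times[-1] - times[0])


def probability_change(probs_start, probs_end):
    """Per-state difference: probability at 48 h minus probability at the start."""
    return [end - start for start, end in zip(probs_start, probs_end)]


# Hypothetical probabilities of the sepsis-free state for one patient,
# at irregular times (hours since the start of antibiotic administration).
hours = [0.0, 6.0, 18.0, 30.0, 48.0]
p_sepsis_free = [0.10, 0.20, 0.50, 0.70, 0.90]
avg_48h = time_weighted_mean(hours, p_sepsis_free)

# Hypothetical three-state distributions at 0 h and 48 h; the differences
# would serve as covariates in a logistic regression (not fitted here).
delta = probability_change([0.10, 0.30, 0.60], [0.90, 0.08, 0.02])
```

A larger `avg_48h` would indicate that the patient spent more of the administration window in the sepsis-free state under these assumed inputs.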

2

CerViX-Net: A Multi-Branch Fusion of Vision Transformer and Convolutional Neural Networks for Cervical Cancer Detection using Cytology Images

De, S.

2026-06-24 radiology and imaging 10.64898/2026.06.24.26356425 medRxiv

Top 0.3%

0.8%

Cervical cancer represents a pressing global health challenge, emphasizing the critical need for accurate and timely diagnostic methods to facilitate effective treatment and improve survival rates. In response to this challenge, the study presents CerViX-Net, an innovative classification framework designed to advance cervical cancer detection through enhanced computational efficiency and diagnostic accuracy. The development of CerViX-Net is motivated by the limitations of traditional diagnostic models, particularly in handling the computational and memory demands of large-scale data, while ensuring precise feature extraction and classification. CerViX-Net employs a hybrid deep learning architecture that combines the capabilities of ResNet50, EfficientNet-B0, and a Modified Vision Transformer (ViT) module. The ResNet50 branch extracts hierarchical features through stacked convolutional and identity blocks. In another path, the modified ViT module transforms image patches via linear projection, augments them with positional and class embeddings, and processes them using Parallel Transformer Encoder layers to model contextual relationships. Concurrently, EfficientNet-B0 utilizes MBConv blocks to extract multi-scale representations. The feature outputs from all three branches are integrated and passed through a classification head consisting of dropout layers and dense layers to ensure robust and accurate predictions. The proposed framework is rigorously evaluated on the Mendeley LBC dataset, achieving exceptional performance metrics with an accuracy of 99.69%, precision of 99.28%, recall of 99.48%, and an F1-score of 99.52%. The robustness of CerViX-Net is further validated on the SIPaKMeD and Herlev Pap Smear datasets, where it demonstrates comparable excellence, underscoring its efficacy and adaptability across diverse cytology datasets. Statistical validation using Friedman's test further reinforces its superiority over competing methods.

3

Optimizing automated classification for zooplankton in coastal conditions: the impact of model selection, imaging instruments, and colour information

Hovenkamp, P. D. L.; van Walraven, L.; Ollevier, A.; van Oevelen, D.; van der Stappen, A. F.

2026-07-13 bioinformatics 10.64898/2026.07.09.733739 medRxiv

Top 0.5%

0.5%

Advances in deep learning techniques have made Convolutional Neural Networks (CNNs) a powerful tool for the fully automated classification of zooplankton images. In this study, we systematically investigate how network selection, colour information and differences in imaging instruments affect the classification of zooplankton images by comparing multiple state-of-the-art CNNs on images of zooplankton and marine snow from the in situ Continuous Particle Imaging and Classification Sensor (CPICS), Video Plankton Recorder (VPR), In Situ Ichthyoplankton Imaging System (ISIIS), and the on-board Plankton Imager (Pi-10). With differences between models of 7.8 to 19% in F1-score, we find that model selection strongly affects the classification performance, with EfficientNetV2S showing the most reliable overall performance. Moreover, differences between model architectures are largest for the least abundant classes (<100 labeled images), which implies that when these are present, careful model selection is most beneficial. The high image quality of the Pi-10 strongly increases the performance for the least abundant classes compared to the other instruments. In addition, we find a significant correlation (r = 0.597) between ImageNet performance and the F1-score on zooplankton images, which implies that, more generally, a model that performs well on ImageNet will perform well for zooplankton classification. Colour information increases the F1-score of the best-performing classifier by 2.8%, but provides a stronger benefit (25% F1-score) for classes with <100 images. The overall performance increase from colour information is smaller than expected and calls into question the advantage of recording colour information for zooplankton.

4

Generative embedding of sparse data with a tabular foundation model for dengue anticipatory action: a machine learning approach

Pelitro, K. J.; Manzano, J. F.; Matavia, T. O.; Soriano, K.; Bilbao, K.; Garcia, G. M.; Delos Angeles, A. J.; Lagmay, A. M.; Bandoy, D. D.

2026-07-06 health informatics 10.64898/2026.07.03.26357228 medRxiv

Top 0.5%

0.5%

Background: Early outbreak detection often depends on complex, data-intensive models that have limited operational use in sparse surveillance settings. We developed a domain-mechanistic generative embedding that converts case counts and rainfall into a structured representation of dengue transmission for early epidemic-onset detection. Methods: We constructed a 132-feature generative embedding from sparse dengue case and rainfall data. A tabular foundation model was evaluated using leave-one-year-out validation with paired cluster-bootstrap uncertainty intervals across 17 Philippine regions and eight dengue-endemic countries. Performance was benchmarked against raw input columns and catch22 time-series features. Findings: Raw case and rainfall columns provided weak discrimination for dengue outbreak onset, with AUROC ranging from 0.56 to 0.70. The generative embedding improved prediction to AUROC 0.77 across countries and 0.89 across regions, corresponding to gains of +0.205 and +0.183 over raw columns, respectively, with paired cluster-bootstrap p ≤ 0.006. Calibration error remained low at both regional and country scales, with expected calibration error of 0.067 and 0.149, respectively. Predictability was strongest in highly seasonal settings, including Philippine Type I regions, Mexico, Brazil, and the Philippines, whereas year-round transmission or opposing coastal rainfall regimes produced weaker performance. Country estimates based on only one or two retained epidemic seasons were unstable. Interpretation: Under sparse surveillance conditions, the predictive capacity of a tabular foundation model depended strongly on the representation supplied to it. A generative embedding of climate and epidemiological dynamics translated limited case and rainfall inputs into actionable early-warning signals, with accuracy scaling according to local seasonal structure.
These findings support mechanism-grounded embeddings as a practical route for extending prospective dengue outbreak surveillance in data-limited settings, especially at regional scales where calibration and deployment are most appropriate.

5

Mathematical models for influenza vaccination in homeless hostels

Xu, J.; Hutchinson, N.; House, T.; Pellis, L.; Hayward, A.; Hall, I.

2026-07-14 epidemiology 10.64898/2026.07.10.26357528 medRxiv

Top 0.5%

0.5%

The aim of this paper is to model homeless accommodation settings to investigate how vaccination mitigates outbreaks, highlighting the importance of vaccination in vulnerable settings. We estimate the daily per capita contact rate with the wider community, the internal transmission rate, and the achieved vaccine coverage. We present stochastic simulations of the final size of disease outbreaks given choices of internal and external transmission. We conclude that a vaccine that reduces transmission will mitigate outbreaks in homeless hostels, but it will have better results when the household population has high vaccination coverage, which may lead to higher costs from a health economic perspective.

6

Assessing tensor decomposition quality of immune profiling data from a dictionary learning perspective

Konstorum, A.; Xing, J.; Aeron, S.; Kilmer, M.; Kleinstein, S.

2026-07-09 bioinformatics 10.64898/2026.07.03.736447 medRxiv

Top 0.6%

0.4%

Systems-level immune profiling data arising from longitudinal studies of vaccination or infection has an inherent multi-index array structure. While tensor decomposition of such datasets has gained popularity, choosing a rank and trial for a decomposition is not straightforward. We show that taking into account the experimental data model can inspire the development of new metrics to assess the quality of a Non-negative CANDECOMP/PARAFAC (NCPD) decomposition, and can thus be used to choose a rank and trial for the decomposition. Moreover, we show how framing the results via a dictionary learning framework can better enable interpretation of the components of the decomposition.

7

Adaptive multi-model ensembles for improved epidemic projections and decision support

Fiandrino, S.; Paolotti, D.; Bay, C.; Chinazzi, M.; Davis, J. T.; Bents, S. J.; Perofsky, A. C.; Turtle, J. A.; Riley, P.; Ben-Nun, M.; Moore, S. M.; Perkins, A.; Camargo Espana, G. F.; Srivastava, A.; Aawar, M. A.; Bandekar, S. R.; Bi, K.; Bouchnita, A.; Fox, S. J.; Meyers, L. A.; Venkatramanan, S.; Porebski, P.; Adiga, A.; Lewis, B.; Marathe, M.; Haghpanah, F.; Klein, E.; Loo, S. L.; Jung, S.-m.; Smith, C. P.; Contamin, L.; Hochheiser, H.; Carcelen, E. C.; Howerton, E.; Shea, K.; Yan, K.; Runge, M. C.; Viboud, C.; Pearson, C. A. B.; Truelove, S. A.; Lessler, J.; Borchering, R.; Biggerstaff,

2026-06-29 epidemiology 10.64898/2026.06.26.26356648 medRxiv

Top 0.6%

0.4%

In recent years, the use of multi-model ensemble projections in infectious disease modeling has become an established methodological approach to account for and integrate across uncertainties and structural differences present in individual models. However, the creation of long-term ensemble projections through these coordinated efforts is resource-intensive, demanding the input of multiple research teams and substantial computational power. This typically limits the ability to refine projections, update the selection of plausible epidemic trajectories, or expand the number of scenarios that can be assessed, even as new empirical data become available. To address this challenge, we define an adaptive ensemble approach that, analogously to a multi-model particle filtering method, dynamically selects individual model trajectories based on observed data throughout the epidemic projection period. We demonstrate the effectiveness of this methodology using the U.S. Flu Scenario Modeling Hub (SMH) projections for influenza hospitalizations in the United States during the 2023-2024 and 2024-2025 winter seasons. Our findings show that the adaptive ensemble yields improved predictive accuracy with respect to the original SMH ensemble projections across several scoring rules and geographical resolutions. Furthermore, the adaptive ensemble approach offers two additional applications: i) the dynamic assignment of posterior probabilities to epidemic scenarios, identifying the most plausible scenario, and representing how reality is captured by a combination of scenarios, and ii) the potential use for short-term forecasting. The adaptive ensemble approach is able to identify the most likely scenarios for the 2023-2024 and 2024-2025 U.S. influenza seasons, even in the early stages of the epidemic. 
It outperforms, retrospectively, a baseline model in short-term forecasting of influenza hospitalizations in the United States during the two seasons across various horizons and scoring rules, showing potential to contribute to real-time collaborative forecasting challenges such as CDC's FluSight. The proposed approach offers an efficient or low-resource strategy to increase the impact of multi-model epidemic projections by providing real-time support to modeling teams, public health authorities, and decision-makers.
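
The idea of selecting model trajectories based on observed data, and of assigning posterior probabilities to scenarios, can be illustrated with a small reweighting sketch. It is not the method from the preprint: the Gaussian likelihood, the `sigma` value, the function names, and all trajectory numbers are assumptions made for illustration only.

```python
import math


def trajectory_weights(trajectories, observed, sigma):
    """Normalized weights from an assumed Gaussian likelihood on the observed weeks."""
    log_w = []
    for traj in trajectories:
        sq_err = sum((traj[t] - y) ** 2 for t, y in enumerate(observed))
        log_w.append(-0.5 * sq_err / sigma ** 2)
    top = max(log_w)  # subtract the maximum for numerical stability
    w = [math.exp(lw - top) for lw in log_w]
    total = sum(w)
    return [x / total for x in w]


def scenario_posteriors(weights, labels):
    """Sum trajectory weights within each scenario label."""
    post = {}
    for w, label in zip(weights, labels):
        post[label] = post.get(label, 0.0) + w
    return post


# Hypothetical weekly hospitalization trajectories from two scenarios.
trajectories = [
    [100, 150, 220, 300],
    [100, 140, 200, 260],
    [100, 200, 350, 500],
    [100, 210, 380, 560],
]
labels = ["A", "A", "B", "B"]
observed = [100, 145, 210]  # first three weeks of (made-up) surveillance data

weights = trajectory_weights(trajectories, observed, sigma=20.0)
posterior = scenario_posteriors(weights, labels)

# Weighted projection for the next, not yet observed, week.
projection = sum(w * traj[3] for w, traj in zip(weights, trajectories))
```

Repeating the reweighting each time a new observation arrives would give the dynamic, data-driven selection that the abstract describes, under the stated assumptions.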

8

GCBM-DCT-HV-Bio-NL-Grow-CHG-CSM-RHEC: A Unified Geometric, Biological, Causal, and Regenerative Framework for Mechanism-Aware Tissue and Connectome Modeling

Xu, T.; Hu, Z.; Sun, X.; Jin, L.; Xiong, M.

2026-06-29 bioinformatics 10.64898/2026.06.24.734320 medRxiv

Top 0.6%

0.4%

Modern biological prediction problems increasingly require models that go beyond Euclidean feature regression and local graph smoothing. Tissue, cellular, and connectome systems are nonlinear, geometry-dependent, intervention-sensitive, history-dependent, and subject to regenerative or homeostatic constraints. We propose GCBM/DCT/HV/Bio/NL/Grow/CHG/CSM/RHEC, a unified model for mechanism-aware biological prediction. The model integrates geometric connectome dynamics, differentiable charted tissue geometry, Hamiltonian latent transport, nonlinear biological kinetics, nested latent memory, continual growth without overwriting, causal hypergraph structure, causal structure modeling, and regenerative homeostatic error correction. Unlike Euclidean baselines, which treat observations as flat vectors, and local graph baselines, which use neighborhood smoothing without mechanistic structure, the proposed model represents biological states (Trapnell 2015) as coupled geometric, dynamical, causal, and regenerative objects. We evaluate the model on four synthetic toy studies, Toy A, B, C, and D, designed to reflect increasing biological complexity: local Euclidean structure, nonlinear mechano-chemical interaction, causal intervention response, and out-of-distribution regenerative shift. Compared with Euclidean and local graph baselines, the full model achieves the lowest mean squared error (MSE) across all four toy studies. Relative to the Euclidean baseline, the full model reduces MSE by approximately 63.0%, 89.1%, 89.0%, and 90.9% on Toy A, Toy B, Toy C, and Toy D, respectively. These results support the value of integrating geometry, mechanism, causal structure, adaptive growth, and regenerative correction into a single predictive architecture (Figure 1).

9

Mathematical Modeling of Rift Valley Fever in the Sahelian Zone

Djimramadji, H.; Ndonane, B.; Djaouga, P.; MARKHOUS, H. M.; Djoumountanan, E.; TOBAYE, K.; Abakar, F. M.

2026-07-17 epidemiology 10.64898/2026.07.15.26358164 medRxiv

Top 0.7%

0.4%

We develop a mathematical model of Rift Valley Fever integrating mosquito vectors, ruminants, and humans, based on an SEIR-type structure with vertical transmission in vectors. Local data from the Sudanian and especially the Sahelian zones are used to capture the impact of climatic variations on mosquito population dynamics. The mathematical analysis establishes the model's positivity, determines the basic reproduction number R0, and demonstrates the local and global stability of the disease-free equilibrium. Sensitivity analysis (PRCC) highlights the most influential parameters, while the stochastic approach using a continuous-time Markov chain confirms the major role of seasonal rainfall. Numerical simulations reveal a peak in animal and human infections around the 9th month, correlating with periods of heavy rainfall. This model provides a relevant tool for surveillance and prevention within a "One Health" approach in Chad.

10

A New Method to Predict the Effect of an Intervention in the Host Population to Reduce the Magnitude of an Outbreak of a Vector-Borne Infection

Coutinho, F. A. B.; Amaku, M.; Kallas, E. G.; Massad, E.

2026-07-19 epidemiology 10.64898/2026.07.16.26358272 medRxiv

Top 0.9%

0.3%

In this paper, we propose a new model to estimate the impact of an intervention on human hosts of a vector-borne infection, such as dengue, which occurs in yearly outbreaks of different magnitudes. The model applies to these outbreaks and, in fact, is independent of their intensity, that is, it does not require the steady-state assumption. The model takes as input the officially reported age-dependent number of cases of a vector-borne infection. It is deterministic and does not account for stochasticity. Our objective is to estimate the impact of the intervention (the efficacy), and we rely on the observed fact that the age distribution of the proportion of cases of the infections transmitted by the same vector is independent of both the intensity of transmission and the geographic area studied, at least for Brazilian regions. This finding is highlighted in the main text and forms the basis of our calculations. A hypothetical intervention is simulated using a dengue vaccine, which allows the determination of the optimal strategy for a vaccination campaign.

11

Neural Processes with Normalizing Flows for Wheat Height Estimation

Boss, M.; Volpi, M.; Roth, L.

2026-07-09 plant biology 10.64898/2026.06.24.734247 medRxiv

Top 1%

0.2%

In this work, we investigate modeling plant traits over time using neural processes, a class of machine learning models that learn distributions over functions. Plant growth is an inherently stochastic process with complex dynamics measured mostly at irregular times throughout the growing seasons. While individual trait trajectories may be simple, their distributions are shaped by complex interactions between genotype, environment, and other factors. In particular, we focus on plant height in wheat, a deceptively simple-looking trait with complex dynamics. To model these trajectory distributions, we evaluate neural processes and in particular extensions using normalizing flows, with different combinations of genotype and environmental covariates. For controlled evaluations, we generate synthetic wheat height trajectories calibrated against Swiss weather station records and the FIP1 dataset. To fully evaluate these trajectory distributions, we use signatures, vector representations of sequential data, together with Sig-MMD and the recently introduced CSig-MMD. Sig-MMD enables direct pathwise comparison of predicted and simulator trajectory distributions, while CSig-MMD focuses this comparison on the tail, including lodged trajectories. Together, these metrics allow us to assess whether the models capture the full distribution of growth trajectories, including rare outcomes.

12

Odor Annoyance, Sensory Irritation or Relaxation: Acute Effects of Real Pinewood Emissions in Indoor Air Scenarios

Hucke, C. I.; Gallus, V.; Butter, K.; Reiser, J. E.; Ohlmeyer, M.; van Thriel, C.

2026-07-08 physiology 10.64898/2026.07.03.736270 medRxiv

Top 1%

0.2%

Wood is commonly used in the building sector and emits volatile organic compounds (VOCs) that contribute to indoor air quality. These VOC profiles can have a pleasant smell and positive effects, e.g., inducing relaxation. Conversely, VOCs can have adverse health effects at higher concentrations. Therefore, some VOCs are regulated by guide values (GV). Potentially positive and negative effects of pinewood emissions, ranging from 0.2 mg/m³ (German GV I for bicyclic terpenes) to 2.0 mg/m³ (GV II), were investigated in an experimental 2 h exposure study using a within-subject design. Thirty-two healthy participants rated the perception, pleasantness, symptoms of irritation, and indicators of well-being. During a demanding working memory task (n-back) and a resting period, heart rate (HR) and HR variability (HRV) changes were measured. Before and after each session, physiological markers of sensory irritation were assessed. Ratings indicated that exposures at GV I and GV II were not perceived as more intense or pleasant. Mostly concentration-independent effects were revealed, indicating that inter-individual factors influenced the ratings rather than the VOCs. The pinewood odors during the n-back task did not cause distraction, nor did they facilitate performance as previously suggested. HR and HRV changes indicated that pinewood odors during and after the n-back tasks did not induce relaxation. Only symptoms of nasal irritation showed some weak concentration-dependency, not supported by physiological markers or comparable ratings of sensory irritation. In conclusion, the fact that no distinct odor is detected suggests that interfering factors potentially prevent the regulation of odors at relevant indoor air concentrations.

13

Feature Selection with Quantum Annealing for Biomedical Machine Learning Applications

Dudgeon, S. N.; Lee, S. J.; Durant, T. J.; Nelson, B.; Young, H. P.; Ohno-Machado, L.; Taylor, R. A.; Schulz, W. L.

2026-07-06 health informatics 10.64898/2026.07.02.26357174 medRxiv

Top 1%

0.2%

Feature selection is a commonly used method in biomedical artificial intelligence and machine learning to identify a subset of high-quality variables that can be used to train downstream predictive models. It has been suggested that quantum feature selection (QFS), which takes advantage of the properties of quantum computers, may better identify variables that are correlated with the outcome while simultaneously reducing redundancy between selected variables. However, only a limited number of studies have evaluated the performance of QFS methods, particularly on real-world data sets. Here, we assess the performance of two QFS methods compared to random forest (RF) feature selection based on feature stability and the performance of a downstream classification algorithm when used to predict urinary tract infections in the emergency department from 211 original features extracted from the electronic health record. We found that a quantum binary quadratic model (BQM) and constrained quadratic model (CQM) had similar performance to RF feature selection (median F1 score of 0.60, 0.61, and 0.61, respectively) when 10 features were selected for an XGBoost classification model. The BQM and RF also had similar feature stability (0.91 and 0.94, respectively), while the CQM had lower stability (0.72). These findings show that QFS can be used with large clinical data sets to identify features with high stability and predictive performance. As the capacity and quality of quantum computers continue to increase, these methods may offer additional benefits over classical feature selection methods.
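
The trade-off this abstract describes, rewarding correlation with the outcome while penalizing redundancy between selected variables, is commonly written as a quadratic objective over binary variables. The sketch below shows one such objective with made-up scores. It is an assumption about the general form, not the formulation used in the preprint, and it replaces the quantum annealer with an exhaustive classical search that is only feasible for a handful of features. The names `qubo_energy`, `select_features`, and `alpha` are hypothetical.

```python
from itertools import combinations


def qubo_energy(selected, relevance, redundancy, alpha):
    """Negative total relevance plus alpha times the summed pairwise redundancy."""
    energy = -sum(relevance[i] for i in selected)
    energy += alpha * sum(redundancy[i][j] for i, j in combinations(selected, 2))
    return energy


def select_features(relevance, redundancy, k, alpha=1.0):
    """Search all size-k subsets for the lowest energy (classical stand-in)."""
    n = len(relevance)
    return min(
        combinations(range(n), k),
        key=lambda subset: qubo_energy(subset, relevance, redundancy, alpha),
    )


# Hypothetical scores: features 0 and 1 are both relevant but nearly redundant.
relevance = [0.90, 0.85, 0.50, 0.10]
redundancy = [
    [0.00, 0.95, 0.10, 0.10],
    [0.95, 0.00, 0.10, 0.10],
    [0.10, 0.10, 0.00, 0.10],
    [0.10, 0.10, 0.10, 0.00],
]

chosen = select_features(relevance, redundancy, k=2)
```

Here the subset size is enforced as a hard constraint by enumerating only size-`k` subsets. In an unconstrained binary quadratic formulation, the same requirement would typically be expressed as a penalty term instead.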

14

Multi-model forecasting of respiratory disease activity in Germany during the 2024-2025 season

Bracher, J.; Wolffram, D.; Amaral Lind, R.; Bardeck, N.; Boehm, M.; Contreras, S.; Doenges, P.; Guenther, F.; Kaiser, R.; van de Kassteele, J.; Kuhlmann, A.; Lange, B.; Nemcova, B.; Priesemann, V.; Reinacher, U.; Rodiah, I.; Sandmann, F.; the RESPINOW Study Group; Schienle, M.

2026-07-21 epidemiology 10.64898/2026.07.20.26358471 medRxiv

Top 1%

0.2%

Respiratory diseases cause considerable morbidity in autumn and winter and are a priority in public health monitoring. In Germany, they are subject to a number of surveillance systems, including both pathogen-specific and syndromic indicators. In this paper, we present a collaborative multi-target and multi-model real-time forecasting system rolled out during the 2024/25 season, and discuss differences from earlier efforts carried out during the COVID-19 pandemic. A total of nine models were run to generate forecasts of general practitioner consultations for acute respiratory infections (ARI), hospitalizations for severe acute respiratory infections (SARI), and confirmed cases of seasonal influenza and RSV. As all indicators were subject to retrospective revisions, forecasting models were combined with a nowcasting step. Whenever multiple models were available for the same indicator, we combined them into an ensemble. Nowcasts showed convincing performance, even though for some models Christmas break effects led to an upward bias in early January. Forecasts were overall well-calibrated, and most models outperformed simple benchmark models. These improvements were generally more substantial for age-stratified than pooled targets, and concentrated at lead times of two to three weeks. Anticipating the peak timing and magnitude proved to be challenging, with many models predicting curves that were too flat and turned around too early (e.g. already in late January rather than mid-February for SARI). The combined ensemble forecast was among the best-performing approaches but, unlike in previous related projects, did not consistently outperform individual models. We conclude by discussing lessons learned on the organization of collaborative forecasting projects in post-COVID-19 times and the potential of AI-supported modelling.

15

Effect of CSFV on Differential Genes of Histone Lactylation at H3K18 in the PI3K-AKT Signaling Pathway

Zhang, H.; Han, Z.; Zhao, X.; Zhu, J.; Shao, N.; Sun, K.; Li, W.; Yao, Y.; Liang, X.; Yang, M.; Gao, Y.; Chen, J.; Liang, Y.; Liu, Q.; Li, X.; Cao, Z.

2026-06-29 microbiology 10.64898/2026.06.26.734696 medRxiv

Top 1%

0.2%

Classical swine fever (CSF) is a highly contagious disease caused by Classical swine fever virus (CSFV), posing a serious threat to the global swine industry. This study aimed to investigate the effect of CSFV on differential genes of histone lactylation at the H3K18 site in the PI3K-AKT signaling pathway. The site with the most significant change in histone lactylation antibody level was screened by Western blot. Omics analysis was performed using CUT&Tag technology to identify differential genes in the PI3K-AKT pathway between the CSFV-infected group and the mock group, followed by validation using RT-qPCR. Functional analysis of significantly differential proteins was conducted, and the protein expression level of THBS4 was detected by Western blot. The results showed that after CSFV infection of 3D4/21 cells, the H3K18la site exhibited the most significant difference in antibody level. A total of 8,859 differential genes at the H3K18la site were identified by CUT&Tag analysis, including 6,349 up-regulated genes and 2,510 down-regulated genes. Further focusing on the PI3K-AKT signaling pathway, 10 differential genes were identified, comprising 6 up-regulated genes and 4 down-regulated genes. Compared with the control group, the mRNA expression levels of CD19, LAMA1, PDGFRA, BDNF, ANGPT4, and THBS4 were up-regulated in the CSFV-infected group, while FOXO3 and NRTN were down-regulated. Western blot results showed that the protein expression level of THBS4 increased after CSFV infection. These findings lay an important foundation for understanding the molecular mechanisms regulating viral replication and immune evasion, and have significant scientific implications and potential application value.

16

Seeing Nothing, Saying Something: The Lack of Visual Grounding and Confabulation in Gemini Models for Histopathology

Hasan, M. M.; Tozal, M. E.; Ayhan, M. S.

2026-07-07 health informatics 10.64898/2026.07.04.26357257 medRxiv

Top 1%

0.2%

Large vision-language models (VLMs) have demonstrated remarkable performance on computational pathology benchmarks, yet their reliability under adversarial or vacuous inputs remains poorly understood. This paper examines the visual grounding behaviour of two Gemini models, Gemini 3.0 Flash Preview (gemini-flash) and Gemini 3.1 Pro Preview (gemini-pro), on a well-known histopathology classification task, and probes for confabulation using an adversarial blank-image set. On the real histopathology dataset, both models achieve near-perfect accuracy (98.75% to 100%) across three temperatures (0.0, 0.5, 1.0) and three independent runs. On a controlled adversarial set of blank white images labelled as either benign or malignant, however, a stark divergence emerges. Gemini-flash consistently acknowledges the absence of visual content and assigns zero confidence, while Gemini-pro fabricates detailed, clinically plausible histological descriptions and reports high confidence (mean ≈ 0.95) across the same blank inputs, a behaviour we term confident confabulation. The confabulation rate of gemini-pro reaches 77.8% of image-responses at temperature 0.0, dropping to 44.4% at temperature 0.5 and rising to 66.7% at temperature 1.0, while gemini-flash records 0% at all temperatures. These findings raise important questions about the safety and trustworthiness of VLMs in clinical decision-support contexts, and underscore the need for comprehensive evaluation beyond standard accuracy metrics.

17

A guaranteed-convergence algorithm for coupled leaf photosynthesis–transpiration–stomatal conductance models

Masutomi, Y.; Kobayashi, K.

2026-07-08 plant biology 10.64898/2026.06.24.734164 medRxiv

Top 1%

0.1%

The photosynthesis-transpiration-stomatal conductance (An-E-gs) model framework is widely used for estimating photosynthesis, transpiration, and stomatal conductance in plants. The model equations are solved by numerical iteration, and the converged model values are deemed the solution. However, there has been no general guarantee that the iterative procedure converges to a solution or that the procedure leads to convergence. Building on the recent proof of the existence of a unique set of solutions, we herewith propose a numerical algorithm that is guaranteed to converge to the solution for the An-E-gs model framework. We first analytically prove that the proposed algorithm necessarily converges to a solution. We then demonstrate the convergence across contrasting combinations of leaf temperature, relative humidity, light, atmospheric CO₂, and wind speed. We further demonstrate rapid convergence with the algorithm: no more than ca. 10 iterations for a precision of approximately 10⁻³ mol CO₂ m⁻² s⁻¹ in net photosynthesis and no more than ca. 20 iterations for a precision of 10⁻⁷ mol CO₂ m⁻² s⁻¹. By guaranteeing convergence to the solution, this algorithm eliminates concerns about nonconvergence in leaf gas-exchange calculations and is expected to serve as a robust foundation for a range of studies from leaf-level gas exchange to global-scale carbon and water cycle dynamics.

18

Quantum Encoding Strategies for Drug Response Prediction: An Exhaustive Benchmark on a 20-Qubit Superconducting QPU

Derouich, R.; Mathlouthi, N. E. H.

2026-07-13 bioinformatics 10.64898/2026.07.08.737310 medRxiv

Top 1%

0.1%

We present the first systematic, hardware-executed benchmark of twelve distinct quantum data-encoding strategies for drug-response prediction on a real superconducting quantum processing unit (QPU). All experiments were conducted on the IQM Garnet 20-qubit QPU via the IQM Resonance cloud platform, using the Qrisp quantum-software framework (v0.8.2). Each encoding was evaluated on n = 50 stratified samples drawn from the Genomics of Drug Sensitivity in Cancer dataset (GDSC2, 242 036 drug-cell-line pairs), targeting the natural-log IC50 response variable. Variational weights were optimised offline with the gradient-free COBYLA algorithm before hardware submission. Every circuit was executed with 1024 shots; the regression signal is the zero-qubit Pauli expectation value ⟨Z₀⟩. Results show that the QAOA-inspired encoding achieves the best RMSE of 3.314 and is statistically superior (p < 0.05, Wilcoxon signed-rank test) to six of the remaining eleven encodings. Hardware-efficient entanglement structures (specifically alternating cost and mixer layers) provide a systematic advantage over purely rotational or diagonal encodings under realistic noise conditions. This work constitutes a reproducible baseline for noise-aware quantum machine learning on pharmaceutical data; all code, data, and raw QPU outputs are publicly released.
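
As an illustration of the readout step mentioned in this abstract, the following minimal sketch estimates a single-qubit Pauli-Z expectation value from measured bitstring counts. It is not the authors' code and does not use the Qrisp or IQM APIs: the function names, the `counts` dictionary, and the choice of which character in the bitstring corresponds to qubit 0 are all assumptions made for illustration.

```python
import math


def z_expectation(counts, position=0):
    """Estimate <Z> for one qubit from bitstring counts.

    counts   : dict mapping bitstrings (e.g. "01") to the number of shots
    position : index of the target qubit's character in each bitstring
               (assumed convention; real backends differ in bit ordering)
    """
    shots = sum(counts.values())
    if shots == 0:
        raise ValueError("no shots recorded")
    # Outcome "0" contributes +1 and outcome "1" contributes -1.
    signed = sum(n if bits[position] == "0" else -n for bits, n in counts.items())
    return signed / shots


def z_standard_error(expectation, shots):
    """Shot-noise standard error of the <Z> estimate for +/-1 outcomes."""
    return math.sqrt((1.0 - expectation ** 2) / shots)


# Hypothetical counts for a two-qubit readout with 1024 shots in total.
counts = {"00": 600, "01": 200, "10": 124, "11": 100}
z0 = z_expectation(counts, position=0)
print(z0, z_standard_error(z0, sum(counts.values())))
```

With 1024 shots the shot-noise standard error is at most 1/32 ≈ 0.031, which gives a sense of the resolution available to a regression signal built on this quantity.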

19

Comorbidity structure as an inductive bias: Comparing output-head designs for multi-label prediction of diabetes and myocardial infarction complications

Asumboya, W. A.; Agbenorhevi, P. K.; Adams, C. F.; Ayariga, D. A.; Adjadeh, T.; Adams Ziblim, S.; Kwofie, S. K.

2026-06-23 bioinformatics 10.64898/2026.06.18.733068 medRxiv

Top 1%

0.1%

Background: Clinical complications are often predicted with separate sigmoid outputs, even when the target labels arise from related pathophysiological processes. This paper asks whether output-layer choice should reflect both predictive convenience and the biological structure assumed among complications. The central premise is that label-dependence mechanisms are explicit hypotheses about comorbidity, not generic modelling additions. Methods: Output-head assumptions were compared across two clinically distinct multi-label prediction tasks. In Type 2 diabetes (T2D), six heads were evaluated for nephropathy, neuropathy, and retinopathy: independent baseline, linear additive, multiplicative, symmetric conditional random field (CRF), residual multilayer perceptron (MLP), and combined additive-multiplicative. In myocardial infarction (MI), four heads were evaluated for ventricular tachycardia, ventricular fibrillation, and atrioventricular block: independent baseline, linear additive, multiplicative, and symmetric CRF. All experiments used five training data fractions and seven independent seeds, with the same shared-backbone protocol within each disease setting. Results: In T2D, the symmetric CRF gave the most consistent improvement pattern, ranking highest at full data and at the two lowest data fractions while adding only three interaction parameters. At 20% training data, it was the only interaction head whose aggregate mean exceeded the independent baseline. The residual MLP, despite 123 interaction parameters, remained below the baseline across all T2D fractions. In MI, rankings changed across fractions: the multiplicative head led at 80% and 60%, the CRF led at 100% and 20%, and the baseline led at 40%. The combined additive-multiplicative head did not improve robustness in T2D and showed the largest negative baseline-relative deviations at lower fractions. Conclusion: The findings support a biology-guided view of output-layer design. 
A small constrained mechanism was most useful when its symmetry matched the shared microvascular structure of T2D, whereas the heterogeneous electrophysiology of MI produced no stable winner. Output-layer choice should therefore be reported and defended as an assumption about disease structure instead of a routine hyperparameter decision. Author summary: Many clinical prediction models treat complications as separate outcomes, even when clinicians know they often arise together. We studied whether the last layer of a model should reflect that biological knowledge. We compared several output heads across two disease settings: Type 2 diabetes, where nephropathy, neuropathy, and retinopathy share a common microvascular origin, and myocardial infarction, where electrical complications arise from a mixture of shared and location-specific mechanisms. We found that a small symmetric CRF head was most useful in the diabetes task, especially when training data were limited, while no single interaction head dominated in myocardial infarction. This suggests that modelling comorbidity is not only a technical choice; it is a statement about how disease processes relate to one another. Our results encourage researchers to report and justify output-layer design as part of the clinical modelling argument, rather than treating it as a routine hyperparameter.
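
To make the idea of a symmetric CRF head with three interaction parameters concrete, here is a minimal sketch of one plausible reading: three per-label logits plus one symmetric pairwise weight for each of the three label pairs, with marginals obtained by enumerating all eight label configurations. The abstract does not give the authors' exact parameterisation, so the energy function, the function name `crf_marginals`, and the example values are assumptions.

```python
import itertools
import math


def crf_marginals(unary, pair):
    """Per-label marginals of a pairwise CRF over binary labels.

    unary : list of per-label logits (e.g. produced by a shared backbone)
    pair  : dict {(i, j): w_ij} with i < j, one symmetric weight per label pair
    Assumed model: p(y) is proportional to
    exp(sum_i unary[i]*y_i + sum_{i<j} w_ij*y_i*y_j).
    """
    configs = list(itertools.product((0, 1), repeat=len(unary)))
    scores = []
    for y in configs:
        s = sum(u * yi for u, yi in zip(unary, y))
        s += sum(w * y[i] * y[j] for (i, j), w in pair.items())
        scores.append(s)
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    z = sum(weights)
    return [
        sum(w for y, w in zip(configs, weights) if y[k] == 1) / z
        for k in range(len(unary))
    ]


# Hypothetical logits for three complications and three pairwise weights.
unary = [-1.0, 0.0, 0.5]
independent = crf_marginals(unary, {(0, 1): 0.0, (0, 2): 0.0, (1, 2): 0.0})
coupled = crf_marginals(unary, {(0, 1): 1.0, (0, 2): 1.0, (1, 2): 1.0})
print(independent)
print(coupled)
```

With all pairwise weights set to zero the head reduces to independent sigmoid outputs; positive weights raise each label's marginal probability when the other labels are likely, which is one way to encode an assumed shared disease mechanism.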

20

Overinflation and overconcentration: why Cauchy perturbation kernels are the right choice for ABC-SMC

Sturrock, M.; Shahrezaei, V.

2026-07-09 systems biology 10.64898/2026.06.24.734205 medRxiv

Top 2%

0.1%

Approximate Bayesian computation sequential Monte Carlo (ABC-SMC) propagates its particles with a perturbation kernel, and with the standard Normal kernel it degrades sharply as the parameter dimension grows, a failure usually attributed to dimension itself. We show instead that it is governed by the quality of the summary statistics, with dimension entering only through a separate and milder mechanism, and that the two must act together for the Normal kernel to break. The first ingredient is covariance overinflation: the kernel covariance, estimated from the particle cloud, overshoots the true posterior covariance by a factor set by information loss in the summary statistics. We derive this overscaling factor in closed form for a Gaussian model with sufficient statistics and show that it stays modest at any dimension, shrinking toward its baseline value as the tolerance tightens; the extreme values seen in practice (of order 10³) are a signature of insufficient summaries, not of dimension. The second ingredient is perturbation overconcentration: the normalised Normal step size concentrates around one as the dimension grows, so every proposal overshoots by the same factor. Either ingredient alone is harmless; only their combination breaks the Normal kernel. A Cauchy kernel (multivariate t with one degree of freedom) removes the concentration, keeping a positive acceptance rate under arbitrary overscaling at a bounded worst-case cost of 1.87× in expected squared jump distance. In a Metropolis-Hastings framework we derive closed-form acceptance rates for both kernels that illustrate the advantage of the Cauchy kernel in this limit. A series of full ABC-SMC computational experiments on five problems at d = 12, including a hierarchical gene-expression model, shows the Cauchy kernel reducing the sliced Wasserstein distance to the reference posterior by factors of up to 50 with the same simulation budget. 
Since the summary statistics are commonly insufficient for the models that require ABC, overinflation is structural and the Cauchy perturbation kernel is the right default for problems in higher dimensions.
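
The overconcentration effect described in this abstract can be checked numerically with a short sketch. The code below is not from the paper: it simply draws standard Normal steps and multivariate Cauchy steps (a multivariate t with one degree of freedom, built here as a Normal vector divided by the absolute value of an independent standard Normal scalar) and compares the spread of the step length divided by sqrt(d). The sample size, the dimensions tried, and the helper names are assumptions chosen for illustration.

```python
import numpy as np


def normalised_step_sizes(d, n, rng):
    """Return ||step|| / sqrt(d) for n Normal and n multivariate Cauchy steps."""
    z = rng.standard_normal((n, d))
    normal_r = np.linalg.norm(z, axis=1) / np.sqrt(d)
    # Multivariate t with 1 degree of freedom: z / sqrt(chi2_1) = z / |u|,
    # with u an independent standard Normal scalar per draw.
    u = rng.standard_normal(n)
    cauchy_r = normal_r / np.abs(u)
    return normal_r, cauchy_r


def iqr(x):
    """Interquartile range, a spread measure that exists for heavy tails."""
    q25, q75 = np.percentile(x, [25, 75])
    return q75 - q25


rng = np.random.default_rng(0)
for d in (2, 12, 100):
    normal_r, cauchy_r = normalised_step_sizes(d, 20000, rng)
    print(f"d={d:3d}  Normal IQR={iqr(normal_r):.3f}  Cauchy IQR={iqr(cauchy_r):.3f}")
```

The Normal spread shrinks roughly like 1/sqrt(d), so in high dimension every Normal proposal has nearly the same length, whereas the Cauchy step lengths stay widely spread at every dimension because of the scalar divisor.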