IFAC-PapersOnLine — Latest Matching Preprints

1

Physics-Informed Neural Networks for Parameter Recovery in the Repressilator Oscillatory Model

Casajuana, B.; Casals-Franch, R.; Lopez Garcia de Lomana, A.; Marti-Puig, P.; Villa-Freixa, J.

2026-05-15 bioinformatics 10.64898/2026.05.12.724679 medRxiv

Top 0.1%

1.9%

Parameter estimation in nonlinear biological dynamical systems is a difficult inverse problem because the governing equations are often stiff or oscillatory, the data are sparse and noisy, and the objective landscape is non-convex. Physics-informed neural networks (PINNs) offer an alternative to purely simulation-based calibration by representing state trajectories with neural networks while penalizing violations of the governing equations. This paper studies the empirical reliability of PINNs for recovering the parameters of the repressilator, a synthetic genetic oscillator formed by three cyclically repressive genes. We use synthetic time-series generated from the standard ordinary differential equation model and train inverse PINNs to estimate the production parameter β and the Hill coefficient n. The study varies observation noise, the subset of observed repressors, sampling density, and the initial parameter guesses, and compares stable and oscillatory regimes. The results show that PINNs can reconstruct trajectories accurately when the model structure is correct and all three repressors are observed, but parameter recovery is more fragile than trajectory fitting. Noise, sparse sampling, unobserved variables, and unfavorable initial guesses increase the risk of biased estimates. The stable regime is easier to reconstruct, whereas the oscillatory regime provides richer information but also exposes optimization sensitivity. These findings support PINNs as a useful reverse-engineering tool for small gene-regulatory ODE models, while highlighting the need for repeated runs, uncertainty reporting, and experimental designs that improve identifiability.
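
For orientation, the kind of forward model this abstract refers to can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the common dimensionless, protein-only form of the repressilator, dx_i/dt = beta / (1 + x_{i-1}^n) - x_i, which the abstract does not spell out, and the names `repressilator_rhs`, `rk4_step`, and `simulate` are hypothetical. It shows the two regimes the abstract contrasts: a stable one (here beta = 2, n = 2) and an oscillatory one (here beta = 20, n = 3).

```python
def repressilator_rhs(x, beta, n):
    """Right-hand side of an assumed protein-only repressilator.

    Protein i is produced at rate beta / (1 + x_{i-1}^n) and decays at unit rate;
    x[i - 1] with i = 0 wraps around to x[2], which closes the repression cycle.
    """
    return [beta / (1.0 + x[i - 1] ** n) - x[i] for i in range(3)]


def rk4_step(x, dt, beta, n):
    """One classical fourth-order Runge-Kutta step."""
    k1 = repressilator_rhs(x, beta, n)
    k2 = repressilator_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k1)], beta, n)
    k3 = repressilator_rhs([xi + 0.5 * dt * k for xi, k in zip(x, k2)], beta, n)
    k4 = repressilator_rhs([xi + dt * k for xi, k in zip(x, k3)], beta, n)
    return [
        xi + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
        for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
    ]


def simulate(x0, beta, n, t_end=50.0, dt=0.01):
    """Return the list of states visited from x0, sampled every dt time units."""
    x = list(x0)
    trajectory = [tuple(x)]
    for _ in range(int(round(t_end / dt))):
        x = rk4_step(x, dt, beta, n)
        trajectory.append(tuple(x))
    return trajectory


if __name__ == "__main__":
    stable = simulate((1.0, 2.0, 3.0), beta=2.0, n=2)
    oscillatory = simulate((1.0, 2.0, 3.0), beta=20.0, n=3)
    print("stable end state:", stable[-1])
    print("oscillatory end state:", oscillatory[-1])
```

Synthetic training data of the kind described would then be obtained by subsampling such a trajectory and adding observation noise; the noise model and sampling schedule are left out here because the abstract does not specify them.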

2

Neural Network Guided Calibration for Fast Virtual Twin Generation in Cardiovascular ODE Models

Cabeleira, M. T.; Ray, S.; Ovenden, N.; Diaz-Zuccarini, V.

2026-05-08 physiology 10.64898/2026.05.05.722845 medRxiv

Top 0.1%

0.7%

Calibration of closed-loop lumped-parameter cardiovascular models remains a major bottleneck for scalable digital-twin generation because inverse estimation is ill-conditioned and typically requires computationally expensive iterative forward simulation. This study investigates whether a supervised neural network (NN) can provide a fast inverse estimator for a paediatric sepsis cardiovascular ODE model by learning a direct mapping from prescribed haemodynamic target vectors to calibrated parameter sets. Training data are generated by sampling model parameters at random, forward-simulating the closed-loop system to steady state, and pairing the resulting target summaries with the corresponding parameters; the same target definitions and evaluation populations are used throughout for consistency. We evaluate NN inference by forward re-simulation to steady state and benchmark performance against a simulator-constrained calibration reference (Embedded Gradient Descent, EGD) using relative-error statistics, distributional similarity of achieved outputs and inferred parameters (median shift, IQR ratio, Wasserstein distance, KS statistic), and target-space localisation of parameter-space disparity (cosine distance). The NN reproduces the prescribed targets with predominantly small errors for most samples, while the largest discrepancies are confined to a well defined set of target configurations that also yield high residuals under the reference method, indicating feasibility limits of the target/model combination. Overall, NN-guided calibration provides a computationally efficient accelerator for virtual-twin generation and target-space screening, with simulator-based refinement and forward re-simulation retained to handle infeasible regimes and enforce mechanistic plausibility.

3

Efficient Stochastic Trace Generation for Transcription

Ferdowsi, A.; Fuegger, M.; Nowak, T.

2026-05-08 bioinformatics 10.64898/2026.05.05.722871 medRxiv

Top 0.1%

0.6%

Bursty transcription in single cells typically produces over-dispersed, skewed, and sometimes heavy-tailed expression distributions that are explained by two-state Markov models of the promoters. While the gold standard for simulation is exact stochastic sampling with Gillespie's algorithm, obtaining thousands of timed traces is computationally costly. Surrogate models based on stochastic differential equations (SDEs) are widely used to speed up this simulation process. An example is the Chemical Langevin Equation based on Gaussian noise, which, however, does not capture heavy-tailed noise. In this work, we present a unified SDE framework that combines deterministic drift, Gaussian fluctuations, and additive sporadic jumps of arbitrary distributions, and provide an open-source Python implementation, bcrnnoise. The framework subsumes standard surrogate models and allows for vectorized generation of batches of transcription traces. We assess computational speed and accuracy of common surrogate models along with new models, showing that high accuracy can be obtained while reducing computational cost by up to two orders of magnitude.
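
As a point of reference for the "gold standard" mentioned above, exact stochastic simulation of a two-state promoter can be sketched in a few lines. This is a generic textbook-style sketch, not the bcrnnoise API: the function name `gillespie_telegraph`, the rate names, and the parameter values are all assumptions made for illustration. The promoter switches between OFF and ON, mRNA is produced only in the ON state, and each mRNA decays independently.

```python
import random


def gillespie_telegraph(k_on, k_off, k_tx, k_deg, t_end, rng):
    """Exact stochastic simulation of an assumed two-state (telegraph) promoter.

    Returns the event times and the mRNA count after each event.
    """
    t, promoter_on, mrna = 0.0, False, 0
    times, counts = [0.0], [0]
    while True:
        # Keep only reactions with a positive propensity in the current state.
        reactions = []
        if promoter_on:
            reactions.append(("switch_off", k_off))
            reactions.append(("transcribe", k_tx))
        else:
            reactions.append(("switch_on", k_on))
        if mrna > 0:
            reactions.append(("degrade", k_deg * mrna))
        total = sum(rate for _, rate in reactions)

        # Waiting time to the next event is exponential with rate `total`.
        t += rng.expovariate(total)
        if t > t_end:
            break

        # Pick which reaction fires, with probability proportional to its rate.
        names = [name for name, _ in reactions]
        rates = [rate for _, rate in reactions]
        event = rng.choices(names, weights=rates)[0]
        if event == "switch_on":
            promoter_on = True
        elif event == "switch_off":
            promoter_on = False
        elif event == "transcribe":
            mrna += 1
        else:
            mrna -= 1
        times.append(t)
        counts.append(mrna)
    return times, counts


if __name__ == "__main__":
    trace_times, trace_counts = gillespie_telegraph(
        k_on=0.1, k_off=0.5, k_tx=20.0, k_deg=1.0, t_end=200.0, rng=random.Random(1)
    )
    print(len(trace_times), "events; max mRNA count:", max(trace_counts))
```

Every event is simulated one at a time, which is why generating thousands of long traces this way is slow and why batched SDE surrogates are attractive.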

4

Model-supported patient stratification using multi-objective synergy optimization in combination therapy

Gevertz, J. L.; Kareva, I.

2026-05-07 pharmacology and toxicology 10.64898/2026.05.04.722754 medRxiv

Top 0.2%

0.5%

The challenge of stratifying patients for combination therapy is both technically demanding and clinically crucial. In previous work, we introduced a multi-objective optimization framework for identifying optimally synergistic combination protocols that are robust to competing definitions of additivity. This manuscript extends this methodology to quantify how inter-individual variability in drug sensitivity influences the combination doses that optimally balance the competing objectives of synergy of efficacy and synergy of potency (a proxy measure of toxicity). For this methodology, we introduce a voxel-based stratification approach to characterize individuals (model parameterizations) into subgroups based on sensitivity to each drug as a monotherapy and in combination. As a case study, we apply the method to a preclinical dataset of murine response to the combination of an immune checkpoint inhibitor and an antiangiogenic agent. We demonstrate that the algorithm can quantify how the robustly optimal combination therapies vary across different treatment response subgroups and how the algorithm can identify subpopulations for which no meaningfully efficacious combination exists. As applying the methodology requires knowledge of specific parameter values for which measurable biomarkers may be unavailable, we also propose an initiation protocol that permits identification of the parameters necessary to place an individual in a subgroup. This methodology is a step in the direction of determining the right combination therapy for a subgroup and finding the right subgroup for an existing therapy.

5

COCOA.jl: A Julia package for high-performance analysis of concordance and kinetic modules in biochemical networks

Schaffranke, A.; Kueken, A.; Nikoloski, Z.

2026-05-08 systems biology 10.64898/2026.05.05.722856 medRxiv

Top 0.2%

0.5%

Summary: Recent advances in the analysis of biochemical networks have contributed to the identification of their modular structure based on the concept of multi reaction dependencies and kinetic coupling of reaction rates (Kuken et al., 2022; Langary et al., 2025). Existing implementations of the algorithms to study modular structure do not scale well with the size of the networks, prohibiting their application to genome-scale networks. Here, we introduce COCOA.jl, a multithreaded Julia package for identification of concordant and kinetic modules, with applications in the study of concentration robustness. Availability and implementation: COCOA.jl is implemented in Julia 1.12.2 and is freely available under the MIT license at https://github.com/antoniofranky/COCOA.jl. It runs on Linux, macOS, and Windows; installation is supported via the Julia package manager. COCOA.jl can be called from Python via JuliaCall. Contact: antonschaf@posteo.de; ankueken@uni-potsdam.de

6

A direct forcing immersed boundary method for biofluid simulations using a non-linear rotation free shell model on unstructured grids

Kim, T.; Malipeddi, A. R.; Capecelatro, J.; Figueroa, A.

2026-05-19 bioengineering 10.64898/2026.05.16.725689 medRxiv

Top 0.2%

0.5%

Thin structures such as heart valves and aortic dissection flaps interact dynamically with blood flow in human vessels. Their flexibility and capacity for large deformations generate complex, highly transient hemodynamic patterns over the cardiac cycle. Accurately resolving these interactions remains challenging for conventional boundary-fitted fluid-structure interaction approaches. We present an immersed boundary method for simulating thin structures in incompressible flow on unstructured grids. The method couples a stabilized finite element fluid solver with a nonlinear, rotation-free shell formulation through a direct forcing immersed boundary approach. The framework supports both weak (explicit) and strong (implicit) time-coupling strategies, enabling stable simulations over a wide range of solid-to-fluid density ratios. Hydrodynamic forces acting on thin structures are computed from fluid solutions sampled on both sides of the structure, allowing accurate force reconstruction for zero-thickness shells. To our knowledge, this is the first immersed boundary formulation that couples an unstructured finite element fluid solver with a two-dimensional, rotation-free shell model to simulate interactions between thin structures and incompressible flow. Fluid-structure coupling is achieved using predefined finite element shape functions, which provide consistent projection between Eulerian and Lagrangian fields without additional interpolation procedures. The framework is validated using three-dimensional benchmark problems involving thin structures. Then, a valve-like model is used to compare strong and weak coupling strategies. Finally, the method is applied to an idealized type-B aortic dissection model. The proposed approach is implemented within the open-source software CRIMSON, a finite element platform for cardiovascular simulation.

7

A Differentiable dFBA Simulator for Scalable Bayesian Inference over Microbial Metabolic Models

Diederen, T.; Merzbacher, C.; Patz, M.

2026-05-08 bioinformatics 10.64898/2026.05.05.722888 medRxiv

Top 0.2%

0.4%

Medium optimisation for bioprocess design remains challenging and costly: fermentation recipes typically contain ten or more components, the design space expands combinatorially as ingredients are added, and each batch experiment requires over 24 hours. High-throughput 96-well plate screening can reduce experimental cost, but extracting actionable predictions from growth curves requires a mechanistic model that links medium composition to cellular metabolism. In this paper, we present a differentiable simulator for dynamic flux balance analysis (dFBA) that enables scalable Bayesian inference over microbial metabolic models. A distinguishing feature is that inference is driven entirely by OD600 measurements, a simple optical proxy for biomass, without substrate or product assays; internal fluxes, substrate consumption, and secreted metabolite profiles are recovered as latent variables constrained by the metabolic network stoichiometry. We resolve the core differentiability barrier of classical dFBA by reformulating the per-step linear or quadratic programme (LP/QP) as a smooth continuous ODE (the Relaxed Interior-Point ODE, R-iODE), establishing the mathematical framework for end-to-end gradient propagation through long fermentation trajectories in JAX; full gradient validation is ongoing. The result is a framework for principled inference over thousands of batch fermentations, providing a path toward model-guided medium design, cross-strain parameter transfer, and scale-up prediction from plate data.

8

Efficient Bayesian inference for ordinary differential equation models from experimental data with uncertain measurement times

Vanhoefer, J.; Nakonecnij, V.; Binder, N.; Hasenauer, J.

2026-05-13 systems biology 10.64898/2026.05.09.724053 medRxiv

Top 0.3%

0.3%

Time-resolved measurements are central to calibrating mechanistic dynamical models, but current inference frameworks typically assume that reported measurement times are exact. In practice, actual sampling times may deviate from reported times because of sample-handling delays, imperfect synchronization, or reporting errors. Here, we present a Bayesian framework for parameter inference in ordinary differential equation models that explicitly accounts for uncertainty in measurement times. We formulate latent measurement times as random variables and derive a joint and marginalized posterior. To compute the marginal likelihood efficiently, we augment the original dynamical system with additional state variables that evaluate the required integrals during numerical simulation. This reduces the dimensionality of the estimation problems and allows for efficient and reliable Markov chain Monte Carlo sampling. Across synthetic examples and a published model of carotenoid cleavage in Arabidopsis thaliana, neglecting time uncertainty led to biased estimates and overconfident uncertainty quantification, whereas the proposed marginalized formulation recovered reliable parameter estimates while substantially improving sampling efficiency and scalability. These results identify measurement time uncertainty as an important source of variability in dynamic modeling and establish posterior marginalization as a practical strategy for robust mechanistic inference.
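
One plausible reading of the state-augmentation idea can be sketched on a toy problem. Everything specific here is an assumption rather than the authors' formulation: the model is exponential decay dx/dt = -k x, the measurement noise and the time uncertainty are both Gaussian, and the function names are hypothetical. The marginal likelihood of one observation y with reported time t_rep is the integral over the latent time tau of p(y | x(tau)) p(tau | t_rep). Adding a state I with dI/dt = p(y | x(t)) p(t | t_rep) lets the ODE solver accumulate that integral while it simulates x; a direct quadrature on the closed-form solution serves as a cross-check.

```python
import math


def normal_pdf(z, mean, sd):
    """Density of a normal distribution with the given mean and standard deviation."""
    return math.exp(-0.5 * ((z - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def marginal_likelihood_augmented(y, t_rep, k, x0=1.0, sigma_y=0.05, sigma_t=0.2, n_steps=2200):
    """Integrate x and the extra state I together with RK4; I(t_end) is the marginal likelihood.

    Latent times are restricted to [0, t_rep + 6 * sigma_t], a truncation assumed negligible here.
    """
    def rhs(t, x):
        return -k * x, normal_pdf(y, x, sigma_y) * normal_pdf(t, t_rep, sigma_t)

    t_end = t_rep + 6.0 * sigma_t
    h = t_end / n_steps
    t, x, integral = 0.0, x0, 0.0
    for _ in range(n_steps):
        k1x, k1i = rhs(t, x)
        k2x, k2i = rhs(t + 0.5 * h, x + 0.5 * h * k1x)
        k3x, k3i = rhs(t + 0.5 * h, x + 0.5 * h * k2x)
        k4x, k4i = rhs(t + h, x + h * k3x)
        x += h / 6.0 * (k1x + 2.0 * k2x + 2.0 * k3x + k4x)
        integral += h / 6.0 * (k1i + 2.0 * k2i + 2.0 * k3i + k4i)
        t += h
    return integral


def marginal_likelihood_quadrature(y, t_rep, k, x0=1.0, sigma_y=0.05, sigma_t=0.2, n_points=4001):
    """Trapezoidal reference using the closed-form solution x(tau) = x0 * exp(-k * tau)."""
    lo, hi = 0.0, t_rep + 6.0 * sigma_t
    h = (hi - lo) / (n_points - 1)
    total = 0.0
    for j in range(n_points):
        tau = lo + j * h
        value = normal_pdf(y, x0 * math.exp(-k * tau), sigma_y) * normal_pdf(tau, t_rep, sigma_t)
        total += value * (0.5 if j in (0, n_points - 1) else 1.0)
    return total * h


if __name__ == "__main__":
    print(marginal_likelihood_augmented(0.40, 1.0, 1.0))
    print(marginal_likelihood_quadrature(0.40, 1.0, 1.0))
```

The augmented version needs no closed-form solution, so the same pattern would carry over to models that can only be simulated numerically.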

9

Homeostatic feedback model of energy metabolism with adaptive enzyme levels exhibits problem solving behavior

de Baat, A.; Levin, M.

2026-05-11 systems biology 10.64898/2026.05.07.721661 medRxiv

Top 0.3%

0.2%

Metabolic networks are typically viewed as homeostatic systems that stabilize flux, energy charge, redox balance, and metabolite availability under perturbation. However, it remains unclear whether the same feedback architectures that support metabolic robustness can also generate learning-like, experience-dependent adaptation. Here, we develop a coarse-grained dynamical model of mammalian energy metabolism to test whether prior perturbation can improve future metabolic responses. The model represents core glucose, glutamine, fatty acid, and oxidative phosphorylation pathways as coupled ordinary differential equations with Michaelis-Menten-type fluxes, product-inhibition feedback, adaptive enzyme-capacity regulation, and explicit ATP costs for enzyme adjustment. Rather than aiming to reproduce quantitative fluxes for a specific cell type, the framework is designed to expose how metabolic feedback, regulatory cost, repeated perturbation, and environmental variability interact. We use this model to ask whether adaptive enzyme regulation enables improved recovery after repeated challenges, whether such effects depend on energetic control costs, and whether environmental variability broadens or constrains the set of reachable adaptive states. This approach provides a tractable way to investigate how homeostatic metabolic regulation may give rise to experience-dependent metabolic plasticity.

10

Mathematical Modeling of the Canonical Aryl Hydrocarbon Receptor Pathway

Wieland, V.; Blum, T.; Iriady, I.; Reverte-Salisa, L.; Pathirana, D.; Foerster, I.; Weighardt, H.; Hasenauer, J.

2026-05-08 systems biology 10.64898/2026.05.05.722708 medRxiv

Top 0.4%

0.2%

The aryl hydrocarbon receptor (AhR) is a ligand-activated transcription factor involved in xenobiotic sensing, as well as development, immunity, and tissue homeostasis. AhR signaling can proceed through a canonical and non-canonical pathway; the present study focuses on the canonical pathway. While ligand-dependent differences in binding affinities and direct ligand degradation kinetics are well known, and subtle differences in ligand binding can shape downstream signaling, it is still unclear which biochemical reaction steps within the canonical pathway are responsible for distinct ligand-specific transcriptional responses. Here, we developed a mechanistic ordinary differential equation model of the canonical AhR pathway. We calibrated the model to time-resolved qPCR measurements of Cyp1a1 and Ahrr mRNA in mouse bone-marrow-derived macrophages exposed to structurally diverse, environmentally relevant ligands with known immunomodulatory activity (3-methylcholanthrene, indolo[3,2-b]carbazole, and bisphenol A) using global optimization under a heteroskedastic likelihood. To dissect ligand specificity, we evaluated 528 candidate models that allow one or two ligand-involving reaction rate constants to vary. Akaike-based model selection reveals a dominant dynamical regime governed by promoter occupancy and target-gene mRNA synthesis, indicating that ligand-specific transcriptional responses are primarily encoded at the level of transcriptional regulation rather than upstream signaling events. The resulting model is made available in SBML and PEtab formats for reproducibility, and to enable further research into whether ligand-specific effects are conserved or rewired across cell types.
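
The model-selection step can be illustrated with a short sketch. Two things here are assumptions rather than statements from the abstract. First, the count of 528 candidates matches C(32,1) + C(32,2) = 32 + 496, which would correspond to a pool of 32 ligand-involving rate constants; that pool size is inferred from the arithmetic only. Second, the AIC values in the usage example are made up to show how Akaike weights are computed; they are not results from the study.

```python
import math


def n_candidate_models(n_rate_constants):
    """Number of models that free exactly one or exactly two rate constants from a pool."""
    return math.comb(n_rate_constants, 1) + math.comb(n_rate_constants, 2)


def akaike_weights(aic_values):
    """Convert AIC values into normalized Akaike weights.

    Each weight is proportional to exp(-0.5 * (AIC_i - AIC_min)).
    """
    best = min(aic_values)
    relative = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative)
    return [r / total for r in relative]


if __name__ == "__main__":
    print(n_candidate_models(32))
    # Hypothetical AIC values for three candidate models, for illustration only.
    print(akaike_weights([100.0, 102.0, 110.0]))
```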

11

Probabilistic Cardiac Digital Twins for Robust Patient-Specific Modeling

Giovanis, D. G.; Zhang, K.; Tso, J.; Maggioni, M.; Kevrekidis, I. G.; Trayanova, N.

2026-05-12 bioengineering 10.64898/2026.05.07.723610 medRxiv

Top 0.4%

0.2%

Uncertainty quantification (UQ) in computational heart models is essential for reliable cardiac digital twins (DTs) in personalized medicine, yet remains challenging. Traditional Monte Carlo and stochastic Galerkin methods often become impractical in the high-dimensional, nonlinear state variable and parameter spaces of cardiac electrophysiology and mechanics. This article introduces a framework for learning a joint probability density over cardiac observables and model parameters, enabling the characterization of statistical dependencies across a large number of variables in patient-specific cardiac DTs. By sampling from this density and conditioning on available data, useful predictive distributions can be constructed, allowing uncertainty to be propagated through the model and quantified in terms of variability. Conditional regression can then be performed directly on this learned density, enabling systematic exploration of interdependencies among observables for both predictive inference and model design. The statistical methodology adopts a geometry-aware generative learning framework, recently introduced by the authors, that decouples the learning of data geometry from sampling. First it identifies a low-dimensional latent representation that captures the intrinsic structure of the data and its multiscale geometric features. A stochastic differential equation is then formulated directly in the low-dimensional latent space to generate samples efficiently; these are subsequently mapped back to the high-dimensional space of cardiac states and parameters through a smooth lifting operator. We demonstrate the approach on a ventricular arrhythmia prediction benchmark, where the learned joint probability density enables the construction of predictive distributions over key parameters (e.g., conductivities, fibrosis patterns) through sampling and conditioning. 
This enables uncertainty to be propagated and quantified through sampling and conditioning on the learned joint density, with substantially fewer model evaluations than conventional UQ methods.

12

Denoised MDS-UPDRS Part-III Scores Yield New Patterns of Progression Heterogeneity in Early Stage Parkinson's Disease

Koss, J.; Tinaz, S.; Tagare, H.

2026-05-08 bioinformatics 10.64898/2026.05.04.722810 medRxiv

Top 0.4%

0.2%

Parkinson's Disease (PD) Motor Scores (MDS-UPDRS Part III) are quite noisy. This paper proposes a new methodology for processing these scores by first denoising the scores to enhance the underlying progression signal, and then conducting a high-dimensional analysis which does not sum the scores into a total movement score. The analysis gives novel insights into PD progression heterogeneity: it reveals that the heterogeneity is continuously variable rather than clustered into "subtypes" and that the variability is along two easily understood axes. This analysis also resolves some of the discrepancies in previously reported progression subtypes. Finally, the analysis reveals that patient-specific progression cannot be predicted from baseline using only MDS-UPDRS Part III scores.

13

Fiber dispersion in the right ventricle: A comparison of constitutive neural network predictions with experimental data

Ingalkar, P.; Kakaletsis, S.; Rausch, M.; Kuhl, E.; Martonova, D.

2026-05-14 bioengineering 10.64898/2026.05.11.724139 medRxiv

Top 0.4%

0.2%

The mechanical behavior of right ventricular (RV) myocardium is governed by its anisotropic microstructure, yet constitutive models that account for fiber dispersion and enable reliable parameter identification remain limited. In this study, we propose a physics-embedded constitutive neural network framework for automated discovery of strain energy functions and microstructural parameters from experimental data. The model is formulated within an incompressible, orthotropic hyperelastic setting using invariant-based representations. Fiber, sheet, and normal directions are incorporated through a rotated structural basis, and dispersion effects are modeled using a generalized structure tensor approach. The framework is trained on multi-axial mechanical data from ovine RV myocardium, including uniaxial tension-compression and simple shear tests. We investigate two training scenarios: (i) full datasets containing both tensile and compressive regimes and (ii) datasets restricted to tensile loading. In both cases, the model accurately reproduces the measured stress-strain responses and identifies sparse, interpretable constitutive models which involve isotropic, anisotropic, and coupling invariants. However, the identifiability of microstructural parameters strongly depends on the available loading conditions. While tensile-only data yield higher predictive accuracy, they result in non-unique or biased estimates of fiber dispersion. In contrast, inclusion of compressive data enables consistent identification of dispersion parameters by separating fiber and matrix contributions. These results highlight the importance of multi-axial loading data for robust parameter identification and demonstrate the capability of constitutive neural network-based approaches for data-driven modeling of anisotropic soft tissues.

14

Spatiotemporal Modeling of GPCR Signaling: The Role of Endosomal Dynamics and Receptor Recycling

Weckel, C.; Gourdon, J.; Darrigade, L.; Jugnarain, V.; Crepieux, P.; Reiter, E.; Jean-Alphonse, F.; Haar, S.; Yvinec, R.

2026-05-04 systems biology 10.64898/2026.04.29.721559 medRxiv

Top 0.4%

0.2%

Cells communicate via extracellular ligands, such as hormones, which bind to plasma membrane receptors and trigger intracellular signaling cascades. G Protein-Coupled Receptors (GPCRs) exemplify this mechanism by initiating signaling both at the cell surface and from intracellular compartments such as endosomes. The kinetics and spatial localization of these signals are critical determinants of cellular responses, yet receptor trafficking (including internalization, endosomal sorting, and recycling) remains a pivotal but often overlooked component of theoretical GPCR models. In this study, we present a mathematical framework that integrates receptor trafficking and signaling compartmentalization into generic GPCR dynamic models. Using a compartmentalized approach based on systems of ordinary differential equations (Chemical Reaction Networks), we analyze how receptor internalization and recycling modulate ligand-induced responses. Our results show that the balance between plasma membrane and endosomal signaling can significantly enhance or diminish ligand efficacy. Calibrated with high-throughput kinetic data, our model offers a refined tool for ligand pharmacological characterization and advances the understanding of GPCR signaling spatial organization.

15

Spurious correlation inflates performance in single-cell perturbation prediction

Nicol, P. B.; Shivakumar, S.; Irizarry, R.

2026-05-12 bioinformatics 10.64898/2026.05.07.723486 medRxiv

Top 0.4%

0.2%

The increasing number of computational methods designed to predict the effects of genetic perturbations on cellular gene expression profiles has led to a need for rigorous evaluation metrics. Recent benchmarking studies rely on correlation or cosine similarity of differential expression relative to a shared population of control cells. We show that these metrics are systematically inflated by statistical bias induced by reusing the same control population to define both quantities being compared. As a result, even non-informative methods can appear to perform well, particularly in datasets with limited numbers of control cells. Reanalysis of published datasets using a simple control-splitting procedure that removes this bias leads to a substantial reduction in performance previously attributed to biological signal.
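
The mechanism described above can be reproduced in a toy simulation. This is an illustrative sketch under stated assumptions, not the authors' benchmark or their exact splitting procedure: genes have a zero baseline with unit-variance noise, the perturbation has no true effect, and the "prediction" is simply the true baseline, so it carries no information about the perturbation. The function names and the sizes (500 genes, 20 control cells, 200 perturbed cells) are hypothetical.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def mean_profile(n_cells, n_genes, rng):
    """Per-gene mean over n_cells simulated cells (zero baseline, unit-variance noise)."""
    return [sum(rng.gauss(0.0, 1.0) for _ in range(n_cells)) / n_cells for _ in range(n_genes)]


def shared_vs_split_control(n_genes=500, n_ctrl=20, n_pert=200, seed=0):
    """Correlation of differential expression with a shared control vs. split controls."""
    rng = random.Random(seed)
    ctrl_a = mean_profile(n_ctrl // 2, n_genes, rng)   # first half of the control cells
    ctrl_b = mean_profile(n_ctrl // 2, n_genes, rng)   # second half of the control cells
    ctrl_all = [(a + b) / 2.0 for a, b in zip(ctrl_a, ctrl_b)]
    observed = mean_profile(n_pert, n_genes, rng)      # perturbed cells, no true effect
    predicted = [0.0] * n_genes                        # non-informative prediction

    # Shared control: the same noisy control mean is subtracted from both sides.
    shared = pearson(
        [o - c for o, c in zip(observed, ctrl_all)],
        [p - c for p, c in zip(predicted, ctrl_all)],
    )
    # Split control: each side uses a disjoint half of the control cells.
    split = pearson(
        [o - c for o, c in zip(observed, ctrl_a)],
        [p - c for p, c in zip(predicted, ctrl_b)],
    )
    return shared, split


if __name__ == "__main__":
    print(shared_vs_split_control())
```

With a shared control, the sampling noise of the control mean appears on both sides of the comparison and dominates the correlation; with disjoint control halves, that common term disappears and the correlation falls toward zero. The effect is stronger when there are few control cells, in line with the abstract.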

16

Muscle-driven hand simulations emphasize the critical role of the extensor mechanism

Carvajal, M.; Murray, W. M.; Miller, L. E.; Firouzabadi, P.; Rizzoglio, F.; Darbhe, V.; Cotton, J.

2026-05-14 bioengineering 10.64898/2026.05.11.723556 medRxiv

Top 0.5%

0.2%

Biomechanical simulations of complex hand motions remain scarce, due to challenges that span computation and data acquisition. Using a computer vision-based motion capture approach, a 23-degree of freedom musculoskeletal model, and direct collocation optimization, we performed muscle-driven simulations to track hand kinematics from 7 participants performing American Sign Language gestures. While proximal joints were tracked accurately, interphalangeal joint tracking was significantly worse, with a consistent flexion bias. Modifications to finger extensor muscle paths that incorporated the dual-inserting nature of the extensors improved accuracy, suggesting better representation of extensor force distribution across distal joints may be necessary for accurate hand simulations.

17

Ensemble kinetic modelling links residual enzyme activity to clinical symptoms in mitochondrial β-oxidation defects

Odendaal, C.; Krebs, O.; Bakker, B. M.

2026-05-08 systems biology 10.64898/2026.05.05.722902 medRxiv

Top 0.5%

0.1%

The mitochondrial fatty acid β-oxidation (mFAO) is an important source of energy when carbohydrate stores are depleted. It is also involved in many diseases, including inherited fatty-acid oxidation deficiencies (mFAODs). Patients with the same genetic variant often present with clinically heterogeneous phenotypes, but the mechanisms contributing to this heterogeneity are poorly understood. To investigate the underlying pathophysiology of different mFAODs, we constructed a computational model of mFAO in human liver, based on experimentally determined enzyme kinetics. A recognised, but seldom addressed challenge in metabolic modelling is the substantial uncertainty about kinetic parameter values. Whereas experimental values of some mFAO parameters are quite reproducible, others vary by up to four orders of magnitude between different reports. To address this, we generated an ensemble of kinetic models, each with the same reaction stoichiometry and rate equations, but different kinetic parameters, sampled from distributions of literature-derived values. We also comprehensively report these values and the arguments based on which they were evaluated. The resulting models were validated against available flux data, yielding a final ensemble of 51 valid models. These models recapitulate recent findings about the accumulation of medium-chain acyl-CoAs and the concomitant depletion of free CoA (CoASH) in medium-chain acyl-CoA dehydrogenase deficiency. We applied the ensemble to a set of known mFAODs, separating them into long-chain (LC-) and short-/medium-chain (S/MC-)mFAODs. The residual activity at which clinical symptoms are known to occur corresponded well with the residual activity in the model at which pathway flux was significantly decreased in LC-mFAODs. Residual activity in S/MC-mFAODs correlated less strongly with pathway flux, but these deficiencies did show a combined flux- and CoASH-reduction effect. This comparison is of importance to researchers and clinicians, as it identifies possible ways in which insights about one mFAOD may be applied to another based on shared biochemical properties.

Author Summary: When building computer models of metabolic pathways, it is typical to take the "best" experimental data and use that as input into the model. However, especially when working with human cells, ethical and practical constraints often mean that even the "best" experimental data is still subject to substantial uncertainty. We explicitly modelled the uncertainty about the inner workings of fat burning (fatty acid oxidation). The resulting model is known as an "ensemble". The ensemble predicts ranges instead of single outcomes, allowing us to assess the confidence level of our predictions. We assess a set of inherited diseases (enzyme deficiencies), simulating them at different levels of severity with the ensemble. We find that the model does a good job of predicting the severity of the deficiencies at which symptoms will occur. It also allows us to identify a key difference between two subgroups within this group of deficiencies: long-chain and medium-/short-chain, depending on the size of the fats being metabolised. The long-chain variant is predicted to correlate most straightforwardly with the severity of the deficiencies, due to its effect on energy generation. Medium-/short-chain deficiencies, in contrast, have more complex consequences.

18

Synthetic Data Generation and Nonparametric Techniques for Assessing Multivariate Similarity to Address Small-Sample Size Challenges

Heine, J.; Fowler, E.; Eschrich, S. A.; Schell, M.

2026-05-07 bioinformatics 10.64898/2026.05.04.722226 medRxiv

Top 0.7%

0.1%

Data modeling in biomedical research often operates in the small-sample regime, where the number of observations is small relative to the data dimensionality; the detrimental effects of limited sample sizes are well documented in cancer studies. Synthetic data offers a potential solution to data shortfalls provided that the data generated is an adequate facsimile of the underlying distribution; assessing the adequacy of such synthetic data remains an open problem. In this work, we evaluate a synthetic generator proposed previously. The generator applies a series of transformations to the observed data to accommodate the small sample size, resulting in an uncoupled representation where uncorrelated marginal distributions are modeled with optimized univariate kernel density estimation. In this report, (1) we develop a nonparametric method for assessing multivariate similarity based on the Cramér-Wold theorem and random projection testing, (2) investigate when the absence of bivariate correlation approximates independence in a non-normal setting, and (3) evaluate artifacts induced by data compression. The presentation is primarily methodological; low-dimensional data were used so each stage of the generation process could be analyzed explicitly. A formal testing framework was developed by comparing random projection level outcomes with a two-sample test, modeling these outcomes as Bernoulli trials, aggregating replicate outcomes within each projection direction, and pooling outcomes across many directions, yielding a scalable standardized normal test statistic. The key innovation was decoupling the two-sample test significance level from that governing finalized normal inference. We showed the same projection framework also evaluates the full multivariate covariance structure. The generator produced high-fidelity multivariate synthetic data when the absence of bivariate correlation approximates independence in the non-normal setting; in highly compressed data, residual modes were best modeled as normally distributed regardless of their intrinsic distributional form. Ongoing work includes applying these methods to higher-dimensional, diverse data.
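
The random projection test described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `random_projection_z`, the choice of the Kolmogorov-Smirnov test as the two-sample test, and all parameter values are assumptions made for illustration. The sketch also omits the replicate aggregation within each direction and treats the per-direction outcomes as independent Bernoulli trials, which is an idealization because every projection reuses the same data.

```python
import numpy as np
from scipy.stats import ks_2samp


def random_projection_z(x, y, n_directions=200, alpha=0.05, seed=0):
    """Pooled z-statistic from two-sample tests on random 1-D projections.

    Hypothetical helper: x and y are (n_samples, dim) arrays.
    Returns (z, fraction_of_rejections).
    """
    rng = np.random.default_rng(seed)
    dim = x.shape[1]
    # Normalised Gaussian vectors are uniformly distributed on the unit sphere.
    directions = rng.standard_normal((n_directions, dim))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)

    rejections = 0
    for u in directions:
        # Each projection reduces the multivariate comparison to a 1-D test.
        p_value = ks_2samp(x @ u, y @ u).pvalue
        rejections += int(p_value < alpha)  # one Bernoulli outcome per direction

    # Under the null, the rejection count is roughly Binomial(n_directions, alpha)
    # (assuming independent outcomes, which is only approximately true).
    expected = n_directions * alpha
    sd = np.sqrt(n_directions * alpha * (1.0 - alpha))
    return (rejections - expected) / sd, rejections / n_directions


# Toy data: one synthetic set from the same distribution, one shifted.
rng = np.random.default_rng(1)
observed = rng.standard_normal((200, 3))
synthetic_good = rng.standard_normal((200, 3))
synthetic_bad = rng.standard_normal((200, 3)) + 3.0

z_good, frac_good = random_projection_z(observed, synthetic_good)
z_bad, frac_bad = random_projection_z(observed, synthetic_bad)
print(f"same distribution: z = {z_good:.2f}, rejected = {frac_good:.2%}")
print(f"shifted:           z = {z_bad:.2f}, rejected = {frac_bad:.2%}")
```

Note that `alpha` only sets the level of each per-projection test; the pooled `z` can then be judged against a separately chosen threshold, which mirrors the decoupling of significance levels that the abstract mentions.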

19

Run or glide: muscles are indifferent while the tendon takes the strain

Gloersen, O.; Lundervold, A.; Werkhausen, A.

2026-05-15 synthetic biology 10.64898/2026.05.15.725315 medRxiv

Top 0.7%

0.1%

Conventional diagonal stride skiing traditionally includes a glide phase, characterised by a period of relatively passive gliding on one ski. While the glide phase may take advantage of low ski-snow friction, it does not exhibit the same whole-cycle mechanical energy fluctuations seen in running or walking on foot. A new sub-technique, known as running style, substantially reduces the glide phase and may alter the role of elastic tissues, making the movement pattern more similar to uphill running on foot in its temporal organisation. We examined knee extensor and plantar flexor muscle-tendon behaviour in eight competitive skiers performing conventional diagonal and running techniques on a treadmill inclined at 10°. Using synchronised ultrasonography, 3D kinematics, ski forces and EMG, we quantified gastrocnemius medialis and vastus lateralis fascicle and muscle-tendon unit (MTU) dynamics in both the running (RUN) and conventional (CON) styles. Shorter glide and total cycle durations during RUN shifted MTU peak length and velocity earlier during the kick phase. Fascicles in both muscles operated at similar velocities across techniques, showing MTU-fascicle decoupling. Vastus lateralis fascicles shortened at higher absolute peak velocities than gastrocnemius in both conditions, while normalised velocities were similar. RUN increased preactivation and advanced EMG timing, while integrated EMG during the kick was lower compared to CON. These findings suggest that, despite large shifts in external mechanics between glide-based and more running-like skiing, elastic tissues may help stabilise fascicle behaviour and preserve a similar contractile strategy across muscles and techniques.

20

Composite Certainty: Addressing Metric Degeneracy in Parameter Inference for Model-Based Diagnostics

Koshe, A.; Sobhani Tehrani, E.; Jalaleddini, K.; Motallebzadeh, H.

2026-05-13 bioengineering 10.64898/2026.05.09.724027 medRxiv

Top 0.7%

0.1%

Quantifying the diagnostic dispersion of inferred parameter distributions is a challenge in uncertainty-aware modeling. Scalar summaries such as credible interval width are topology-blind; fundamentally different posterior morphologies can yield identical scores, obscuring whether a parameter is precisely estimated or merely constrained to a range. We propose a Composite Certainty Framework that addresses this metric degeneracy by aggregating five complementary uncertainty metrics: interquartile range, standard deviation, full width at half maximum, Shannon entropy, and mass width. These metrics are aggregated through non-parametric Borda rank voting into a single, unitless consensus certainty score. Applied to a simulation-based inference pipeline for a finite-element model of the human middle ear tuned to cadaveric acoustic measurements, the framework reveals parameter-specific identifiability profiles invisible to any individual metric. It produces two actionable clinical thresholds: (1) the maximum tolerable measurement noise for reliable parameter recovery, and (2) the minimum simulation budget for posterior convergence. We demonstrated that no single metric captures all aspects of posterior dispersion, as spread-based metrics and entropy diverge systematically for clinically critical parameters, whereas their aggregation produces a consensus reflecting genuine diagnostic certainty. The framework is generalizable to any model-based diagnostic pipeline where the shape of the posterior distribution, not merely its coverage, determines clinical certainty.
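
The Borda aggregation of the five metrics can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: the helper names `dispersion_metrics` and `borda_certainty`, the histogram-based estimates of FWHM and entropy, and the definition of mass width as the shortest interval holding 95% of the samples are all assumptions. The sketch also assumes the posterior samples of every parameter have already been rescaled to a common, unitless range so that ranks are comparable across parameters.

```python
import numpy as np
from scipy.stats import rankdata


def dispersion_metrics(samples, bins=50, mass=0.95):
    """Five dispersion summaries of a 1-D sample (smaller means more certain)."""
    samples = np.sort(np.asarray(samples, dtype=float))
    q25, q75 = np.percentile(samples, [25, 75])
    density, edges = np.histogram(samples, bins=bins, density=True)
    widths = np.diff(edges)
    # FWHM (assumed form): span between the outermost bins at or above half the peak.
    above = np.nonzero(density >= 0.5 * density.max())[0]
    fwhm = edges[above[-1] + 1] - edges[above[0]]
    # Differential Shannon entropy estimated from the histogram density.
    nz = density > 0
    entropy = -np.sum(density[nz] * np.log(density[nz]) * widths[nz])
    # Mass width (assumed form): shortest interval containing `mass` of the samples.
    k = int(np.ceil(mass * samples.size))
    mass_width = np.min(samples[k - 1:] - samples[: samples.size - k + 1])
    return np.array([q75 - q25, samples.std(), fwhm, entropy, mass_width])


def borda_certainty(posteriors):
    """Consensus certainty in [0, 1] per parameter via Borda rank voting."""
    names = list(posteriors)
    table = np.array([dispersion_metrics(posteriors[n]) for n in names])
    n_params, n_metrics = table.shape
    # Within each metric, the widest posterior gets 1 point and the
    # narrowest gets n_params points (ties share the average rank).
    points = np.column_stack(
        [rankdata(-table[:, j]) for j in range(n_metrics)]
    )
    total = points.sum(axis=1)
    # Rescale so 0 = least certain on every metric, 1 = most certain on every metric.
    score = (total - n_metrics) / (n_metrics * (n_params - 1))
    return dict(zip(names, score))


# Toy posteriors on a common [0, 1] scale (illustrative values only).
rng = np.random.default_rng(0)
posteriors = {
    "tight": rng.normal(0.5, 0.02, 5000),
    "broad": rng.normal(0.5, 0.2, 5000),
    "bimodal": np.concatenate(
        [rng.normal(0.2, 0.02, 2500), rng.normal(0.8, 0.02, 2500)]
    ),
}
scores = borda_certainty(posteriors)
for name, value in scores.items():
    print(f"{name}: {value:.2f}")
```

The bimodal toy posterior is included because spread-based metrics and entropy can order it differently relative to the broad unimodal one, which is the kind of disagreement between individual metrics that rank voting is meant to reconcile.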