
Science

American Association for the Advancement of Science (AAAS)

Preprints posted in the last 7 days, ranked by how well they match Science's content profile, based on 429 papers previously published here. The average preprint has a 1.05% match score for this journal, so anything above that is already an above-average fit.

1
Host Genetic Regulation of NLRP3 Inflammasome Cytokines Reveals Immune and Vascular Pathways in HIV

Chung, R.; Chalasani, N. S.; Barbehenn, A. S.; Lundgren, E.; Savur, S.; Shome, S.; Sheikhzadeh, C. H.; Sarvadhavabhatla, S.; Donaire, M. S.; Pae, V.; Chu, X.; Winder, D.; Maguire, C. T.; Topal, S.; Ganesan, A.; Yabes, J. M.; Larson, D. T.; Lalani, T.; Ewers, E. C.; Colombo, R. E.; Dugan, E.; Rathore, U.; Marson, A.; Agan, B. K.; Tomalka, J. A.; Sekaly, R.-P.; Loannidis, N. M.; Lee, S. A.

2026-06-10 hiv aids 10.64898/2026.06.08.26355202 medRxiv
Top 0.5%
18.7%

People with HIV exhibit elevated inflammation and cardiovascular risk despite antiretroviral therapy. To define the genetic architecture of inflammasome-associated inflammation, we performed whole-genome sequencing and quantified plasma IL-6, IL-1β, and IL-18 in 1,000 ART-suppressed PWH from the U.S. Military HIV Natural History Study. Genome-wide analyses identified 14 loci implicating antiviral defense (DDX17, DDX41, EEA1, BCL11A), lipid metabolism (ABCA1, ABCA12, ABCC1, AGMO), and vascular remodeling (KLHL29, RNF213, ETV1). Transcriptome-wide analyses across cardiovascular and immune tissues identified regulatory programs linking interferon signaling, immune activation, and vascular biology to circulating cytokine levels. Mendelian randomization analyses supported causal relationships between inflammasome-associated cytokines and vascular events. Functional integration with genome-wide CRISPR perturbation datasets in primary CD4 T cells linked cytokine-associated loci to HIV antiviral pathways and cytokine regulatory networks. External validation in cohorts without HIV demonstrated pathway-level convergence despite limited variant-level overlap. These findings define genetic mechanisms linking inflammasome signaling, antiviral defense, and cardiovascular risk.

2
Surfacing Suicidal Risk Through Simulated Social Interaction: Per-Person Language Model Agents as Communicative Stress Tests

Shao, W.; Ammerman, B.; Jacobucci, R.

2026-06-06 psychiatry and clinical psychology 10.64898/2026.06.04.26354928 medRxiv
Top 7%
4.8%

Suicidal risk may be encoded in everyday communication patterns but diluted in routine digital interactions. We introduce a method for surfacing this latent signal: training per-person language model agents on individuals' authored text (the on-screen text each participant typed, captured whenever a keyboard was visible in screenshots) and placing those agents in simulated social interactions, a communicative stress test. Using data from 79 adults with recent suicidal ideation, we fine-tuned individual LoRA adapters on Qwen3-8B using each participant's authored text, then placed agents in standardized conversations with probe personas. Agent-generated risk language was associated with EMA-measured suicidal ideation (r = .576, p < .001), with a single neutral small-talk probe performing nearly as well (r = .551). A shuffle control confirmed the signal is person-specific (r = .071 when adapters were mismatched), and automated descriptions of participants' general smartphone activity produced no signal, confirming specificity to interpersonal communication. A prompt ablation demonstrated partial robustness to removal of disclosure-encouraging language (r = .430). This proof-of-concept demonstrates that simulated social interaction can amplify latent vulnerability signals, bridging digital phenotyping, generative AI, and suicide theory.

3
A canary in the mind: A single baseline brain scan predicts adolescent depression and anxiety one year later

Deco, G.; Sanz Perl, Y.; Vohryzek, J.; Garcia-Guzman, E.; Pizzagalli, D. A.; Laukkonen, R.; Chandaria, S.; Kringelbach, M. L.

2026-06-10 psychiatry and clinical psychology 10.64898/2026.06.08.26355206 medRxiv
Top 8%
3.9%

Mood and anxiety disorders emerge predominantly in adolescence, yet they are usually identified only once symptoms have consolidated, when intervention can only be reactive. A marker that registers the loss of healthy brain function before symptoms crystallise would allow earlier and more targeted treatment, much as caged canaries once warned miners of danger before it became apparent. Here we report such a marker using a single baseline resting-state functional MRI scan in 150 adolescents in the Human Connectome Project Boston Adolescent Neuroimaging of Depression and Anxiety (HCP BANDA) cohort, allowing us to prospectively predict depression and anxiety symptoms one year later in held-out participants at r = 0.60, substantially above the effect-size ceiling reported for functional connectivity in the same data. The marker is not computed from raw functional connectivity but read out from a whole-brain generative model fitted to each individual's dynamics, which gives access to interference structure that covariance-based features cannot represent. The regions driving the prediction, including precuneus, ventromedial prefrontal and anterior cingulate cortices, are among those previously implicated in internalising disorders, and the same signature tracks cognitive variation in healthy participants and is mechanistically linked to the efficiency of task-related computation. These findings establish a mechanistically interpretable and prospectively predictive marker of adolescent mental health and define a clear path towards external validation and clinical use.

4
Distinct and shared genetics of kidney filtration function versus albuminuria revealed by multi-trait GWAS

de Hesselle, H. C.; Garben, B.-F.; Stark, K. J.; Warth, R.; Teumer, A.; Pattaro, C.; Heid, I. M.; Winkler, T. W.

2026-06-09 genetic and genomic medicine 10.64898/2026.06.08.26355141 medRxiv
Top 9%
3.7%

Chronic kidney disease is characterized by decreased glomerular filtration rate (eGFR, estimated from serum creatinine or cystatin C) or increased urinary albumin-to-creatinine-ratio (UACR). Genome-wide association studies provided the genetic make-up of these traits, but their overlap remained largely unknown. Our multi-trait GWAS (N=1M) identified 812 signals and multi-trait fine-mapping sharpened the identification of likely causal variants. Of 333 signals classified for filtration function or albuminuria, only 11 overlapped. Their effects on eGFR and UACR were directionally concordant, dominated by eGFR and independent of HbA1c or mean arterial pressure. Mapped genes pinpointed mechanisms related to glomerular filtration area (SHROOM3, EPB41L5) and sodium-mediated intraglomerular pressure (NRBP1, DPEP1/CHMP1A). Genetics of fluid intake resulted in shadow effects on UACR without albumin leakage into urine. Our multi-trait approach sharpened the identification of likely causal genes for kidney traits, demonstrated largely distinct genetics for filtration function versus albuminuria, and provided new biological insights into the overlap.

5
A single-nucleus transcriptomic atlas of human basal ganglia during development forwarding diagnosis and therapy of pediatric movement disorders

Lange, B. K. A.; Graceffo, E.; Stenzel, W.; Biebermann, H.; Schuelke, M.; Wilpert, N.-M.

2026-06-04 nephrology 10.64898/2026.06.04.26354648 medRxiv
Top 9%
3.6%

Gene therapy is rapidly emerging as a transformative treatment for monogenic neurological disorders, including pediatric movement disorders such as aromatic L-amino acid decarboxylase (AADC) deficiency. However, its success critically depends on defining target cells and windows for therapeutic intervention. Here, we present an open-access single-nucleus transcriptomic atlas of the human basal ganglia spanning a therapy-relevant window from second/third trimester to the perinatal period and adulthood. Across 35,755 nuclei, we identify major (non-)neuronal cell types, retrace developmental trajectories, and characterize gene-regulatory networks. We identify so far unrecognized human-specific expression of key neuronal signaling genes, including GNAO1 and ADCY5, and discuss the implications for targeted gene replacement therapies. Unexpectedly, we found that the Huntingtin gene (HTT) is already expressed during prenatal stages of human brain development, supporting a previously proposed neurodevelopmental component of Huntington's disease, which should be considered in diagnostic and therapeutic strategies. Moreover, FOXG1 expression and regulon activity are predominantly located in a prenatal time window, suggesting constraints on the effectiveness of postnatal interventions. Our findings highlight the importance of datasets capturing human brain development in real time and provide a publicly available resource to guide precision gene therapy strategies in the future.

6
Parental educational attainment polygenic scores contribute to phenotypic heterogeneity in offspring with autism

Gao, S.; Sui, Y.; Tian, P.; Rao, X.; Yan, C.; Xu, Y.; Wang, T.

2026-06-08 genetic and genomic medicine 10.64898/2026.06.03.26354779 medRxiv
Top 10%
3.5%

Educational attainment-related polygenic scores have been implicated in autism spectrum disorder (ASD), but how parental polygenic scores shape offspring phenotypes remains unclear. Using genotyping and exome-sequencing data from 142,357 individuals (55,252 ASD cases) in a large ASD cohort, we dissected the direct and indirect genetic effects of educational attainment-related polygenic scores on ASD phenotypes. Trio-model analyses showed that parental polygenic scores for educational attainment (PGSEA) were associated with milder core ASD symptoms, including social deficits and repetitive behaviors, predominantly through indirect genetic effects, whereas their associations with comorbidities were driven predominantly by direct genetic effects. PGSEA was also significantly negatively associated with rare variant burden and prenatal factors, although these factors contributed largely independently to most phenotypes. Adjustment for full-scale intelligence quotient (FSIQ) and socioeconomic status (SES) partially attenuated the indirect effects of PGSEA on offspring phenotypes. Finally, higher parental PGSEA was associated with later age at diagnosis in offspring, partly through its protective effects on ASD phenotypes. These findings indicate that indirect genetic effects of parental PGSEA contribute substantially to phenotypic variation in ASD and highlight family-mediated pathways as an important component of ASD heterogeneity.

7
Placental molecular subtypes of severe preeclampsia reveal divergent aging trajectories and fetal growth outcomes

Du, Y.; Benny, P. A.; Lahiri, S.; AlAkwaa, F. M.; Huang, Q.; Liu, Y.; Lassiter, C. B.; Astern, J.; Riel, J.; Garmire, L. X.

2026-06-04 sexual and reproductive health 10.64898/2026.06.02.26354756 medRxiv
Top 12%
2.4%

Severe preeclampsia (sPE) is a major cause of maternal and fetal morbidity worldwide, yet its placental molecular heterogeneity remains poorly defined by current clinical diagnosis. To resolve the molecular architecture of sPE, we integrated DNA methylation and proteomic profiling from a multi-ethnic cohort of 444 placentas from the Hawaiian Biorepository (HiBR), including 169 sPE cases, matched preterm controls and full-term controls. To address cellular heterogeneity in bulk placental tissue, we developed HOMED (Hierarchically Optimized Methylation Deconvolution), a single-cell-guided hierarchical framework for inferring placental cell-type composition from DNA methylation data. HOMED-adjusted integrative analyses identified extensive subtype-specific alterations involving hypoxia, angiogenesis, immune activation, trophoblast differentiation and metabolic remodeling. Molecular stratification revealed two reproducible sPE subtypes with divergent placental aging trajectories. One subtype exhibited a premature placental state marked by accelerated placental aging, whereas the other displayed slower accelerated placental aging but a substantially increased risk of small-for-gestational-age birth (P = 0.028). These subtypes were independently replicated across six external cohorts and further supported by proteomic signatures achieving a classification accuracy of 0.88. Integrative epigenomic and proteomic analyses linked the growth-restricted subtype to hypoxia-associated glycolytic remodeling, suggesting distinct pathogenic mechanisms underlying clinically diagnosed sPE. Together, our findings redefine severe preeclampsia as a biologically heterogeneous placental disorder composed of molecularly distinct subtypes with divergent aging trajectories and fetal growth outcomes, providing a framework for mechanism-based stratification and precision obstetric medicine.

8
Serological thresholds of risk reduction for infant group B streptococcus disease

Cantrell, L.; Karampatsas, K.; Andrews, N.; Beach, S.; Bentley, E.; Berardi, A.; Bijlsma, M. W.; Cagil Kocana, C.; Daniel, O.; French, N.; Hall, T.; Izu, A.; Khalil, A.; Kwatra, G.; Kyohere, M.; Madhi, S. A.; Mboizi, R.; Miselli, F.; Nielsen, M.; Thorn, N.; van de Beek, D.; Walker, K.; Heath, P. T.; Le Doare, K.; Voysey, M.; PREPARE WP3 Study Group

2026-06-06 epidemiology 10.64898/2026.05.29.26353453 medRxiv
Top 12%
2.4%

Vaccines to prevent infant group B streptococcus (GBS) disease are advancing, with licensure likely based on safety and immunologic endpoints rather than clinical efficacy data. This approach requires robust, generalisable serological thresholds of risk reduction (SToRRs). We combined data from six case-control studies in Europe and Africa to define SToRRs for early-onset (EOD) and late-onset (LOD) GBS disease. Across diverse epidemiological and healthcare settings, anti-capsular polysaccharide IgG concentrations were consistently higher in infants who remained disease free than in those who developed disease. Higher antibody concentrations were required to reduce the risk of EOD than LOD, and higher concentrations were required for serotype Ia than for serotype III. This study provides a quantitative framework to support correlates-based evaluation and potential licensure of maternal GBS vaccines.

9
Integrating patient movement and pathogen genomics to support hospital infection prevention with PathoPath: a method development study

Sajib, M. S.; Tanmoy, A. M.; Kanon, N.; Jui, A. B.; Islam, M. S.; Dola, N. Z.; Hossain, M. M.; Mobarak, R.; Shahidullah, M.; Hoque, M.; Ahmed, A. N. U.; Holmes, A. H.; Saha, S. K.; Saha, S.; Wan, Y.; Hooda, Y.

2026-06-05 infectious diseases 10.64898/2026.06.03.26354630 medRxiv
Top 12%
2.1%

Background: Healthcare-associated infections pose a major burden to neonatal health worldwide and remain difficult to track in low-resource hospitals because patient movement data and pathogen genomic data are rarely integrated into actionable transmission models. Existing approaches are often restricted to specific settings, highly structured electronic health records (EHRs), or analyses focused on either patient movements or pathogen characteristics alone. To address this gap, we developed PathoPath, an open-source integrative modelling platform, and evaluated its utility in a high burden paediatric hospital in Dhaka, Bangladesh. Methods: PathoPath is an open-source R package that combines electronic health records with whole genome sequencing data to generate contact networks from direct and indirect contacts using minimal structured inputs. We retrospectively applied PathoPath to 373 cases of Klebsiella pneumoniae species complex (KpSC) infection identified in 2021 at the largest paediatric referral hospital in Dhaka, Bangladesh. Ward level patient movement trajectories were used to reconstruct contact networks, and genomic data from isolates from children <60 days were integrated to identify probable dissemination of bacterial clones and antimicrobial resistance plasmids. Findings: PathoPath identified 750 direct contacts among 317 patients, forming 25 connected components, with the largest including 93 patients. KpSC infections were identified across 21 of 37 wards, with the neonatal intensive care unit accounting for 77.9% of all cases. Integration of genomic and network data distinguished sustained clustering of ST147 from multiple probable inter-clonal dissemination events involving IncFII plasmids carrying blaNDM-5 and/or blaOXA-181 within ST16. Four dominant sequence types accounted for 65.6% of sequenced isolates, and carbapenemase genes were detected in 95.8%.
Interpretation: PathoPath reconstructs hospital-wide contact networks and integrates them with pathogen genomics to map probable dissemination of pathogens and antimicrobial resistance using minimal structured clinical data. It could support more targeted infection prevention and control in hospitals where granular digital records are not available.

10
Age-specific burden of medically attended respiratory virus disease in high-income countries: a scoping review and meta-analysis

Gupta, M.; Zoega, H.; Stopard, I. J.; Liu, B.; Macartney, K.; Wood, J. G.; Hogan, A. B.

2026-06-10 epidemiology 10.64898/2026.06.09.26354660 medRxiv
Top 13%
1.9%

Introduction: Respiratory infections are a leading cause of morbidity. Newly available vaccines to prevent respiratory syncytial virus (RSV) disease and encouraging clinical progress on vaccines for human metapneumovirus (hMPV) and parainfluenza (PIV) could reduce the disease burden beyond existing influenza and SARS-CoV-2 immunisation programs. However, evidence on the contribution of these viruses to respiratory disease burden across the lifespan remains limited. Methods: We reviewed studies from 01/2002-11/2025 reporting age-stratified, medically attended cases of influenza, and at least one of RSV, hMPV, or PIV, in high-income countries, excluding periods substantially overlapping with the COVID-19 pandemic. Using only studies that tested for all four viruses, we estimated the age-specific proportion of cases that were non-influenza (total across RSV, hMPV and PIV) compared to influenza using a mixed-effects logistic regression model. Results: Following exclusions and screening, 61 studies were included in the primary analysis comprising >500,000 detections of the four viruses. We found that a substantial proportion of medically attended respiratory illness in infants and young children was due to PIV, hMPV and RSV, rather than influenza, with a non-influenza virus proportion of 90.2% (95% CI 85.9-93.2%) in young infants aged 0-6 months. The converse was true for school-aged children, with a non-influenza virus proportion of 34.8% (95% CI 26.5-44.2%) in children aged 5-18 years. In adults aged 65+ years, non-influenza causes of medically attended disease were common at 60.2% (95% CI 50.0-69.5%). Restricting to studies reporting hospitalised cases (n=19) produced broadly similar age-specific trends in relative virus burden contributions. 
Discussion: We highlight the significant burden of medically attended illness due to PIV, hMPV and RSV across ages, particularly in infant and preschool-aged children and older adults, supporting the need for effective vaccines targeting this burden.

11
Pooled testing for SARS-CoV-2 surveillance in schools: real-world evaluation of transmission control, testing resources, and educational disruption

Colosi, E.; Calmon, L.; Fässli, M.; Koch, K.; Bielicki, J. A.; Colizza, V.

2026-06-04 infectious diseases 10.64898/2026.06.03.26354821 medRxiv
Top 14%
1.8%

Pooled testing programs were introduced during the COVID-19 pandemic to expand surveillance capacity while preserving testing resources, but evidence on their epidemiological impact in schools under real-world conditions remains limited. We analyzed data from the pooled testing program implemented in public primary schools of the canton of Basel-Landschaft, Switzerland, during the Fall-Winter 2021 Delta wave. We used an agent-based transmission model informed by pooled and individual testing results, school characteristics, contact networks, and community incidence. The model was fitted to pooled positivity ratios in four clusters of administrative areas with similar epidemic trajectories. We compared pooled testing with alternative protocols in terms of school transmission, testing volume, and student-days lost. During the study period, pooled testing was offered to 21,187 students across 62 public primary schools, with high and stable participation across clusters (mean 71-79%). The fitted model reproduced observed pool positivity trends well. Compared with pooled testing, reactive class closure, reactive screening, and symptomatic testing were associated with higher in-school transmission, with excess ranging from 50% to 87%, 63% to 104%, and 72% to 133% across clusters. Weekly individual screening achieved similar reductions in transmission but required 15-25 times more tests. Relaxing class closure after depooling substantially reduced student-days lost without increasing transmission. Under real-world conditions, pooled testing provided an effective and resource-efficient strategy to reduce SARS-CoV-2 transmission in primary schools. Combining early detection of asymptomatic infections with low testing demands, pooled testing offers a scalable approach to school surveillance and control for pandemic response in educational settings.
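
As a rough illustration of why pooling can save tests, the sketch below computes the expected number of tests per person under classic two-stage (Dorfman) pooling, where a positive pool is followed by individual retests. It is not the authors' agent-based model. The pool size and prevalence are hypothetical values, and the formula assumes independent infections and a perfectly accurate test.

```python
def expected_tests_per_person(prevalence: float, pool_size: int) -> float:
    """Expected tests per person under two-stage (Dorfman) pooling.

    Assumptions (not taken from the abstract): infections are independent,
    the test is perfectly accurate, and every member of a positive pool
    is retested individually.
    """
    if not 0.0 <= prevalence <= 1.0:
        raise ValueError("prevalence must be between 0 and 1")
    if pool_size < 1:
        raise ValueError("pool_size must be at least 1")
    # One pooled test is shared by pool_size people.
    pooled_share = 1.0 / pool_size
    # A pool is positive if at least one member is infected.
    p_pool_positive = 1.0 - (1.0 - prevalence) ** pool_size
    # Each member of a positive pool needs one follow-up test.
    return pooled_share + p_pool_positive


if __name__ == "__main__":
    # Hypothetical inputs chosen only for illustration.
    per_person = expected_tests_per_person(prevalence=0.01, pool_size=10)
    print(f"expected tests per person: {per_person:.3f}")
    print(f"individual screening uses {1.0 / per_person:.1f}x as many tests")
```

At low prevalence the second term stays small, so most of the cost is the shared pooled test. As prevalence rises, more pools need follow-up testing and the saving shrinks.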

12
Clonal Hematopoiesis of Indeterminate Potential Refines Cardiovascular Risk Stratification in Cardiovascular-Kidney-Metabolic Syndrome Stages 0-3

Lu, J.; Sun, S.; Deng, Z.; Wang, S.; Wei, C.; Jiang, S.; Li, W.

2026-06-08 epidemiology 10.64898/2026.06.04.26354963 medRxiv
Top 14%
1.7%

Background: Chronic low-grade inflammation drives cardiovascular-kidney-metabolic (CKM) syndrome. Clonal hematopoiesis of indeterminate potential (CHIP), an age-related driver of systemic inflammation, is linked to several cardiometabolic disorders. However, whether CHIP modifies CKM progression and contributes to heterogeneity in cardiovascular disease (CVD) risk within the CKM framework remains uninvestigated. Methods: This cohort study included 307,025 UK Biobank participants at CKM stages 0-3 free of baseline CVD. CHIP status was identified via whole-exome sequencing (WES). The association between CHIP and baseline CKM severity was examined, along with the independent and joint effects of CHIP and CKM stages on incident CVD risk. The joint effects of CHIP and polygenic risk scores (PRS) were further assessed, and the incremental predictive value of incorporating CHIP into the AHA PREVENT equations was evaluated. Results: CHIP carriers were more likely to present with advanced CKM stages [OR 1.14 (1.09-1.20), P < 0.001] and exhibited higher incident CVD risk during follow-up [HR 1.13 (1.08-1.18), P < 0.001]. Significant joint effects between CHIP and CKM stages were observed, with the highest risk among CHIP carriers at CKM stage 3 [HR 1.63 (1.50-1.78), P < 0.001]. Large or multiple CHIP mutations conferred greater hazards, with distinct gene-specific effects observed. Moreover, CHIP and high genetic risk also jointly amplified CVD susceptibility. Most importantly, incorporating CHIP into AHA PREVENT significantly improved risk discrimination. Conclusions: CHIP is a significant risk factor associated with more advanced CKM stages and amplifies incident CVD risk. Integrating CHIP into existing prevention strategies may refine CVD risk stratification.

13
Stochastic Morphodynamics of the Human Aorta Across the Lifespan

Twohig, K. C.; Mansour, M.; Pugar, J. A.; Yuan, K.; Pocivavsek, L.; Klishin, A. A.

2026-06-08 surgery 10.64898/2026.06.05.26355015 medRxiv
Top 14%
1.7%

Biological systems evolve as continuous dynamical processes, but at organ-scale and across human lifespans they are rarely observed longitudinally; population data typically exist instead as sparse, cross-sectional snapshots. Inferring lifespan dynamics from such data requires methods distinct from those used at cellular and tissue scales where dense observations are accessible. We address this problem in the thoracic aorta, where surgical decisions currently rest on static, age- and sex-agnostic diameter thresholds that reduce three-dimensional morphology to a single scalar. Treating normal aortic morphology as a stochastic dynamical system, we pose a continuous-time drift-diffusion process in a two-coordinate state space of normalized surface area (A) and normalized fluctuation in integrated Gaussian curvature (δK), and fit closed-form solutions of the Fokker-Planck equation by maximum likelihood to a sex-balanced, age-uniform cohort spanning infancy to age 99. Inter-individual variability is treated as a fitted diffusion parameter rather than as residual scatter, which is distinct from prior normative studies that report variability as scatter around a regression line. The framework identifies two growth regimes for aortic size (childhood expansion followed by persistent adult growth, with adult males growing approximately 70% faster than adult females) and a single dynamical regime for aortic shape, with heteroscedastic variability accumulating at a rate comparable to the mean drift over the lifespan. Applied to independent cohorts of acute and chronic thoracic aortic dissections, the multivariate model identifies over 95% as statistical outliers via Mahalanobis distance, consistently outperforming either coordinate alone. The same probabilistic envelope that describes normal aging thus defines a baseline against which disease can be detected, supporting a shift toward dynamic, age- and sex-aware assessment of thoracic aortic pathology.
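
The Mahalanobis-distance outlier step can be sketched in two dimensions as follows. This is a simplified, static illustration rather than the authors' age-dependent drift-diffusion model: it fits one mean and covariance to a reference sample and flags points outside a chi-square envelope. The reference points and the 95% cutoff are hypothetical, and the two coordinates merely stand in for (A, δK).

```python
import math


def fit_reference(points):
    """Return the mean and sample covariance (sxx, sxy, syy) of 2D points."""
    n = len(points)
    if n < 2:
        raise ValueError("need at least two reference points")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    if det <= 0.0:
        raise ValueError("covariance matrix must be positive definite")
    dx = point[0] - mean[0]
    dy = point[1] - mean[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def is_outlier(point, mean, cov, level=0.95):
    """Flag a point outside the chi-square envelope with 2 degrees of freedom.

    For 2 degrees of freedom the chi-square quantile has the closed form
    -2 * ln(1 - level). The default level is an assumption for illustration.
    """
    threshold = -2.0 * math.log(1.0 - level)
    return mahalanobis_sq(point, mean, cov) > threshold


if __name__ == "__main__":
    # Hypothetical reference sample standing in for normal (A, deltaK) pairs.
    reference = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
    mean, cov = fit_reference(reference)
    print(is_outlier((0.5, 0.5), mean, cov))
    print(is_outlier((3.0, 3.0), mean, cov))
```

Using both coordinates together lets the envelope account for their correlation, which is one plausible reason a joint distance can flag cases that neither coordinate flags alone.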

14
Limitations of cross-border containment strategies for Bundibugyo ebolavirus

Middleton, C.; Larremore, D.

2026-06-08 epidemiology 10.64898/2026.06.04.26354820 medRxiv
Top 15%
1.7%

An ongoing outbreak of Bundibugyo virus disease (BVD) in the Democratic Republic of the Congo was deemed a public health emergency of international concern in May 2026. To prevent cross-border importation, many countries, including the United States, Canada, India, Thailand, and Kenya, have already proposed containment strategies, and others are likely to follow suit. How well (or poorly) are screening and quarantine containment measures likely to work? We leverage established epidemiological theory and develop a mathematical model of traveler screening and post-arrival quarantine for BVD to answer this question. We find that traveler screening via symptom screening or molecular testing will miss the majority of infected travelers, and should be complemented by post-arrival quarantine and monitoring of sufficient duration to detect those with long incubation periods. Our findings underscore the limitations of border screening and the importance of complementary measures like post-arrival quarantine to prevent local importation of BVD.

15
Heterozygous MMACHC burden variants are associated with higher circulating vitamin B12 in the All of Us Research Program

Cai, L.; DeBerardinis, R. J.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354855 medRxiv
Top 18%
0.9%

Heterozygous carriers of autosomal recessive disease variants are conventionally considered unaffected, yet population-scale genomic datasets reveal subclinical carrier phenotypes. MMACHC encodes a cobalamin-processing protein whose biallelic loss causes cobalamin C deficiency, an inborn error of intracellular cobalamin metabolism. We performed an unbiased quantitative phenome-wide association screen in All of Us Research Program v8 to identify phenotypes associated with rare heterozygous MMACHC burden variants. Serum/plasma vitamin B12 was the top quantitative association. Carriers had higher circulating B12 than non-carriers in adjusted analyses, but also higher homocysteine, suggesting that elevated circulating B12 does not reflect improved intracellular cobalamin function. Carriers were less likely to fall below conventional B12 insufficiency thresholds, indicating a potential diagnostic blind spot. A pathway-wide rare-variant gene-burden (All-by-All) analysis placed this finding in broader biological context. Burdens in genes related to circulating B12 binding or intestinal absorption were associated with lower circulating B12. In contrast, burdens in several genes involved in cellular delivery and intracellular cobalamin handling were associated with higher circulating B12. This step-specific directionality supports a model in which elevated circulating B12 can reflect impaired cellular handling and consequent systemic accumulation rather than improved cellular cobalamin availability. Because EHR-derived B12 is shaped by heterogeneous clinical and medication contexts, prospective carrier-enriched studies with standardized methylmalonic acid, homocysteine, diet, supplement, medication, comorbidity, and symptom ascertainment are needed to evaluate functional-marker-based screening.

16
A liquid biopsy-centered, pan-cancer, open next generation sequencing panel to support clinical decision-making (LION panel)

Feierabend, S.; Künstner, A.; Forster, M.; Helbing, T.; Gebauer, N.; Gemoll, T.; Axt, F.; Nimmagadda, S. C.; Ranganathan, L.; Schwandt, J.; Heber, M.; Szymczak, S.; Hohensee, I.; Fliedner, S. M. J.; Scherer, F.; Oberländer, M.; Derer-Petersen, S.; Busch, H.; von Bubnoff, N.; Dazert, E.

2026-06-08 oncology 10.64898/2026.06.05.26354976 medRxiv
Top 19%
0.8%

Cancer treatment has shifted toward personalized therapy based on molecular profiling, particularly in advanced disease. Existing circulating tumor DNA panels are often broad, generating many non-actionable variants and incurring costs that limit routine use in molecular tumor boards. We developed and validated a manufacturer-independent, 109-gene liquid biopsy-centered pan-cancer open next generation sequencing panel (LION panel), combined with an in-house bioinformatic pipeline to support clinical decision-making. A total of 87 samples were analyzed, including 17 reference samples, 21 healthy blood donor controls, and 49 patient samples including nine tumor entities. The LION panel achieved 92% sensitivity and 99% specificity in reference samples, with high concordance to digital droplet PCR (r = 0.99). It detected variant allele frequencies as low as 0.05% (tumor-informed) and 0.5% (tumor-uninformed). Clinical concordance reached 82% with blood-based digital droplet PCR and 75% with whole exome tissue sequencing. In representative cases, variant dynamics correlated with disease progression and revealed additional targetable variants. Overall, the LION panel supports clinical decision-making by enabling identification of targetable variants, disease monitoring, and detection of treatment resistance, particularly when tumor tissue is unavailable.

17
Phenome-wide association of multiallelic copy number variation in 422,170 UK Biobank individuals reveals novel genetic loci associated with disease

Eisenberg, M.; Packer, R.; Shrine, N.; Demidov, G.; Pack, H.; Hollox, E. J.; Fawcett, K.

2026-06-04 genetic and genomic medicine 10.64898/2026.06.03.26354825 medRxiv
Top 20%
0.8%

The contribution of multi-allelic CNVs (mCNVs) to disease risk has not been widely studied. This is largely because they have been difficult to characterise genome-wide at large scale, and are often not strongly associated with flanking SNVs, limiting imputation. Improved understanding of the role of mCNVs in disease risk could lead to novel insights into the pathobiology of disease. We robustly typed 69 mCNVs from UK Biobank whole exome sequences in discovery (n=150,682) and replication sets (n=269,317). Discovery and replication PheWAS used clinically-curated composite phenotypes, built by integrating self-report, primary and secondary health care data, to interrogate these variants in unrelated British individuals of African, European and Central/South Asian ancestries. 173 mCNV-phenotype associations were detected from 26 mCNVs, of which 114 associations replicated. One of eight potentially novel mCNV-phenotype signals was independent of neighbouring associated SNVs: the association of the Sulfotransferase 1A1 and 1A2 genes (SULT1A1/SULT1A2) with estimated glomerular filtration rate (eGFR) in individuals of European ancestry (meta-analysed p=1.05×10⁻⁹, beta=0.016 [0.011; 0.021]). Other potentially novel associations include Golgi phosphoprotein 3 (GOLPH3) with the cardiovascular phenotype bundle branch block in individuals of South Asian ancestry (meta-analysed p=3.35×10⁻⁶, OR=2.13 [1.53, 2.96]) and alpha amylase 2B (AMY2B) with ventricular fibrillation and flutter in individuals of European ancestry (meta-analysed p=2.48×10⁻⁶, OR=1.50 [1.26; 1.78]). In summary, we show that accurate typing at biobank-scale sample sizes can identify associations between traits and mCNVs, acting through a gene dosage relationship. Our work provides several novel likely causative variants contributing to particular traits of clinical importance and immediately suggests a putative functional mechanism for the observed associations.

18
TNFRSF13B Common Variants Enhance Antibody-Dependent Complement Activation and Susceptibility to Acute Respiratory Distress Syndrome Following Respiratory Viral Infection

Naing, L.; de Mattos Barbosa, M. G.; Connell, I. P.; Chicca, J.; Zhao, Z.; Reister, N. A.; Bruchez, A.; Greenspan, N.; McComsey, G.; Platt, J. L.; Cascalho, M.

2026-06-04 allergy and immunology 10.64898/2026.06.02.26354763 medRxiv
Top 20%
0.7%

Acute respiratory distress syndrome (ARDS) is a devastating complication of respiratory infections; however, the biological mechanisms that initiate its onset are poorly defined. Here we show that TNFRSF13B polymorphisms increase the risk of ARDS following SARS-CoV-2 infection up to 7.4-fold compared to the WT genotype. The increased risk was not due to immune-deficiency or impaired virus neutralization. On the contrary, TNFRSF13B mutant subjects mounted better antibody neutralization compared to subjects with WT TNFRSF13B. However, IgG from subjects expressing TNFRSF13B variants had less sialic acid, terminal galactose, and fucose than IgG from subjects with a WT genotype. Moreover, IgG from TNFRSF13B mutant subjects exhibited increased recruitment of complement factors. Thus, besides well-known actions governing plasma cell differentiation, TNFRSF13B impacts both affinity maturation and effector functions of IgG in ways that independently govern complement activation controlling inflammatory responses known to trigger ARDS.

19
Topological Deep Learning Identifies Polygenic Variant Clusters Across Familial Multimorbid Disorders

Vomo-Donfack, K. L.; Bousquet, G.; Falgarone, G.; Ginot, G.; Morilla, I.

2026-06-09 health informatics 10.64898/2026.06.03.26354242 medRxiv
Top 21%
0.7%

Whole-genome sequencing comprehensively captures coding, non-coding and structural variation in families with suspected inherited disorders, yet its clinical utility remains constrained by an interpretation bottleneck: selecting a handful of relevant variants from millions of candidates. Current rule-based pipelines, anchored in ACMG/AMP criteria, excel at identifying highly penetrant Mendelian alleles but frequently miss variants of low-to-moderate penetrance, non-coding alterations and germline-somatic interactions. Here we introduce PolyCLIP-T, a topology-guided multimodal framework that transforms variant selection from a classification problem into a geometric discovery task. By contrastively aligning DNA-sequence embeddings with functional annotations, PolyCLIP-T constructs a unified latent space in which the displacement between reference and alternate embeddings quantifies the molecular perturbation induced by each variant. Persistent homology then identifies stable topological components - coherent variant groups shared among affected relatives - that transcend single-variant scoring logic. Applied to six families with multi-morbid cancer, autoimmune and cardiovascular disease, PolyCLIP-T recovered non-coding and structural candidates overlooked by conventional pipelines and revealed pleiotropic networks spanning disease categories. This approach provides an interpretable, scalable solution for genome-first investigations of disorders driven by polygenic architectures that evade single-variant analysis. The framework was developed and benchmarked on deeply characterised familial cohorts selected for transgenerational multimorbidity; validation in larger, independent populations will be essential to establish its generalisability. An interactive web tool is freely available at https://www.polyclip-t.uma.es/.

20
Five-year immunogenicity and safety follow-up of the PREVAC randomized Trial of Vaccines for Zaire Ebola Virus Disease

Beavogui, A. H.; Doumbia, S.; Kieh, M.; Leigh, B.; Sow, S.; Lhomme, E.; Ben-Farhat, S.; Dubois Cauwelaert, N.; Roy, C.; Diouf, W.; Idrissa, S.; Diarra, S.; Millimouno, N. P.; Diallo, F. A.; Kamara, M.; Pratt, D.; Dicko, I.; Kennedy, S. B.; Esperou, H.; Choi, E. M.; Kpetigo, A.-M. D.; D'Ortenzio, E.; Diallo, A.; Lancrey-javal, S.; Hamze, B.; Schwimmer, C.; Wiedemann, A.; Ayouba, A.; Peeters, M.; Lane, H. C.; Higgs, E.; Watson-Jones, D.; Yazdanpanah, Y.; Greenwood, B.; Richert, L.; Levy, Y.; PREVAC study team

2026-06-08 infectious diseases 10.64898/2026.05.29.26354050 medRxiv
Top 21%
0.7%

Background: The World Health Organization has expanded its recommendations for prophylactic Ebola vaccination for at-risk populations. Durable vaccine-induced immunity is important for sustaining outbreak preparedness in regions with recurrent Ebola virus disease (EVD). We assessed five-year persistence of vaccine-induced immune responses in adults and children from the PREVAC trial. Methods: Two large randomised phase 2 trials (NCT02876328), in adults and children aged ≥1 year, were conducted in four west African countries. Participants were randomly assigned to placebo or to one of three Ebola vaccine strategies: Ad26.ZEBOV followed by MVA-BN-Filo at 56 days; rVSVΔG-ZEBOV-GP followed by placebo; or rVSVΔG-ZEBOV-GP followed by a homologous booster dose at 56 days. After 12 months of follow-up, the primary results were published, participants were unblinded to their vaccine assignment, and follow-up continued for 60 months. After Month 24, placebo group recipients were offered active vaccination. Anti-Ebola virus glycoprotein immunoglobulin G (IgG) concentrations were measured for 5 years. Findings: 1401 adults and 1401 children were initially randomized, and 1315 (93.9%) adults and 1322 (94.4%) children attended at least one long-term visit. Retention was high, with 95% followed beyond 1 year and 83% completion at 5-year follow-up. For the three vaccine strategies, antibody geometric mean concentrations (GMC) declined modestly between Months 12 and 24, followed by a stable plateau from Months 24 to 60. At Month 60, antibody GMC were higher in the rVSV-based groups (1099 and 1216 EU/ml for adults; 1982 and 2347 EU/ml for children) than in the Ad26.ZEBOV, MVA-BN-Filo group (252 EU/ml for adults and 645 EU/ml for children). Antibody persistence at Month 60 was heterogeneous, varying by age, sex, country, and baseline IgG concentration. Interpretation: Licensed Ebola vaccines induced sustained antibody responses in adults and children for up to 5 years. While the protective antibody level is unknown, these data demonstrate long-lasting immune responses from currently employed vaccine strategies.