FACETS — Latest Matching Preprints

1

Independent Validation of Test-Adjusted COVID-19 Incidence Estimates Using Wastewater Surveillance Data in Ontario, Canada

Fisman, D.; Wilson, N.; Lee, C. E.; Tuite, A.

2026-05-12 infectious diseases 10.64898/2026.05.08.26352754 medRxiv

Top 0.1%

6.5%

Background: Case-based infectious disease surveillance is subject to ascertainment bias when testing intensity varies across time and population subgroups. We previously developed a regression-based test adjustment methodology using Standardized Testing Ratios (STRs) to correct for differential testing patterns in COVID-19 surveillance data. Wastewater-based surveillance (WWS) measures viral burden in the community independently of diagnostic testing behavior, making it a valuable external validation tool for test-adjusted case estimates. Methods: We analyzed 111 weeks of paired wastewater and case surveillance data from Ontario, Canada (July 19, 2020 to August 28, 2022). Wastewater SARS-CoV-2 signals from 107 sewersheds across 34 public health units were normalized within sewersheds and aggregated using population-weighted averages. We compared wastewater correlations with crude reported and test-adjusted case counts using Spearman rank correlations, linear regression, and negative binomial distributed lag nonlinear models (DLNM), stratified by epidemic period. Results: Test-adjusted cases correlated substantially more strongly with wastewater signals than crude reported cases overall (Spearman ρ = 0.849 vs. 0.679; linear R² = 0.609 vs. 0.191). The advantage of test adjustment was greatest during the Omicron wave, when population-level diagnostic testing contracted sharply following PCR eligibility restrictions (ρ = 0.924 vs. 0.604; R² = 0.815 vs. 0.470). DLNM incorporating the wastewater signal explained substantially more variance in test-adjusted than crude reported cases (McFadden pseudo-R² 0.898 vs. 0.776), despite similar lag-response structure for both outcomes. Conclusions: Wastewater surveillance provides compelling independent validation of a previously described test adjustment methodology for COVID-19 case surveillance. The agreement between wastewater signals and test-adjusted cases was strongest precisely when testing scarcity was most severe, supporting the use of test adjustment to recover accurate infection dynamics from case surveillance data during periods of changing testing access and policy.
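
The comparison in this abstract rests on Spearman rank correlation between a wastewater series and two case series. A minimal sketch of that comparison follows, assuming nothing about the authors' actual code: the function names are hypothetical and the weekly values are invented for illustration, not taken from the preprint.

```python
from statistics import mean

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across a run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical weekly values, invented for illustration only.
wastewater = [1.0, 2.0, 3.5, 5.0, 8.0, 6.0]
crude_cases = [120, 150, 210, 180, 160, 140]
adjusted_cases = [100, 140, 260, 400, 900, 520]

rho_crude = spearman(wastewater, crude_cases)
rho_adjusted = spearman(wastewater, adjusted_cases)
print(f"crude rho = {rho_crude:.3f}, adjusted rho = {rho_adjusted:.3f}")
```

Because the statistic depends only on ranks, it is insensitive to the very different scales of wastewater signal and case counts, which is one reason it suits this kind of validation.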

2

Operational complexity predicts selective non-dissemination within pharmaceutical sponsor portfolios: a retrospective cohort study

Sayed, A. M.; Huan, P. T.; Nguyen, T. K.; Fathy, E.; Aziz, T.; Tho, D. V.; Huy, N. T.

2026-05-06 health policy 10.64898/2026.05.05.26352331 medRxiv

Top 0.1%

3.6%

Background: Incomplete dissemination of clinical trial results remains an important challenge for research transparency and evidence synthesis. Although prior studies have quantified the overall extent of non-dissemination, less is known about whether trial characteristics observable at registration are associated with subsequent dissemination within sponsor portfolios. Methods and findings: We conducted a retrospective cohort study of 17,537 completed interventional clinical trials registered on ClinicalTrials.gov between 2007 and 2024 across the 20 largest global pharmaceutical companies. We developed the Operational Complexity Index (OCI), a composite measure derived from planned enrollment, facility count, and geographic scope, and examined its association with trial dissemination using multivariable logistic regression and time-to-event analyses. Higher OCI was associated with greater odds of dissemination (adjusted odds ratio [aOR] = 2.40, 95% CI 2.23-2.60; p < 0.001), with dissemination increasing from 47% in the lowest OCI decile to 95% in the highest. Higher operational complexity was also associated with earlier dissemination; over a 1,095-day horizon, high-OCI trials were disseminated a mean of 310.88 days earlier than low-OCI trials (RMST difference, 310.88 days; 95% CI 300.59-320.96; p < 0.001). This pattern was observed across sponsors, clinical phases, and therapeutic areas. In predictive analyses using registration-time variables, the structural model achieved a cross-validated AUC of 0.816 and a holdout AUC of 0.814, whereas the full model, including sponsor identity, achieved a cross-validated AUC of 0.858 and a holdout AUC of 0.857. Using benchmark phase-based costing assumptions, the 5,019 non-disseminated trials corresponded to an estimated US$10.94-15.26 billion in sunk research investment. Conclusions: Among trials conducted by the 20 largest pharmaceutical sponsors, greater operational complexity at registration was associated with a higher likelihood of dissemination and earlier dissemination. These findings suggest that aggregate sponsor-level transparency metrics may mask important heterogeneity within sponsor portfolios. Future work should assess whether registration-time trial characteristics can help identify trial subgroups at higher risk of non-dissemination.

Author summary: Why was this study done? Incomplete dissemination of clinical trial results reduces the completeness of the medical evidence base and the public value of research participation. Previous studies have described overall rates of trial non-dissemination, but less is known about whether dissemination varies systematically across different types of trials within sponsor portfolios. We examined whether trial characteristics available at registration were associated with later dissemination of results among large pharmaceutical sponsors. What did the researchers do and find? We analyzed 17,537 completed interventional clinical trials sponsored by the 20 largest pharmaceutical companies and registered on ClinicalTrials.gov between 2007 and 2024. We developed an Operational Complexity Index (OCI) based on planned enrollment, number of facilities, and geographic scope to measure trial operational scale at registration. Higher OCI was associated with a greater likelihood of dissemination and earlier dissemination. Dissemination ranged from 47% in the lowest OCI decile to 95% in the highest. This pattern was observed across sponsor portfolios, clinical phases, and therapeutic areas, with an average within-sponsor dissemination gap of 40 percentage points between lower- and higher-complexity trials. In manual validation of 344 sampled trials, the automated dissemination-classification pipeline achieved 92.1% accuracy. Using benchmark phase-based costing assumptions, the 5,019 non-disseminated trials corresponded to an estimated US$10.9-15.3 billion in sunk research investment. What do these findings mean? Dissemination was not uniform across trial types within sponsor portfolios; trials with lower operational complexity were less likely to be disseminated than trials with higher operational complexity. Aggregate sponsor-level transparency measures may therefore miss important differences within portfolios. Registration-time trial characteristics showed predictive signal for non-dissemination, but whether such information could support monitoring strategies would require prospective validation. More complete dissemination of trial results would strengthen the scientific record and improve the public value of clinical research.

3

Effects of Early Career Peer Review Service on Subsequent Grant Submission Outcomes

Vancea, A.; Pandit, K.; Ornek, M.; Bhattacharyya, D.; Lindner, M.; Reed, B.

2026-05-20 health policy 10.64898/2026.05.15.26353357 medRxiv

Top 0.1%

3.2%

Peer reviewers provide a critical service to NIH by evaluating the scientific and technical merit of grant applications. While the tangible rewards for this service are limited, many reviewers feel review service makes them better applicants, improving their grant competitiveness. However, empirical evidence for this claim is limited. This study evaluates relationships between early career peer review service and subsequent application behavior and funding outcomes. Using NIH administrative data, applicants who served as Early Career Reviewers (ECRs) during the 2020 - 2021 council years were compared to a matched group of ECR-eligible applicants who had not served as reviewers (n=1,120 per group). To address non-random selection of ECRs, propensity score matching was used to balance groups on research field, demographics, productivity, career stage, and institutional resources. Outcomes, assessed over a three-year follow-up period, included submission volume, peer review scores, and funding outcomes for R01 and R01-equivalent applications. ECRs submitted more applications, were more likely to have their applications discussed, and were more likely to receive a high review score than matched controls. They were also more likely to receive R01 funding. While peer review scores do not solely determine award outcomes, these findings indicate that peer review service among ECRs is associated with improved grant application outcomes.

4

Wastewater Surveillance as an Event Detection System: Outbreak and Peak Detection of SARS-CoV-2 Across 281 U.S. Counties

Link, N. B.; Garrido, R.; Nande, A.; Santillana, M.

2026-05-19 infectious diseases 10.64898/2026.05.14.26353186 medRxiv

Top 0.1%

3.1%

Wastewater-based surveillance (WBS) is increasingly used to monitor infectious disease dynamics, yet most evaluations focus on correlation or forecasting, neither of which directly assesses whether wastewater signals can identify the epidemiological events most relevant to public health decision-making. We argue that outbreak onset and epidemic peak detection are the operationally critical use cases of WBS, requiring a fundamentally different evaluation framework. We introduce a classification-based framework that treats WBS as an event-detection problem, defining outbreaks and peaks as discrete events, establishing detection intervals to account for timing uncertainty, and incorporating censoring and data completeness criteria for valid comparisons against imperfect clinical reference outcomes. Within this framework, we apply a Bayesian exponential growth model for outbreak detection (benchmarked against a standard reproductive number (Rt)-based method) and a rule-based algorithm for peak detection, evaluating performance via sensitivity and positive predictive value (PPV). Applied to county-level SARS-CoV-2 wastewater data from 281 U.S. counties (Biobot, 2021-2024), the exponential growth approach substantially outperforms the Rt-based baseline: sensitivity 0.82 and PPV 0.64 versus sensitivity 0.58 and PPV 0.19 for the best-performing Rt variant. Peak detection achieves sensitivity 0.84 and PPV 0.70 at the county level. Both peak and outbreak detection achieve strong and consistent performance against hospitalizations and deaths at the state level. Spatial aggregation yields a statistically significant improvement in peak detection PPV against a curated reference standard (p < 0.001), while outbreak detection improvements under aggregation are directionally consistent but not statistically significant. 
Wastewater leads case-defined outbreaks by 4-6 days but minimally leads epidemic peaks, consistent with wastewater approximating prevalence rather than incidence. These findings demonstrate that wastewater signals can reliably detect outbreak onset and epidemic peaks across spatial scales and clinical outcomes, and that the choice of detection method matters substantially in practice. The classification framework developed here provides a reusable and principled tool for evaluating any surveillance signal as an event-detection system, with direct relevance to how WBS is actually used in public health decision-making.
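
The event-detection framing above scores alerts against reference events using a detection interval, then reports sensitivity and PPV. The sketch below illustrates that scoring idea only. The function name, the window widths (`lead`, `lag`), and the day indices are assumptions made for illustration; the preprint's own interval definitions, censoring rules, and completeness criteria are not reproduced here.

```python
def evaluate_detection(reference_days, alert_days, lead=14, lag=7):
    """Score alerts against reference events using a detection interval.

    An alert matches a reference event if it falls within
    [event - lead, event + lag]. The window widths are illustrative
    assumptions, not values from the preprint.
    """
    def in_window(alert, event):
        return event - lead <= alert <= event + lag

    # Reference events that were caught by at least one alert.
    detected = [e for e in reference_days
                if any(in_window(a, e) for a in alert_days)]
    # Alerts that correspond to at least one reference event.
    true_alerts = [a for a in alert_days
                   if any(in_window(a, e) for e in reference_days)]

    sensitivity = (len(detected) / len(reference_days)
                   if reference_days else float("nan"))
    ppv = len(true_alerts) / len(alert_days) if alert_days else float("nan")
    return sensitivity, ppv

# Hypothetical day indices, invented for illustration only.
reference = [30, 120, 200, 310]   # clinically defined outbreak onsets
alerts = [25, 70, 118, 205, 260]  # wastewater-based alerts

sensitivity, ppv = evaluate_detection(reference, alerts)
print(f"sensitivity = {sensitivity:.2f}, PPV = {ppv:.2f}")
```

With these invented inputs, three of four reference events are matched (sensitivity 0.75) and three of five alerts are correct (PPV 0.60). Widening the interval raises both figures, which is why the interval has to be fixed before methods are compared.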

5

Effect of the 2025 National Institutes of Health grants disruption on first-time and mechanism-first principal investigators: a cohort study of 80,976 active awards

Alahdab, F.; Mittendorfer, B.

2026-05-25 health policy 10.64898/2026.05.22.26353911 medRxiv

Top 0.1%

1.7%

Objective: To estimate the adjusted relative risk (RR) of administrative grant disruption faced by first-time and mechanism-first principal investigators (PIs) during the 2025 National Institutes of Health (NIH) grant disruptions. Design: Retrospective cohort study linking NIH RePORTER data to a publicly curated registry of grants disrupted in 2025. Setting: All NIH active research grants in fiscal years 2024 to 2025. Participants: 80,976 active projects: 4,961 disrupted during the wave that peaked in May 2025, 76,015 non-disrupted controls. Main outcome measures: Adjusted RR of disruption by two pre-specified first-time PI constructs: absolute first-time PI (no prior NIH grant) and mechanism-first PI (no prior NIH grant with the same activity code). Modified Poisson regression with institution-clustered standard errors adjusted for project, institutional, and geographic covariates. A pre-specified fiscal year 2024 common-anchor analysis addressed year-of-disruption confounding. Results: Of 4,961 disrupted grants, 237 (4.8%) had an absolute first-time PI and 396 (8.0%) had a mechanism-first PI. After adjustment, absolute first-time PIs faced 77% elevated risk of disruption (RR 1.77, 95% CI 1.34 to 2.32) and mechanism-first PIs faced 57% elevated risk (RR 1.57, 1.16 to 2.11). Under the common-anchor analysis, the absolute first-time effect attenuated to RR 1.22 (0.95 to 1.58); the mechanism-first effect persisted (RR 1.48, 1.07 to 2.06). The elevated risk was concentrated in research-mechanism grants (RR 1.78, 1.26 to 2.52) and was robust across 8 of 9 pre-specified sensitivity analyses. The Track A start-time construct, which asks whether the disrupted project was the PI's debut grant, yielded null estimates (RR 0.98, 0.93 to 1.04), with any effect concentrated entirely in newly started projects. 
Conclusions: First-time and mechanism-first PIs faced disproportionately elevated risk of disruption during the 2025 NIH wave, concentrated in research-mechanism grants and robust to year-confounding-free identification. The relevant exposure was being early-career at the moment of administrative action, not at project initiation. The findings have direct implications for workforce equity in US biomedical research.

6

Prevalence and drivers of nitrogen-related limitation of phytoplankton growth across space and time in Norwegian lakes

Rohrlack, T.

2026-05-08 ecology 10.64898/2026.05.06.723322 medRxiv

Top 0.1%

1.4%

The prevalence of nitrogen limitation and nitrogen-phosphorus co-limitation (henceforth referred to as nitrogen-related limitation) in Norwegian lakes and their relationships with atmospheric nitrogen deposition, climate, dissolved organic matter (DOM), and catchment characteristics were assessed across space and time. Routine monitoring data from 1,529 lakes in the national Vannmiljo database were analyzed for two multi-year periods (1995-2009 and 2010-2025). Limitation was inferred using the molar NO3-N/TP ratio as an indicator of dissolved inorganic nitrogen availability. Nitrogen-related limitation was widespread in both periods and exhibited strong regional structure, with highest prevalence in northern regions and lowest prevalence in southwestern Norway. Overall prevalence increased from 31% to 38% between periods, with significant increases in western regions. Regional-scale models identified climate, forest cover, DOM, agriculture, and atmospheric nitrogen deposition as predictors of limitation probability, whereas study period per se and bog/peatland cover were not significant. At the local scale, atmospheric nitrogen deposition and DOM were the only consistent predictors, with substantially lower explanatory power than at the regional scale. These results indicate that large-scale environmental gradients play a major role in shaping nutrient stoichiometry in Norwegian lakes. Because the monitoring dataset primarily represents lakes affected by human activities, the findings are particularly relevant for water management. The widespread occurrence of nitrogen-related limitation suggests that nitrogen availability may influence phytoplankton growth in many systems and that dual-nutrient management strategies addressing both nitrogen and phosphorus may be required in many regions.

7

Patient Portal Activation Among Neurology Patients in Washington, DC

Streicher, N. S.

2026-05-03 health policy 10.64898/2026.04.08.26350061 medRxiv

Top 0.2%

1.2%

Background and Objectives: Patient portals have become essential infrastructure for healthcare delivery following the 21st Century Cures Act, yet adoption remains inequitable. Understanding demographic and geographic determinants of portal activation is critical for addressing digital health disparities, particularly among neurology patients who face unique access barriers. We examined the demographic, geographic, and neighborhood-level factors associated with patient portal activation among neurology patients at multiple geographic scales in the Washington, DC metropolitan area. Methods: We conducted a retrospective cohort study of 72,417 adult neurology patients seen at two academic medical centers sharing an electronic health record in Washington, DC (February 2021-February 2026). We examined portal activation using multivariable logistic regression and geographic analysis at four nested scales: the metropolitan catchment area, DC's eight wards, individual census tracts (via geocoded patient addresses), and individual DC residents. Results: Portal activation was 64.7% overall. Activation varied by race/ethnicity (Non-Hispanic White 76.1%, Non-Hispanic Black 57.0%, Non-Hispanic Asian 57.6%, Hispanic 55.0%) and geography (DC Ward 2: 82.0% vs. Ward 7: 48.0%). Ward-level educational attainment (r = 0.948), broadband access (r = 0.889), and income (r = 0.811) were strongly correlated with activation. Within individual wards, Non-Hispanic White patients activated at 84-91% while Non-Hispanic Black patients activated at 48-64%, demonstrating that neighborhood resources alone do not explain disparities. Discussion: Patient portal activation is shaped by demographic, socioeconomic, and geographic factors operating at multiple levels. Persistent within-ward racial disparities indicate that geographically targeted interventions must be paired with culturally tailored approaches to achieve digital health equity.

8

Adherence to data-sharing policies - a comparison of the BMJ with other major medical journals

Avenell, A.; Bishop, D.

2026-05-21 medical ethics 10.64898/2026.05.15.26353284 medRxiv

Top 0.2%

0.9%

Background: In 2024, the BMJ updated its data-sharing policy for clinical trials, requiring deidentified individual patient data (IPD) to be openly deposited prior to publication. Our objective was to discover whether data-sharing increased after introduction of the new policy. Method: All data-sharing statements were downloaded from BMJ trials published in 2023 (submitted pre-updated policy) and 2025 (submitted post-updated policy). Data for 2025 were gathered for trials in five comparison medical journals. Data-sharing statements were coded to specify whether IPD were immediately available, and if not, the reason why. Where a statement gave a link to a repository, we checked whether data were available. Results: Openly available IPD for BMJ trials increased from 0/32 prior to the new policy to 19/33 (58%) after the updated policy; seven articles gave repository links that did not yield any data. In the five comparison journals, rates of open IPD varied from 0% to 5.6%. Conclusions: There was a substantial increase in open sharing of IPD after introduction of the new policy compared to a prior period. Open sharing of IPD is possible, but it is unpopular with authors and is unlikely to be achieved without firm editorial enforcement.

9

Keeping human in the loop: A three-phase generative AI workflow for research integrity in data-intensive science. A methodological case study using elite Ethiopian distance-running data

Galko, P.; Yisamaw, A.; Haugen, T.; Seiler, S.

2026-05-29 sports medicine 10.64898/2026.05.29.26354013 medRxiv

Top 0.2%

0.8%

Background: Generative AI tools can support data-intensive research by writing code, drafting prose, searching analytical possibilities, and stress-testing claims. They can also produce false citations, drift between statistical specifications, and lose continuity across long investigations. This paper describes a practical workflow for using AI systems in empirical research while keeping discovery, verification, and accountability inspectable. Methods: We developed and applied a three-phase human-AI workflow to a case study of 14 elite Ethiopian distance runners. The dataset contained 22,605 GPS-segments collected across 97 consecutive days in late 2025, supplemented by venue and athlete metadata collected in the field. Phase 1 used an autonomous data-exploration tool to pre-filter the hypothesis space across five seeded research questions. Phase 2 used an AI system under direct human guidance to construct candidate findings into numerical claims, verification scripts, and draft text. Phase 3 used an independent AI system in an adversarial role to stress-test methods, statistics, prose, figures, and citations. The workflow was informed by Pearl's distinction between association, intervention, and counterfactual reasoning, with human judgement retained for research direction, interpretation, and final claims. Results: The workflow produced three empirical analyses and a documented correction process. The analyses estimated an altitude-to-sea-level pace correction of +0.10 min/km per 1,000 m at matched heart rate, showed why pooled altitude-surface regression was not identifiable within this venue system, documented method-dependence in heart-rate-based intensity classification, characterised within-venue route variation as a 64/36 path-fixed-to-trail-variable split with the Sululta label resolving into two functionally distinct sub-venues, and reframed the cohort's training through a 3x3x3 prescription lattice grounded in Ethiopian coaching practice. 
The adversarial phase identified several hallucinated citations, a terminology error between HC1 and cluster-robust standard errors, and several inconsistencies between prose, figures, and computed results. Verification scripts re-derived nearly all numerical claims from the cleaned lap-level data. Conclusions: The case study shows how researchers can organise AI-assisted empirical work so that candidate discovery, claim construction, independent stress-testing, and final accountability remain separated. The workflow did not remove the need for domain expertise or human judgement. Its value was in making the route from candidate finding to manuscript claim explicit, reproducible, and open to challenge. Trial registration: Not applicable.

10

Large-Scale Assessment of Animal-to-Human Drug Translation Using Natural Language Processing

Doneva, S. E.; Ellendorff, T. R.; Schneider, G.; Held, L.; von Wyl, V.; Simpson, I.; Sick, B.; Ineichen, B. V.

2026-05-22 bioinformatics 10.64898/2026.05.20.726540 medRxiv

Top 0.3%

0.8%

Background: Large-scale estimates of animal-to-human drug translation and the study characteristics associated with successful translation remain limited. The expanding preclinical literature also challenges manual evidence synthesis. We developed a natural language processing (NLP) pipeline to structure and link preclinical and clinical evidence at scale. Methods: In this retrospective meta-research study, we analysed more than 500,000 neuroscience-related animal drug studies from PubMed and linked them to clinical trial and regulatory approval data. NLP methods extracted drug, disease, and experimental design characteristics from abstracts and full texts. Translation was defined as progression to completed phase III/IV trials or regulatory approval. Logistic regression assessed associations between preclinical study characteristics and successful translation. Findings: Among 291,624 drug entities identified in animal studies, 6.7% entered clinical development and 3.1% reached phase III/IV trials or regulatory approval. At the drug-disease level, 4.4% entered clinical development and 1.9% achieved translation. Restricting analyses to successfully linked ontology entities increased estimates to 11.3% and 4.1%, respectively. Male-only animal studies predominated, whereas reporting of randomisation, blinding, and sample size calculations remained limited. Testing across multiple species and reporting blinding were associated with higher odds of successful translation. Interpretation: Only a minority of interventions tested in animals progress to advanced clinical development or regulatory approval. Greater species diversity and blinding were associated with improved translational success. NLP-based evidence synthesis may support scalable evaluation of translational research and identification of potentially modifiable research practices. Funding: Swiss National Science Foundation, UZH Digital Entrepreneurship Fellowship, Universities Federation for Animal Welfare.

Research in context: Evidence before this study: We searched the literature for studies quantifying large-scale animal-to-human translation and factors associated with successful translation. Existing work was mainly limited to specific diseases, interventions, or manually curated datasets, and large-scale linkage of animal and clinical evidence remained limited. Added value of this study: We developed a natural language processing pipeline linking more than 500,000 animal studies to clinical trial and regulatory approval data. The study provides large-scale estimates of translation and identifies experimental characteristics associated with successful translation. Implications of all the available evidence: The findings suggest that only a minority of interventions tested in animals progress to advanced clinical development or regulatory approval. Greater species diversity and reporting of blinding were associated with improved translation. Automated evidence synthesis may support more systematic evaluation of translational research practices.

11

Exploring sources of uncertainty in the estimate of waterfowl harvest in the United Kingdom

Ellis, M. B.; Lewis, H. M.; Cameron, T. C.

2026-05-14 ecology 10.64898/2026.05.13.724812 medRxiv

Top 0.3%

0.8%

There is an urgent need to gather data on harvest rates of waterbirds in Europe to assess the sustainability of hunting. Estimates of total waterbird harvest in the United Kingdom (UK) and the relative harvest of different huntable species come from two separate surveys, the Value of Shooting (PACEC 2014) and National Gamebag Census (NGC, Aebischer 2019), and these have been recently used to explore the likelihood of unsustainable harvests of wild waterbirds by UK hunters (Ellis and Cameron 2022; Madden et al., 2025). The reliability of these sustainability estimates depends on how representative the original surveys are of hunter behaviour and success. In addition, 1-3 million released game-farm mallard (Anas platyrhynchos) make up a considerable but unquantified proportion of the UK waterbird harvest. Here we explore uncertainties in the UK winter harvest of wild waterfowl by comparing estimates from the NGC dataset with those from the Crown Estate coastal hunting clubs, and a novel approach using analysis of social-media images (2019/20 to 2023/24). We explore the difference in species-specific harvest with and without the uncertainties in the number of released mallard and the total number of duck harvested in the UK. Waterbird harvest estimates differ markedly depending on the input dataset and whether released mallard are included in the analysis. Confidence intervals for each estimate are inflated by uncertainty in both the size of the national bag and the number of released game-farm mallard contributing to it. Estimates extrapolated from social media suggest the national harvest of several species may be considerably larger than the corresponding NGC estimates (e.g. teal ×2.07 and gadwall ×11.2), while mallard harvests away from formal shoots represented by NGC are significantly lower (×0.71). Excluding released mallard reduces the statistical estimate of total wild duck harvest by 56-63%, which would have biologically significant effects if realised.

12

BioMARathons as a seasonal engagement model for marine citizen science: adapting BioBlitzes to challenging coastal environments

Linan Moyano, S.; Companys Oliva, B.; Alvarez Sanchez, A.; Turo Silanes, M.; Rodero, C.; Salvador Costa, X.; Piera, J.

2026-05-15 scientific communication and education 10.64898/2026.05.13.724939 medRxiv

Top 0.3%

0.8%

BioBlitzes are widely used citizen science events that combine biodiversity monitoring, public participation, and environmental awareness through short and intensive observation campaigns. However, applying this model to marine environments presents additional challenges related to safety, access, weather dependency, specialised equipment, species identification, and sustained participation. This paper presents the BioMARathon model as a case study of how BioBlitz-inspired events can be adapted to marine citizen science contexts. The BioMARathon extends the conventional BioBlitz format into a longer, seasonal, and distributed engagement model designed specifically for marine and coastal environments. The paper describes the conceptual foundations of the model in the Janus Engagement Framework, which informed both the design of the BioMARathon and the adaptation of the MINKA citizen science observatory to better support participation, validation, feedback, and continuity over time. BioMARato Catalunya, launched in 2021, is presented as the founding implementation of the model and as the basis for later replication in Portugal. Between 2021 and 2025, BioMARato Catalunya showed continued growth in participation, observations, and taxonomic coverage, while also contributing to the detection of non-indigenous species, first regional records, and climate-related ecological impacts. Beyond biodiversity outcomes, the case suggests that extending participation across a season, distributing activities through local mobilising organisations, and combining expert validation with visible feedback mechanisms can support recurrent participation, retention, and community reactivation in marine citizen science. Rather than offering a formal causal evaluation, this article contributes practical lessons for the design of citizen science initiatives in challenging environments.

13

Bacteroidales on Harvesters: Baseline Prevalence and Abundance

Kaur, S.; Wang, J.; Kayabasi, A.; Rath, I.; Benschikovski, I.; Raut, B.; Ra, K.; Verma, M. S.

2026-05-15 bioengineering 10.64898/2026.05.12.724369 medRxiv

Top 0.3%

0.8%

Fresh produce encounters pathogens at various stages of production and supply, with the harvesting process serving as one of these stages. To evaluate contamination associated with harvesting, we systematically swabbed zone 1 harvester surfaces and quantified Bacteroidales as a fecal biomarker using quantitative polymerase chain reaction (qPCR). Baseline contamination was dominated by non-detects, with occasional low-level detections (<25 copies/cm²) near the assay limit of detection (LoD). Detection occurred more frequently post-harvest (overall ~4% pre-harvest and 10% post-harvest), while microbial loads remained low, indicating that harvesting primarily affected the likelihood of low-level contamination rather than increasing contamination abundance. Additionally, we developed and field-deployed a portable loop-mediated isothermal amplification (LAMP) assay for rapid harvester hygiene assessment and benchmarked its field performance against qPCR. Together, these results support a practical molecular tool for monitoring fecal contamination and informing cleaning and sanitization decisions.

14

Zoonotic and Avian Pathogen Detections in Fecal and Sediment Samples - A Low-risk, High-throughput One Health Approach to Surveillance

Rzeszutek, G. J.; Wight, J.; Jafri, M. S.; Erwin, A. J.; Hiebert, M.; Harrigan, R.; Halbrook, M.; Hoff, N. A.; Bogoch, I. I.; Rimoin, A.; Kindrachuk, J.; Wallace, H. L.

2026-05-06 microbiology 10.64898/2025.12.19.694637 medRxiv

Top 0.3%

0.7%

Many pathogens, both those with human spillover potential as well as avian-specific viruses, are maintained in wild bird populations. While routine surveillance for influenza A viruses (IAVs) is performed annually, surveillance for other pathogens is limited. Sampling of wild birds is time-consuming, labour-intensive, often limited in sample size, and involves handling of wild and potentially infected birds, posing an increased risk of direct exposure for personnel. Additional methods for surveillance are needed given these significant challenges. Longitudinal fecal and sediment sampling was performed at various sites in southern Manitoba, Canada, particularly focused in Winnipeg from May to October 2025. Sites were chosen based on the suitability of the area for waterfowl habitat, the presence of waterfowl in the area, as well as proximity to reported outbreaks of H5N1 influenza virus. Fecal and sediment samples were collected and screened for the presence of influenza A virus (IAV), Newcastle disease virus (NDV), avian reovirus (ARV), and avian poxvirus (APXV). In total, 782 combined fecal and sediment samples were collected. Of the 714 fecal samples, 34 tested positive for IAV RNA (4.8% prevalence). None of the IAV-positive fecal samples tested positive for H5 RNA. Of the 68 sediments, 15 were positive for IAV RNA (22.1% prevalence), four of which were positive for H5 RNA. NDV RNA positivity was low, with only four positive fecal samples (0.6% prevalence) that were all collected on the same day. ARV RNA positivity was also low, with five positive sediment samples (7.4% prevalence in sediment samples). None of the samples tested positive for APXV DNA. This study builds on previous work showing the utility of environmental sampling for a variety of avian and zoonotic pathogens using a One Health approach that is low-risk, efficient, and high-throughput.

15

A genome-resolved view of the wastewater RNA virome

Kantor, R. S.; Shakya, M.; Ruth, N.; Rothman, J. A.; Rushford, C.; Gregory, D. A.; Epstein, A.; Kaufman, J. T.; Allen, J. E.; Chain, P. S. G.; O'Connor, D. H.; Johnson, M. C.

2026-05-22 infectious diseases 10.64898/2026.05.19.26353600 medRxiv

Top 0.4%

0.7%

Sequencing-based wastewater surveillance is emerging as an important tool in pathogen-agnostic threat detection, potentially enabling early identification before capture through clinical surveillance systems. However, virus sequences of human pathogens are typically low in abundance in wastewater while much of the data is unclassifiable at the read level. This presents a challenge because genomes may not assemble well for novel pathogens of interest, but read-based methods cannot currently separate novel from previously seen unclassified sequences. Using ultra-deep untargeted sequencing of the wastewater RNA virome performed by the CASPER consortium (321 samples), we constructed a wastewater virus genome database (WVDB) with the goal of expanding the set of available high-quality non-redundant reference genomes. The first version of this database contains 21,015 near-complete viral genomes, of which the majority are ssRNA bacteriophage (79%). We additionally recovered genomes for putative plant and vertebrate-infecting viruses, human enteric viruses, and viruses whose host could not be predicted. Fewer than 4000 genomes had matches in previously published virus genome databases, and WVDB captured around one fifth of the reads that could not be classified by Kraken2. Further expansion of WVDB will provide a comprehensive resource of RNA virus genomes for characterization of viral diversity and dynamics in wastewater across space and time.

16

A longitudinal cohort study comparing clinical trials registered on ClinicalTrials.gov that stopped during the first three years of the SARS-CoV-2 pandemic with trials that stopped in the three years prior

Carlisle, B. G.; Hutchinson, N.; Moyer, H.

2026-05-22 public and global health 10.64898/2026.05.20.26353581 medRxiv

Top 0.4%

0.7%

Background: The global SARS-CoV-2 pandemic disrupted healthcare systems worldwide, raising concerns about its impact on clinical research. Early reports suggested reductions in participant enrollment, interruptions to ongoing trials, and challenges to protocol adherence, yet the magnitude and duration of these operational disruptions remain unclear. Methods: We conducted a registry-based analysis comparing clinical trials during the COVID-19 pandemic (December 2019 to November 2022) with a matched pre-pandemic cohort (December 2016 to November 2019). Studies were included if they reported any modifications to trial status, enrollment, or protocols during the study periods. Key variables included trial stoppage, enrollment changes, and adoption of remote or hybrid procedures. Results: The global SARS-CoV-2 pandemic resulted in widespread disruptions to trial operations with 13,323 clinical trials terminated, suspended or withdrawn over the course of the pandemic, a 38% increase compared to the 9,665 trials that stopped in the 3 years prior to the pandemic. Registries indicated a sharp decline in new participant enrollment across geographic regions and therapeutic areas, with partial recovery in later months. Review findings highlighted barriers including patient inaccessibility, staff redeployment, and supply chain interruptions. Conclusions: The pandemic caused system-wide operational shocks that compromised trial timelines and may have downstream methodological consequences. Recovery in enrollment does not imply restoration of pre-pandemic protocol fidelity or outcome ascertainment. Standardized reporting of disruptions, proactive contingency planning, and resilient trial designs are needed to maintain data integrity during large-scale disruptions and to support reliable evidence generation.

17

Linking land-use change, water quality, and host-parasite dynamics with droplet digital PCR and Bayesian path analyses

Srinivas, I.; Fouilloux, C. A.; Berini, J.; Orlando-Simoni, P.; Neeno-Eckwall, E.; Alexander, H.; Choi, E.; Vaziri, G.; Hund, A. K.; Bolnick, D. I.; Hite, J.; Chen, A.; Casey, G.; Dubin, S.; Patterson, C.

2026-05-14 ecology 10.64898/2026.05.12.724588 medRxiv

Top 0.4%

0.6%

Global changes in land use and nutrient cycling are transforming ecosystems at unprecedented rates, with significant consequences for infectious disease dynamics. Aquatic environments are particularly vulnerable because the interplay of habitat modification, nutrient enrichment, and biodiversity loss can drive pronounced changes in the community composition of food webs, including hosts and parasites. Yet, despite well-documented effects of habitat modification on aquatic communities and food webs, the mechanisms through which these changes influence infectious disease dynamics remain poorly resolved. This gap arises, in part, because it remains challenging to disentangle how multiple stressors interact to shape disease outcomes and quantify parasite levels and host densities from field-collected samples. Here, we illustrate two tools that might help address these challenges. First, highly sensitive droplet digital PCR can quantify infection loads even when the signal:noise ratio is low. Second, stepwise Bayesian path analyses can identify the direct and indirect pathways connecting land-use changes to infectious disease dynamics. As a case study, we examined cyclopoid copepods and their helminth parasite, Schistocephalus solidus, across 47 freshwater lakes on Vancouver Island, a region strongly shaped by commercial logging, including widespread clear-cutting of old-growth forests. Our results reveal a positive correlation between copepod density and deforestation, potentially mediated by associated changes in water quality and calanoid copepods, key competitors of the focal host. ddPCR enabled sensitive detection of extremely low parasite signals in field-collected copepods. We detected positive infections in only 19.5% of the lakes surveyed, highlighting the difficulty of assessing disease dynamics in natural populations. 
Nonetheless, this study highlights the challenges of linking land-use change to disease outcomes, while also demonstrating that sensitive molecular and statistical tools offer new ways to reveal these hidden connections.

18

Evaluating longitudinal ecological models linking scientific production to population-level indicators: a global case study in mental health research

Acosta-Monterrosa, A. A.; Hernandez-Paez, D. A.; Visconti-Lopez, F. J.; Kalokoh, S.; Lozada-Martinez, I. D.

2026-05-15 scientific communication and education 10.64898/2026.05.09.723946 medRxiv

Top 0.4%

0.5%

Background: Quantifying the alignment between scientific production and population-level indicators remains a persistent methodological challenge in health research evaluation. While longitudinal ecological models have been increasingly used to explore associations between research output and societal outcomes, their feasibility, interpretability, and structural limitations have not been systematically examined. Methods: We conducted a longitudinal ecological meta-research analysis integrating global bibliometric data on mental health publications with country-level indicators of mental disorders, mental health infrastructure, and subjective well-being. Analyses were stratified by World Bank income groups and implemented using a three-step framework comprising income-specific linear regression models, random-effects meta-analyses, and meta-regressions to assess association patterns, heterogeneity, and potential moderators. Results: Scientific production was highly concentrated in high-income countries. Income-stratified regression models revealed divergent association patterns across contexts, with inverse associations observed in higher income groups and predominantly positive coefficients in low-income countries. Meta-analyses showed extreme between-group heterogeneity for most indicators, yielding largely attenuated pooled estimates. Only one subjective well-being indicator retained a significant pooled association. Conclusions: Longitudinal ecological models linking scientific production to population-level indicators can identify broad association patterns and structural asymmetries but are strongly constrained by contextual heterogeneity and data availability.

19

With great power comes great responsibility: how scientific supervisors shape the wellbeing of early-career researchers

Simon Martinez de Goni, X.; Marin-Pena, A. J.; Corrochano-Monsalve, M.; Bozal-Leorri, A.

2026-05-07 scientific communication and education 10.64898/2026.05.05.722947 medRxiv

Top 0.4%

0.5%

Scientific supervision is central to the experience of early-career researchers (ECRs), yet its role in shaping wellbeing and retention remains underexamined from the ECR perspective. We analyzed 2,604 anonymous survey responses from predoctoral, postdoctoral and former researchers across 65 countries. Overall, 76% of respondents reported that their supervisor's attitude had a moderate or severe impact on mental health. Although most entered academia for vocational reasons, negative experiences with supervisors were among the most frequently reported reasons for leaving among former researchers (48%), comparable to job insecurity and financial instability. Harm was most often associated with poor communication, disregard for wellbeing, micromanagement and competitiveness. In contrast, ECRs valued supportive rather than boss-like supervision, regular communication, realistic expectations and respect for personal time. These findings identify supervisory behavior as a major and modifiable determinant of ECRs' wellbeing and retention, and highlight the need for stronger institutional accountability, mentor training and funding incentives that recognize mentorship as a core component of research culture.

20

Wildlife feeding increases risk of male wild turkeys (Meleagris gallopavo) to hunter harvest

Lashley, M.; Leipold, E.; McDonald, B.; Baruzzi, C.

2026-05-04 ecology 10.64898/2026.04.30.721985 medRxiv

Top 0.4%

0.5%

Wildlife feeding during the wild turkey (Meleagris gallopavo) hunting season is legal in many states within the United States, but hunting turkeys with the aid of bait is unlawful in most states. The most common policy to prevent wildlife feeding from acting as bait is to restrict hunting within a defined radius. However, the effect of wildlife feeders on turkey harvest risk and the effectiveness of distance restrictions on mitigating that influence have not been investigated. During 2024-2025, we used GPS transmitters to track 30 adult male turkeys during the spring hunting season on private land with active feeders in Florida, USA, where hunting turkeys within a 91 m radius of a feeder was unlawful. We used Cox proportional hazards models to link risk of hunter harvest with unique feeders visited daily, number of feeders within a home range, and average morning distance and roosting distance to feeders at multiple temporal scales. Hunters harvested 53% of the tagged turkeys. Risk of hunter harvest increased with the number of unique feeders visited the previous day and, after the first three days of hunting season, with the number of active feeders within a home range. As distance from the most recent roost site and average morning distance to a feeder decreased, risk of hunter harvest increased. We estimated that risk of hunter harvest would be reduced by over 50% if distance restrictions were increased from 100 m to 200 m, by nearly 75% with an increase from 100 m to 300 m, and by nearly 90% with an increase from 100 m to 500 m. To completely eliminate the influence of wildlife feeders on risk of hunter harvest would require a restriction distance well beyond a 500 m radius, which is impractical given that this radius would result in an area twice the average private landowner property size in the region. 
Thus, if wildlife feeding during the turkey hunting season is to be allowed, it will act as bait, in which case, the acceptable level of its influence as bait can be achieved with the appropriate hunting radius restriction.