
Assessing the Secondary Use and Scientific Impact of Shared Clinical Trial Data: A Cross-Sectional Study of Clinical Trials Shared on the YODA Project Platform

Taherifard, E.; Mooghali, M.; Hakimian, H. R.; Mane, S. R.; Fu, M.; Bamford, S.; Berlin, J. A.; Childers, K.; Desai, N. R.; Gross, C. P.; Hewens, D.; Lehman, R.; Ritchie, J. D.; Sargood, T.; Waldstreicher, J.; Wallach, J. D.; Willeford, M. K.; Krumholz, H. M.; Ross, J. S.

2026-03-26 public and global health
10.64898/2026.03.26.26349328 medRxiv

Objective: To assess the number, timing of publication, characteristics, and scientific impact of secondary publications generated using individual participant-level data (IPD) from a portfolio of Johnson & Johnson-sponsored clinical trials shared with external investigators through a data sharing platform.
Design: Cross-sectional study.
Setting: Yale University Open Data Access (YODA) Project platform.
Participants: Johnson & Johnson-sponsored clinical trials listed on the YODA Project platform with IPD available for external sharing as of December 31, 2021, and with a full-length, peer-reviewed publication (i.e., primary publication) reporting primary endpoint results by the original trial investigators.
Main outcome measures: Number, timing of publication, research objectives, analysis type, and scientific impact of secondary publications using IPD from these trials identified through citation searches of primary publications in Web of Science through June 2025. Scientific impact metrics included journal impact factor, annual citation count, annual Altmetric Attention Score, and annual Mendeley reader count. Secondary publications were classified as internal (authored by at least one original trial investigator) or external.
Results: Among 336 eligible trials, 265 (78.9%) had at least one associated secondary publication, totaling 1,167 secondary publications, of which 209 (17.9%) were external. Among external secondary publications for which the data access mechanism was reported (n=190; 90.9%), most obtained access through data sharing platforms (n=161; 84.7%), primarily the YODA Project (n=157; 82.6%). All secondary publications published from 3 years before through the first 2 years after the primary publication (n=161) were internal (100%). Over time, however, external publications increased steadily, exceeding 50% of all secondary publications by year 11 and thereafter.
External secondary publications were more frequently pooled analyses (151/209 [72.2%] vs 534/958 [55.7%]; P<0.001). Predictive or prognostic modelling (108/209 [51.7%] vs 322/958 [33.6%]; P<0.001), development of statistical models or algorithms (60/209 [28.7%] vs 114/958 [11.9%]; P<0.001), and validation of existing methods, models, or risk scores (32/209 [15.3%] vs 66/958 [6.9%]; P<0.001) were more frequent among external than internal secondary publications. Compared to internal secondary publications, external secondary publications were published in journals with higher impact factors (median, 6.7 [IQR, 3.4-16.6] vs 4.6 [2.9-10.2]; P=0.002) and had higher annual Altmetric Attention Scores (median, 2.1 [0.7-7.1] vs 0.6 [0.3-2.3]; P<0.001), but lower annual citation counts (median, 2.7 [1.1-5.6] vs 3.4 [1.6-7.5]; P<0.001) and were less likely to be cited in clinical guidelines (21/184 [11.4%] vs 235/805 [29.2%]; P<0.001) or policy documents (14/184 [7.6%] vs 206/805 [25.6%]; P<0.001); there was no difference in annual Mendeley reader counts (median, 7.4 [3.9-13.0] vs 8.0 [5.1-13.6]; P=0.13).
Conclusions: Clinical trial data shared with external investigators through a data sharing platform generated substantial and sustained secondary research by both original trial investigators and external investigators. The proportion of secondary publications from any clinical trial generated by external investigators increased over time as external investigators pursued complementary research objectives that achieved a comparable scientific impact. Structured data sharing mechanisms may further enhance the scientific impact of clinical trials.

What is already known on this topic
- Sharing individual participant-level data (IPD) from clinical trials can promote transparency, reproducibility, and secondary research.
- Several initiatives, including the Yale University Open Data Access (YODA) Project and government-supported data sharing platforms, provide external investigators with access to clinical trial data.
- While prior evaluations of secondary research generated from shared clinical trial data suggest that external investigators' publications have citation impacts comparable to those of original trial investigators, overall evidence remains limited.

What this study adds
- Analysis of 336 industry-sponsored clinical trials with IPD shared through the YODA Project showed that most generated secondary publications, by both original trial investigators and external investigators.
- The proportion of secondary publications from any clinical trial generated by external investigators increased over time, and compared with those generated by the original trial investigators, these publications more frequently use pooled analyses and focus on predictive or prognostic modelling and the development and validation of statistical methods.
- Secondary publications generated by external investigators were more often published in higher-impact journals and received higher Altmetric Attention Scores, but had lower annual citation counts and were less likely to be cited in clinical guidelines or policy documents than those generated by the original trial investigators.
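
The headline percentages in the Results can be recomputed directly from the counts the abstract reports. The sketch below does that for a few of them and also runs a pooled two-proportion z-test on the pooled-analysis comparison (151/209 vs 534/958). The abstract does not say which statistical test produced its P values, so the z-test is an assumption used only for illustration, and the helper names `pct` and `two_proportion_p` are hypothetical.

```python
import math


def pct(numerator, denominator):
    """Percentage rounded to one decimal place, matching the abstract's style."""
    return round(100 * numerator / denominator, 1)


def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value from a pooled two-proportion z-test.

    Assumption: the abstract does not name its test; this is one common choice.
    """
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (x1 / n1 - x2 / n2) / se
    # For a standard normal Z, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Counts copied from the abstract.
trials_with_secondary = pct(265, 336)   # trials with >= 1 secondary publication
external_share = pct(209, 1167)         # external share of secondary publications
pooled_external = pct(151, 209)         # pooled analyses, external
pooled_internal = pct(534, 958)         # pooled analyses, internal
internal_total = 1167 - 209             # implied number of internal publications
p_pooled = two_proportion_p(151, 209, 534, 958)

print(trials_with_secondary, external_share, pooled_external, pooled_internal)
print(internal_total, p_pooled < 0.001)
```

Under this assumed test, the recomputed values agree with the reported percentages and with the reported P<0.001 for the pooled-analysis comparison; the 958 denominator used for internal publications is also consistent with 1,167 minus 209.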

Matching journals

The top 6 journals account for 50% of the predicted probability mass.

Rank | Journal | Papers in training set | Percentile | Predicted probability
1 | BMJ Open | 554 | Top 1% | 14.2%
2 | Journal of Clinical Epidemiology | 28 | Top 0.1% | 10.0%
3 | Journal of the American Medical Informatics Association | 61 | Top 0.3% | 10.0%
4 | PLOS ONE | 4510 | Top 26% | 6.7%
5 | Trials | 25 | Top 0.2% | 6.2%
6 | BMC Medical Research Methodology | 43 | Top 0.2% | 4.8%
(50% of probability mass above)
7 | Journal of Medical Internet Research | 85 | Top 1.0% | 4.8%
8 | npj Digital Medicine | 97 | Top 0.9% | 4.8%
9 | Research Synthesis Methods | 20 | Top 0.1% | 3.9%
10 | JAMA Network Open | 127 | Top 1% | 3.0%
11 | Pharmacoepidemiology and Drug Safety | 13 | Top 0.1% | 3.0%
12 | BMC Medicine | 163 | Top 4% | 1.7%
13 | Annals of Internal Medicine | 27 | Top 0.4% | 1.6%
14 | eLife | 5422 | Top 49% | 1.2%
15 | F1000Research | 79 | Top 3% | 1.1%
16 | International Journal of Epidemiology | 74 | Top 2% | 0.9%
17 | Journal of Clinical and Translational Science | 11 | Top 0.3% | 0.9%
18 | PLOS Biology | 408 | Top 17% | 0.9%
19 | Scientific Data | 174 | Top 2% | 0.9%
20 | PLOS Medicine | 98 | Top 4% | 0.8%
21 | Nature Communications | 4913 | Top 63% | 0.7%
22 | JMIR Public Health and Surveillance | 45 | Top 4% | 0.7%
23 | Clinical and Translational Science | 21 | Top 1% | 0.7%
24 | Journal of General Internal Medicine | 20 | Top 1% | 0.7%
25 | FACETS | 11 | Top 0.3% | 0.7%
26 | Scientific Reports | 3102 | Top 77% | 0.7%
27 | Journal of Translational Medicine | 46 | Top 3% | 0.7%
28 | BMJ | 49 | Top 1% | 0.6%
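
The statement that the top 6 journals account for 50% of the predicted probability mass can be checked by accumulating the listed probabilities in rank order and finding the first rank at which the running total reaches 50%. The sketch below assumes that this is how the cutoff was determined; the page does not describe its method, and the variable names are illustrative.

```python
from itertools import accumulate

# Predicted probabilities (in percent) for ranks 1 to 28, as listed above.
probabilities = [
    14.2, 10.0, 10.0, 6.7, 6.2, 4.8, 4.8, 4.8, 3.9, 3.0,
    3.0, 1.7, 1.6, 1.2, 1.1, 0.9, 0.9, 0.9, 0.9, 0.8,
    0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.7, 0.6,
]

# Running total of probability mass by rank.
cumulative = list(accumulate(probabilities))

# Smallest rank (1-based) whose cumulative mass reaches at least 50%.
cutoff_rank = next(
    rank for rank, mass in enumerate(cumulative, start=1) if mass >= 50.0
)

print(cutoff_rank)  # prints 6
```

The running total after rank 5 is still below 50% and passes it at rank 6, which matches the position of the "50% of probability mass above" marker in the list.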