Assessing the Secondary Use and Scientific Impact of Shared Clinical Trial Data: A Cross-Sectional Study of Clinical Trials Shared on the YODA Project Platform
Taherifard, E.; Mooghali, M.; Hakimian, H. R.; Mane, S. R.; Fu, M.; Bamford, S.; Berlin, J. A.; Childers, K.; Desai, N. R.; Gross, C. P.; Hewens, D.; Lehman, R.; Ritchie, J. D.; Sargood, T.; Waldstreicher, J.; Wallach, J. D.; Willeford, M. K.; Krumholz, H. M.; Ross, J. S.
Show abstract
ObjectiveTo assess the number, timing of publication, characteristics, and scientific impact of secondary publications generated using individual participant-level data (IPD) from a portfolio of Johnson & Johnson-sponsored clinical trials shared with external investigators through a data sharing platform. DesignCross-sectional study. SettingYale University Open Data Access (YODA) Project platform. ParticipantsJohnson & Johnson-sponsored clinical trials listed on the YODA Project platform with IPD available for external sharing as of December 31, 2021, and with a full-length, peer-reviewed publication (i.e., primary publication) reporting primary endpoint results by the original trial investigators. Main outcome measuresNumber, timing of publication, research objectives, analysis type, and scientific impact of secondary publications using IPD from these trials identified through citation searches of primary publications in Web of Science through June 2025. Scientific impact metrics included journal impact factor, annual citation count, annual Altmetric Attention Score, and annual Mendeley reader count. Secondary publications were classified as internal (authored by at least one original trial investigator) or external. ResultsAmong 336 eligible trials, 265 (78.9%) had at least one associated secondary publication, totaling 1,167 secondary publications, of which 209 (17.9%) were external. Among external secondary publications for which the data access mechanism was reported (n=190; 90.9%), most obtained access through data sharing platforms (n=161; 84.7%), primarily the YODA Project (n=157; 82.6%). All secondary publications published from 3 years before through the first 2 years after the primary publication (n=161) were internal (100%). Over time, however, external publications increased steadily, exceeding 50% of all secondary publications by year 11 and thereafter. External secondary publications were more frequently pooled analyses (151/209 [72.2%] vs 534/958 [55.7%]; P<0.001). Predictive or prognostic modelling (108/209 [51.7%] vs 322/958 [33.6%]; P<0.001), development of statistical models or algorithms (60/209 [28.7%] vs 114/958 [11.9%]; P<0.001), and validation of existing methods, models, or risk scores (32/209 [15.3%] vs 66/958 [6.9%]; P<0.001) were more frequent among external than internal secondary publications. Compared to internal secondary publications, external secondary publications were published in journals with higher impact factors (median, 6.7 [IQR, 3.4-16.6] vs 4.6 [2.9-10.2]; P=0.002) and had higher annual Altmetric Attention Scores (median, 2.1 [0.7-7.1] vs 0.6 [0.3-2.3]; P<0.001), but lower annual citation counts (median, 2.7 [1.1-5.6] vs 3.4 [1.6-7.5]; P<0.001) and were less likely to be cited in clinical guidelines (21/184 [11.4%] vs 235/805 [29.2%], P<0.001) or policy documents (14/184 [7.6%] vs 206/805 [25.6%], P<0.001); there was no difference in annual Mendeley reader counts (median, 7.4 [3.9-13.0] vs 8.0 [5.1-13.6], P=0.13). ConclusionsClinical trial data shared with external investigators through a data sharing platform generated substantial and sustained secondary research by both original trial investigators and external investigators. The proportion of secondary publications from any clinical trial generated by external investigators increased over time as external investigators pursued complementary research objectives that achieved a comparable scientific impact. Structured data sharing mechanisms may further enhance the scientific impact of clinical trials. What is already known on this topicO_LISharing individual participant-level data (IPD) from clinical trials can promote transparency, reproducibility, and secondary research. C_LIO_LISeveral initiatives, including the Yale University Open Data Access (YODA) Project and government-supported data sharing platforms, provide external investigators with access to clinical trial data. C_LIO_LIWhile prior evaluations of secondary research generated from shared clinical trial data suggest that external investigators publications have citation impacts comparable to those of original trial investigators, overall evidence remains limited. C_LI What this study addsO_LIAnalysis of 336 industry-sponsored clinical trials with IPD shared through the YODA Project showed that most generated secondary publications, by both original trial investigators and external investigators. C_LIO_LIThe proportion of secondary publications from any clinical trial generated by external investigators increased over time, and compared with those generated by the original trial investigators, these publications more frequently use pooled analyses and focus on predictive or prognostic modelling and the development and validation of statistical methods. C_LIO_LISecondary publications generated by external investigators were more often published in higher-impact journals and received higher Altmetric Attention Scores, but had lower annual citation counts and were less likely to be cited in clinical guidelines or policy documents than those generated by the original trial investigators. C_LI
Matching journals
The top 6 journals account for 50% of the predicted probability mass.