Preprints in motion: tracking changes between posting and journal publication

Polka, J. K.; Dey, G.; Palfy, M.; Nanni, F.; Brierley, L.; Fraser, N.; Coates, J. A.

2021-02-20 scientific communication and education
10.1101/2021.02.20.432090 bioRxiv
Amidst the COVID-19 pandemic, preprints in the biomedical sciences are being posted and accessed at unprecedented rates, drawing widespread attention from the general public, press and policymakers for the first time. This phenomenon has sharpened longstanding questions about the reliability of information shared prior to journal peer review. Does the information shared in preprints typically withstand the scrutiny of peer review, or are conclusions likely to change in the version of record? We assessed preprints from bioRxiv and medRxiv that had been posted and subsequently published in a journal through 30th April 2020, representing the initial phase of the pandemic response. We utilised a combination of automatic and manual annotations to quantify how an article changed between the preprinted and published version. We found that the total number of figure panels and tables changed little between preprint and published articles. Moreover, the conclusions of 7.2% of non-COVID-19-related and 17.2% of COVID-19-related abstracts undergo a discrete change by the time of publication, but the majority of these changes do not qualitatively change the conclusions of the paper.

Matching journals

The top-ranked journal alone accounts for at least 50% of the predicted probability mass (53.1%).

1. PLOS Biology; 408 papers in training set; Top 0.1%; 53.1%
(50% of probability mass above)
2. eLife; 5422 papers in training set; Top 5%; 10.3%
3. Nature Biotechnology; 147 papers in training set; Top 2%; 5.0%
4. Nature Neuroscience; 216 papers in training set; Top 2%; 4.4%
5. PLOS ONE; 4510 papers in training set; Top 38%; 3.7%
6. JAMA Network Open; 127 papers in training set; Top 2%; 1.7%
7. GigaScience; 172 papers in training set; Top 2%; 1.5%
8. PeerJ; 261 papers in training set; Top 9%; 1.4%
9. Patterns; 70 papers in training set; Top 1%; 1.2%
10. Communications Biology; 886 papers in training set; Top 14%; 1.2%
11. Bioinformatics; 1061 papers in training set; Top 8%; 1.1%
12. BioData Mining; 15 papers in training set; Top 0.6%; 0.9%
13. Nature Human Behaviour; 85 papers in training set; Top 4%; 0.9%
14. Nature Genetics; 240 papers in training set; Top 7%; 0.8%
15. Journal of the American Medical Informatics Association; 61 papers in training set; Top 2%; 0.8%
16. Royal Society Open Science; 193 papers in training set; Top 5%; 0.8%
17. PLOS Computational Biology; 1633 papers in training set; Top 24%; 0.8%
18. Briefings in Bioinformatics; 326 papers in training set; Top 6%; 0.8%
19. Genomics, Proteomics & Bioinformatics; 171 papers in training set; Top 6%; 0.8%
20. BMC Medicine; 163 papers in training set; Top 7%; 0.8%
21. eneuro; 389 papers in training set; Top 10%; 0.7%
22. Scientific Reports; 3102 papers in training set; Top 80%; 0.5%
23. Nature; 575 papers in training set; Top 18%; 0.5%
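
The "50% of probability mass" statement can be checked against the listed percentages with a running sum. The sketch below is illustrative only: the page does not say how it computes this figure, and the names `probabilities` and `journals_to_reach` are hypothetical. It assumes the figure is the smallest number of top-ranked journals whose cumulative predicted probability reaches the threshold.

```python
# Predicted probabilities (in percent) copied from the listing, in rank order.
probabilities = [
    53.1, 10.3, 5.0, 4.4, 3.7, 1.7, 1.5, 1.4, 1.2, 1.2, 1.1, 0.9,
    0.9, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.8, 0.7, 0.5, 0.5,
]


def journals_to_reach(probs, threshold):
    """Return how many top-ranked entries are needed for the running
    sum of probs to reach threshold, or None if it is never reached."""
    total = 0.0
    for count, p in enumerate(probs, start=1):
        total += p
        if total >= threshold:
            return count
    return None


# Assumption: "50% of probability mass" means a cumulative threshold of 50.
top_n = journals_to_reach(probabilities, 50.0)
print(top_n)
```

With the values listed here, the first entry (53.1%) already exceeds the threshold on its own, which matches the statement that a single journal covers half of the probability mass. The 23 listed entries do not sum to 100%, so the remainder presumably belongs to journals not shown.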