
Pilot study demonstrating changes in DNA hydroxymethylation enable detection of multiple cancers in plasma cell-free DNA

Bergamaschi, A.; Ning, Y.; Ku, C.-J.; Ellison, C.; Collin, F.; Guler, G.; Phillips, T.; McCarthy, E.; Wang, W.; Antoine, M.; Scott, A.; Lloyd, P.; Ashworth, A.; Quake, S.; Levy, S.

2020-01-27 genetic and genomic medicine
10.1101/2020.01.22.20018382 medRxiv

Our study profiled 5-hydroxymethylcytosine (5hmC) in cell-free DNA (cfDNA) from the plasma of cancer patients using a novel enrichment technology coupled with sequencing and a machine-learning-based classification method. These classification methods were developed to detect the presence of disease in the plasma of cancer and control subjects. Cancer patient cfDNA cohorts were accrued from multiple sites and consisted of 48 breast, 55 lung, 32 prostate and 53 pancreatic cancer subjects. In addition, a control cohort of 180 non-cancer subjects was employed to match cancer patient demographics (age, sex and smoking status) in a case-control study design. Logistic regression methods applied to each cancer case cohort individually, with a balancing non-cancer cohort, classified cancer and control samples with high performance. Predictive performance, measured by 5-fold cross-validation with out-of-fold area under the curve (AUC), was 0.89, 0.84, 0.95 and 0.83 for breast, lung, pancreatic and prostate cancer, respectively. The genes defining each of these predictive models were enriched for pathways relevant to disease-specific etiology, notably in the control of gene regulation in these same pathways. The breast cancer cohort consisted primarily of stage I and II patients, including tumors < 2 cm, and these samples exhibited a high cancer probability score. This suggests that the 5hmC-derived classification methodology may yield epigenomic detection of early-stage disease in plasma. The same observation was made for the pancreatic dataset, where >50% of cancers were stage I or II and showed the highest cancer probability score.
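The evaluation scheme named in the abstract (logistic regression scored by 5-fold cross-validation with out-of-fold AUC) can be illustrated with a minimal, standard-library-only sketch. Everything here is an assumption made for illustration: the data are synthetic Gaussian features rather than 5hmC profiles, the function names (`make_synthetic`, `fit_logistic`, `out_of_fold_scores`, `auc`) are hypothetical, and the plain gradient-descent fit with a small L2 penalty stands in for whatever regularisation and feature selection the authors actually used.

```python
import math
import random


def make_synthetic(n_per_class=60, n_features=5, shift=1.0, seed=1):
    """Synthetic stand-in data: class 1 features are shifted by `shift`."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            X.append([rng.gauss(shift * label, 1.0) for _ in range(n_features)])
            y.append(label)
    return X, y


def predict_proba(w, b, x):
    """Logistic model probability for one sample (z clamped to avoid overflow)."""
    z = b + sum(wj * xj for wj, xj in zip(w, x))
    z = max(-30.0, min(30.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=200, l2=0.01):
    """Fit L2-penalised logistic regression by batch gradient descent."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = predict_proba(w, b, xi) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * (gj / n + l2 * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def out_of_fold_scores(X, y, k=5, seed=0):
    """Score every sample with a model that never saw it during training."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    oof = [None] * len(X)
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        w, b = fit_logistic([X[i] for i in train], [y[i] for i in train])
        for i in held_out:
            oof[i] = predict_proba(w, b, X[i])
    return oof


def auc(scores, labels):
    """AUC as the fraction of (case, control) pairs ranked correctly; ties count half."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    X, y = make_synthetic()
    print(f"out-of-fold AUC: {auc(out_of_fold_scores(X, y)):.2f}" if False else
          f"out-of-fold AUC: {auc(out_of_fold_scores(X, y), y):.2f}")
```

The key design point is that each sample's score comes from a model trained on the other four folds, and the AUC is then computed once over the pooled out-of-fold scores, so the estimate is not inflated by scoring samples the model was fitted on.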

Matching journals

The top 8 journals account for 50% of the predicted probability mass.

1. Scientific Reports (3102 papers in training set): Top 0.4%, 22.6%
2. Frontiers in Bioinformatics (45 papers in training set): Top 0.1%, 6.4%
3. Frontiers in Genetics (197 papers in training set): Top 1%, 4.9%
4. Heliyon (146 papers in training set): Top 0.2%, 4.3%
5. Cancer Research Communications (46 papers in training set): Top 0.1%, 4.2%
6. Nature Communications (4913 papers in training set): Top 39%, 3.6%
7. Genome Medicine (154 papers in training set): Top 2%, 3.6%
8. International Journal of Molecular Sciences (453 papers in training set): Top 3%, 3.3%

50% of probability mass above

9. Frontiers in Molecular Biosciences (100 papers in training set): Top 0.6%, 3.1%
10. PLOS ONE (4510 papers in training set): Top 42%, 3.1%
11. Cancers (200 papers in training set): Top 2%, 2.7%
12. BMC Genomics (328 papers in training set): Top 2%, 1.9%
13. Biosensors and Bioelectronics (52 papers in training set): Top 0.8%, 1.7%
14. Epigenetics (43 papers in training set): Top 0.4%, 1.7%
15. Communications Biology (886 papers in training set): Top 9%, 1.7%
16. Biology (43 papers in training set): Top 0.8%, 1.7%
17. Genes (126 papers in training set): Top 1%, 1.5%
18. Informatics in Medicine Unlocked (21 papers in training set): Top 0.5%, 1.5%
19. Frontiers in Oncology (95 papers in training set): Top 3%, 1.2%
20. Clinical Chemistry (22 papers in training set): Top 0.6%, 1.1%
21. Archives of Clinical and Biomedical Research (28 papers in training set): Top 2%, 1.0%
22. iScience (1063 papers in training set): Top 24%, 1.0%
23. Journal of Bioinformatics and Systems Biology (14 papers in training set): Top 0.4%, 1.0%
24. The Journal of Molecular Diagnostics (36 papers in training set): Top 0.3%, 1.0%
25. Frontiers in Immunology (586 papers in training set): Top 6%, 0.9%
26. Briefings in Bioinformatics (326 papers in training set): Top 6%, 0.8%
27. Oncotarget (15 papers in training set): Top 0.3%, 0.8%
28. Biomedicines (66 papers in training set): Top 3%, 0.7%
29. Frontiers in Cell and Developmental Biology (218 papers in training set): Top 10%, 0.7%
30. BMC Medicine (163 papers in training set): Top 8%, 0.6%
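
The 50% cut-off in the ranking can be checked by accumulating the listed probabilities from the top until the running total reaches 50%. The sketch below uses the first ten percentages from the list as displayed (so they are rounded values), and the function name `journals_to_reach` is a hypothetical helper, not part of the site.

```python
from itertools import accumulate


def journals_to_reach(probabilities, threshold=50.0):
    """Count how many top-ranked entries are needed for the running total to reach `threshold`."""
    for count, running_total in enumerate(accumulate(probabilities), start=1):
        if running_total >= threshold:
            return count
    return None  # threshold never reached with the entries given


# Predicted probabilities (in percent) for ranks 1 to 10, as displayed in the list.
top_probabilities = [22.6, 6.4, 4.9, 4.3, 4.2, 3.6, 3.6, 3.3, 3.1, 3.1]

print(journals_to_reach(top_probabilities))  # prints 8
```

The first seven entries sum to about 49.6% and the eighth brings the total to about 52.9%, which agrees with the statement that the top 8 journals account for 50% of the predicted probability mass.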