Ensembles for improved detection of invasive breast cancer in histological images

Solorzano, L.; Robertson, S.; Hartman, J.; Rantalainen, M.

2023-04-14 bioinformatics
10.1101/2023.04.13.536542 bioRxiv

Accurate detection of invasive breast cancer (IC) can provide decision support to pathologists as well as improve downstream computational analyses, where detection of IC is a first step. Tissue containing IC is characterized by the presence of specific morphological features, which can be learned by convolutional neural networks (CNNs). Here, we compare the use of a single CNN model versus an ensemble of several base models with the same CNN architecture, and we evaluate prediction performance as well as variability across ensemble-based model predictions. Two in-house datasets comprising 587 whole-slide images (WSIs) were used to train an ensemble of ten InceptionV3 models whose consensus is used to determine the presence of IC. A novel visualization strategy was developed to communicate ensemble agreement spatially. Performance was evaluated in an internal test set with 118 WSIs, and in an additional external dataset (TCGA breast cancer) with 157 WSIs. We observed that the ensemble-based strategy outperformed the single CNN-model alternative with respect to accuracy on tile level in 89% of all WSIs in the test set. The overall accuracy was 0.92 (DICE coefficient, 0.90) for the ensemble model, and 0.85 (DICE coefficient, 0.83) for the single CNN alternative in the internal test set. For TCGA, the ensemble outperformed the single CNN in 96.8% of the WSIs, with an accuracy of 0.87 (DICE coefficient, 0.89) compared with 0.75 (DICE coefficient, 0.78) for the single model. The results suggest that an ensemble-based modeling strategy for invasive breast cancer detection consistently outperforms the conventional single-model alternative. Furthermore, visualization of the ensemble agreement and confusion areas provides direct visual interpretation of the results. High-performing cancer detection can provide decision support in the routine pathology setting as well as facilitate downstream computational analyses.
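
To make the tile-level evaluation described in the abstract concrete, here is a minimal sketch of combining binary tile predictions from several models and scoring the result with accuracy and the Dice coefficient. The abstract does not specify how the consensus is formed, so the majority-vote rule, the function names, and the toy data (three models, four tiles) are all illustrative assumptions rather than the authors' implementation.

```python
from typing import List, Sequence


def ensemble_consensus(votes: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Combine per-model binary tile predictions by majority vote.

    votes[m][t] is model m's prediction (1 = IC, 0 = not IC) for tile t.
    ASSUMPTION: a tile is called IC when more than `threshold` of the
    models vote for it; the paper's actual consensus rule may differ.
    """
    n_models = len(votes)
    n_tiles = len(votes[0])
    consensus = []
    for t in range(n_tiles):
        positive = sum(votes[m][t] for m in range(n_models))
        consensus.append(1 if positive / n_models > threshold else 0)
    return consensus


def accuracy(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Fraction of tiles whose predicted label matches the ground truth."""
    return sum(p == t for p, t in zip(pred, truth)) / len(truth)


def dice(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice coefficient: 2*TP / (2*TP + FP + FN) on binary labels."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    denom = 2 * tp + fp + fn
    # Convention chosen here: two empty masks agree perfectly.
    return 2 * tp / denom if denom else 1.0


# Toy example (hypothetical data): 3 models, 4 tiles.
votes = [
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 1],
]
truth = [1, 0, 1, 1]

consensus = ensemble_consensus(votes)
acc = accuracy(consensus, truth)
dsc = dice(consensus, truth)
```

The per-tile vote fraction computed inside `ensemble_consensus` is also the kind of quantity one could map spatially to show where models agree or disagree, although the visualization strategy mentioned in the abstract is not described in enough detail here to reproduce.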

Matching journals

The top 7 journals account for 50% of the predicted probability mass.

1. PLOS Computational Biology; 1633 papers in training set; Top 2%; 14.8%
2. Scientific Reports; 3102 papers in training set; Top 6%; 10.1%
3. BMC Bioinformatics; 383 papers in training set; Top 1%; 8.4%
4. Bioinformatics; 1061 papers in training set; Top 4%; 6.3%
5. Frontiers in Bioinformatics; 45 papers in training set; Top 0.1%; 4.9%
6. Journal of Pathology Informatics; 13 papers in training set; Top 0.1%; 4.0%
7. iScience; 1063 papers in training set; Top 3%; 4.0%

50% of probability mass above

8. PLOS ONE; 4510 papers in training set; Top 39%; 3.6%
9. Computational and Structural Biotechnology Journal; 216 papers in training set; Top 2%; 2.9%
10. Cancers; 200 papers in training set; Top 2%; 2.4%
11. npj Precision Oncology; 48 papers in training set; Top 0.4%; 2.1%
12. Nature Communications; 4913 papers in training set; Top 49%; 1.8%
13. Modern Pathology; 21 papers in training set; Top 0.2%; 1.7%
14. GigaScience; 172 papers in training set; Top 2%; 1.5%
15. Biological Imaging; 15 papers in training set; Top 0.1%; 1.3%
16. Biology Methods and Protocols; 53 papers in training set; Top 1%; 1.3%
17. BMC Medical Informatics and Decision Making; 39 papers in training set; Top 2%; 1.0%
18. IEEE Access; 31 papers in training set; Top 0.8%; 0.9%
19. JCO Clinical Cancer Informatics; 18 papers in training set; Top 0.7%; 0.9%
20. Computer Methods and Programs in Biomedicine; 27 papers in training set; Top 0.8%; 0.8%
21. Briefings in Bioinformatics; 326 papers in training set; Top 6%; 0.8%
22. Computers in Biology and Medicine; 120 papers in training set; Top 4%; 0.8%
23. Communications Biology; 886 papers in training set; Top 21%; 0.8%
24. npj Systems Biology and Applications; 99 papers in training set; Top 2%; 0.7%
25. Bioengineering; 24 papers in training set; Top 1%; 0.7%
26. Genomics, Proteomics & Bioinformatics; 171 papers in training set; Top 6%; 0.7%
27. Journal of Medical Imaging; 11 papers in training set; Top 0.4%; 0.7%
28. Breast Cancer Research; 32 papers in training set; Top 0.5%; 0.7%
29. Frontiers in Cell and Developmental Biology; 218 papers in training set; Top 10%; 0.7%
30. Annals of Biomedical Engineering; 34 papers in training set; Top 1%; 0.6%
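
The statement that the top 7 journals account for 50% of the predicted probability mass can be checked directly from the listed percentages. The short sketch below accumulates the first ten probabilities in rank order and reports the first rank at which the running total reaches the target; the function name `rank_reaching` is an illustrative choice, not part of the source page.

```python
from typing import Optional, Sequence

# Predicted probabilities (in percent) for ranks 1-10, copied from the list above.
probabilities = [14.8, 10.1, 8.4, 6.3, 4.9, 4.0, 4.0, 3.6, 2.9, 2.4]


def rank_reaching(probs: Sequence[float], target: float) -> Optional[int]:
    """Return the first 1-based rank at which the cumulative sum reaches target.

    Returns None if the listed probabilities never reach the target.
    """
    total = 0.0
    for rank, p in enumerate(probs, start=1):
        total += p
        if total >= target:
            return rank
    return None


cutoff_rank = rank_reaching(probabilities, 50.0)
```

With these values the running total is about 48.5% after rank 6 and about 52.5% after rank 7, which matches the position of the 50% marker in the list.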