
Automatic grading of cervical biopsies by combining full and self-supervision

Lubrano di Scandalea, M.; Lazard, T.; Balezo, G.; Bellahsen-Harrar, Y.; Badoual, C.; Berlemont, S.; Walter, T.

2022-01-17 cancer biology
10.1101/2022.01.14.476330 bioRxiv

In computational pathology, predictive models from Whole Slide Images (WSI) mostly rely on Multiple Instance Learning (MIL), where each WSI is represented as a bag of tiles, each of which is encoded by a Neural Network (NN). Slide-level predictions are then obtained by building models on the aggregation of these tile encodings. The tile encoding strategy thus plays a key role in such models. Current approaches include the use of encodings trained on unrelated data sources, full supervision, or self-supervision. While self-supervised learning (SSL) exploits unlabeled data, it often requires large computational resources to train. At the other end of the spectrum, fully supervised methods make use of valuable prior knowledge about the data but require a costly amount of expert time. This paper proposes a framework to reconcile SSL and full supervision, showing that a combination of both provides efficient encodings, both in terms of performance and in terms of biological interpretability. On a recently organized challenge on grading cervical biopsies, we show that our mixed supervision scheme reaches high performance (weighted accuracy (WA): 0.945), outperforming both SSL (WA: 0.927) and transfer learning from ImageNet (WA: 0.877). We further shed light on the internal representations that trigger classification results, providing a method to reveal relevant phenotypic patterns for grading cervical biopsies. We expect that the combination of full and self-supervision is an interesting strategy for many tasks in computational pathology and will be widely adopted by the field.
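
As a minimal sketch of the bag-of-tiles pipeline the abstract describes, and not the authors' implementation, the following pure-Python example encodes each tile, pools the tile encodings into one slide-level vector, and classifies that vector. The stand-in linear encoder, the choice of mean pooling, the dimensions, the number of classes, and the random weights are all assumptions made for illustration; in the paper the encodings come from trained neural networks.

```python
import math
import random


def encode_tile(tile, enc_weights):
    """Stand-in tile encoder (assumption): a linear map followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, tile))) for row in enc_weights]


def aggregate(encodings):
    """Mean-pool the tile encodings into one slide-level vector (assumption)."""
    n = len(encodings)
    return [sum(column) / n for column in zip(*encodings)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_slide(tiles, enc_weights, clf_weights):
    """Bag of tiles -> tile encodings -> pooled vector -> class probabilities."""
    encodings = [encode_tile(tile, enc_weights) for tile in tiles]
    slide_vector = aggregate(encodings)
    logits = [sum(w * x for w, x in zip(row, slide_vector)) for row in clf_weights]
    return softmax(logits)


# Hypothetical sizes and random weights, chosen only so the sketch runs.
random.seed(0)
TILE_DIM, ENC_DIM, N_CLASSES = 8, 4, 3
enc_weights = [[random.uniform(-1, 1) for _ in range(TILE_DIM)] for _ in range(ENC_DIM)]
clf_weights = [[random.uniform(-1, 1) for _ in range(ENC_DIM)] for _ in range(N_CLASSES)]

# One "slide" represented as a bag of 50 toy tile feature vectors.
slide = [[random.random() for _ in range(TILE_DIM)] for _ in range(50)]
probs = predict_slide(slide, enc_weights, clf_weights)
print(probs)
```

Because the pooling step is a mean, the prediction does not depend on the order of the tiles, which is what treating a slide as a bag rather than a sequence implies. The quality of `encode_tile` is the part the abstract argues matters most.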

Matching journals

The top 12 journals account for 50% of the predicted probability mass.

1 | PLOS Computational Biology | 1633 papers in training set | Top 3% | 10.7%
2 | Scientific Reports | 3102 papers in training set | Top 11% | 7.4%
3 | Modern Pathology | 21 papers in training set | Top 0.1% | 5.0%
4 | Journal of Medical Imaging | 11 papers in training set | Top 0.1% | 4.4%
5 | PLOS ONE | 4510 papers in training set | Top 35% | 4.1%
6 | Medical Image Analysis | 33 papers in training set | Top 0.3% | 3.7%
7 | Expert Systems with Applications | 11 papers in training set | Top 0.1% | 3.7%
8 | Communications Biology | 886 papers in training set | Top 3% | 3.2%
9 | npj Precision Oncology | 48 papers in training set | Top 0.2% | 3.0%
10 | Nature Communications | 4913 papers in training set | Top 45% | 2.5%
11 | Biological Imaging | 15 papers in training set | Top 0.1% | 2.1%
12 | Bioinformatics | 1061 papers in training set | Top 6% | 2.1%
50% of probability mass above
13 | Cytometry Part A | 30 papers in training set | Top 0.1% | 1.9%
14 | iScience | 1063 papers in training set | Top 11% | 1.9%
15 | Frontiers in Bioinformatics | 45 papers in training set | Top 0.2% | 1.8%
16 | Cancers | 200 papers in training set | Top 3% | 1.7%
17 | Diagnostics | 48 papers in training set | Top 1% | 1.7%
18 | Patterns | 70 papers in training set | Top 1% | 1.5%
19 | Journal of Computational Biology | 37 papers in training set | Top 0.3% | 1.4%
20 | Genome Medicine | 154 papers in training set | Top 5% | 1.4%
21 | Cancer Research | 116 papers in training set | Top 2% | 1.3%
22 | Biology Methods and Protocols | 53 papers in training set | Top 2% | 1.1%
23 | Frontiers in Cell and Developmental Biology | 218 papers in training set | Top 6% | 1.1%
24 | Computers in Biology and Medicine | 120 papers in training set | Top 3% | 1.0%
25 | Bioengineering | 24 papers in training set | Top 1.0% | 0.9%
26 | BMC Bioinformatics | 383 papers in training set | Top 6% | 0.8%
27 | Frontiers in Molecular Biosciences | 100 papers in training set | Top 5% | 0.8%
28 | Genome Biology | 555 papers in training set | Top 7% | 0.8%
29 | Laboratory Investigation | 13 papers in training set | Top 0.2% | 0.8%
30 | Cell Reports Medicine | 140 papers in training set | Top 7% | 0.8%
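
The statement that the top 12 journals account for 50% of the predicted probability mass can be checked against the listed percentages with a running sum. The sketch below uses the rounded values shown in the list, so the cumulative figure is approximate; the function name `journals_needed` is hypothetical.

```python
# Predicted probabilities (in percent) for ranks 1 to 30, as listed above.
probabilities = [
    10.7, 7.4, 5.0, 4.4, 4.1, 3.7, 3.7, 3.2, 3.0, 2.5,
    2.1, 2.1, 1.9, 1.9, 1.8, 1.7, 1.7, 1.5, 1.4, 1.4,
    1.3, 1.1, 1.1, 1.0, 0.9, 0.8, 0.8, 0.8, 0.8, 0.8,
]


def journals_needed(probs, threshold):
    """Return (rank, cumulative mass) at which the running sum first reaches threshold.

    Returns None if the listed probabilities never reach the threshold.
    """
    cumulative = 0.0
    for rank, p in enumerate(probs, start=1):
        cumulative += p
        if cumulative >= threshold:
            return rank, cumulative
    return None


rank, mass = journals_needed(probabilities, 50.0)
print(f"{rank} journals cover about {mass:.1f}% of the probability mass")
```

With the rounded values, the running sum is about 49.8% after rank 11 and about 51.9% after rank 12, which agrees with the 50% marker placed after the twelfth journal.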