TY - JOUR
T1 - DeepSMILE
T2 - Contrastive self-supervised pre-training benefits MSI and HRD classification directly from H&E whole-slide images in colorectal and breast cancer
AU - Schirris, Yoni
AU - Gavves, Efstratios
AU - Nederlof, Iris
AU - Horlings, Hugo Mark
AU - Teuwen, Jonas
N1 - Funding Information: The collaboration project is co-funded by the PPP Allowance made available by Health Holland, Top Sector Life Sciences & Health, to stimulate public-private partnerships. Publisher Copyright: © 2022 Elsevier B.V.
PY - 2022/7/1
Y1 - 2022/7/1
AB - We propose a Deep learning-based weak label learning method for analyzing whole slide images (WSIs) of Hematoxylin and Eosin (H&E) stained tumor tissue not requiring pixel-level or tile-level annotations using Self-supervised pre-training and heterogeneity-aware deep Multiple Instance LEarning (DeepSMILE). We apply DeepSMILE to the task of homologous recombination deficiency (HRD) and microsatellite instability (MSI) prediction. We utilize contrastive self-supervised learning to pre-train a feature extractor on histopathology tiles of cancer tissue. Additionally, we use variability-aware deep multiple instance learning to learn the tile feature aggregation function while modeling tumor heterogeneity. For MSI prediction in a tumor-annotated and color-normalized subset of TCGA-CRC (n=360 patients), contrastive self-supervised learning improves the tile supervision baseline from 0.77 to 0.87 AUROC, on par with our proposed DeepSMILE method. On TCGA-BC (n=1041 patients) without any manual annotations, DeepSMILE improves HRD classification performance from 0.77 to 0.81 AUROC compared to tile supervision with either a self-supervised or ImageNet pre-trained feature extractor. Our proposed methods reach the baseline performance using only 40% of the labeled data on both datasets. These improvements suggest we can use standard self-supervised learning techniques combined with multiple instance learning in the histopathology domain to improve genomic label classification performance with less labeled data.
KW - Computational pathology
KW - Histogenomics
KW - Multiple instance learning
KW - Self-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85131223827&partnerID=8YFLogxK
UR - https://pure.uva.nl/ws/files/107966534/1_s2.0_S1361841522001116_mmc1.csv
UR - https://pure.uva.nl/ws/files/107966536/1_s2.0_S1361841522001116_mmc2.csv
UR - https://pure.uva.nl/ws/files/107966538/1_s2.0_S1361841522001116_mmc3.pdf
U2 - https://doi.org/10.1016/j.media.2022.102464
DO - https://doi.org/10.1016/j.media.2022.102464
M3 - Article
C2 - 35596966
SN - 1361-8415
VL - 79
SP - 102464
JO - Medical Image Analysis
JF - Medical Image Analysis
M1 - 102464
ER -