Privacy-preserving distributed learning of radiomics to predict overall survival and HPV status in head and neck cancer

Marta Bogowicz; Arthur Jochems; Timo M. Deist; Stephanie Tanadini-Lang; Shao Hui Huang; Biu Chan; John N. Waldron; Scott Bratman; Brian O’Sullivan; Oliver Riesterer; Gabriela Studer; Jan Unkelbach; Samir Barakat; Ruud H. Brakenhoff; Irene Nauta; Silvia E. Gazzani; Giuseppina Calareso; Kathrin Scheckenbach; Frank Hoebers; Frederik W.R. Wesseling; Simon Keek; Sebastian Sanduleanu; Ralph T.H. Leijenaar; Marije R. Vergeer; C. René Leemans; Chris H.J. Terhaard; Michiel W.M. van den Brekel; Olga Hamming-Vrieze; Martijn A. van der Heijden; Hesham M. Elhalawani; Clifton D. Fuller; Matthias Guckenberger; Philippe Lambin

doi:https://doi.org/10.1038/s41598-020-61297-4

Research output: Contribution to journal › Article › Academic › peer-review

44 Citations (Scopus)

Abstract

A major challenge in radiomics is assembling data from multiple centers. Sharing data between hospitals is restricted by legal and ethical regulations. Distributed learning is a technique, enabling training models on multicenter data without data leaving the hospitals (“privacy-preserving” distributed learning). This study tested feasibility of distributed learning of radiomics data for prediction of two year overall survival and HPV status in head and neck cancer (HNC) patients. Pretreatment CT images were collected from 1174 HNC patients in 6 different cohorts. 981 radiomic features were extracted using Z-Rad software implementation. Hierarchical clustering was performed to preselect features. Classification was done using logistic regression. In the validation dataset, the receiver operating characteristics (ROC) were compared between the models trained in the centralized and distributed manner. No difference in ROC was observed with respect to feature selection. The logistic regression coefficients were identical between the methods (absolute difference <10⁻⁷). In comparison of the full workflow (feature selection and classification), no significant difference in ROC was found between centralized and distributed models for both studied endpoints (DeLong p > 0.05). In conclusion, both feature selection and classification are feasible in a distributed manner using radiomics data, which opens new possibility for training more reliable radiomics models.

Original language	English
Article number	4542
Journal	Scientific Reports
Volume	10
Issue number	1
DOIs	https://doi.org/10.1038/s41598-020-61297-4
Publication status	Published - 1 Dec 2020

Cite this

Bogowicz, M., Jochems, A., Deist, T. M., Tanadini-Lang, S., Huang, S. H., Chan, B., Waldron, J. N., Bratman, S., O’Sullivan, B., Riesterer, O., Studer, G., Unkelbach, J., Barakat, S., Brakenhoff, R. H., Nauta, I., Gazzani, S. E., Calareso, G., Scheckenbach, K., Hoebers, F., ... Lambin, P. (2020). Privacy-preserving distributed learning of radiomics to predict overall survival and HPV status in head and neck cancer. Scientific Reports, 10(1), Article 4542. https://doi.org/10.1038/s41598-020-61297-4

@article{f317280604e94cafbd5b74b0e4c87ca7,

title = "Privacy-preserving distributed learning of radiomics to predict overall survival and HPV status in head and neck cancer",

abstract = "A major challenge in radiomics is assembling data from multiple centers. Sharing data between hospitals is restricted by legal and ethical regulations. Distributed learning is a technique, enabling training models on multicenter data without data leaving the hospitals (“privacy-preserving” distributed learning). This study tested feasibility of distributed learning of radiomics data for prediction of two year overall survival and HPV status in head and neck cancer (HNC) patients. Pretreatment CT images were collected from 1174 HNC patients in 6 different cohorts. 981 radiomic features were extracted using Z-Rad software implementation. Hierarchical clustering was performed to preselect features. Classification was done using logistic regression. In the validation dataset, the receiver operating characteristics (ROC) were compared between the models trained in the centralized and distributed manner. No difference in ROC was observed with respect to feature selection. The logistic regression coefficients were identical between the methods (absolute difference <$10^{-7}$). In comparison of the full workflow (feature selection and classification), no significant difference in ROC was found between centralized and distributed models for both studied endpoints (DeLong p > 0.05). In conclusion, both feature selection and classification are feasible in a distributed manner using radiomics data, which opens new possibility for training more reliable radiomics models.",

author = "Marta Bogowicz and Arthur Jochems and Deist, {Timo M.} and Stephanie Tanadini-Lang and Huang, {Shao Hui} and Biu Chan and Waldron, {John N.} and Scott Bratman and Brian O{\textquoteright}Sullivan and Oliver Riesterer and Gabriela Studer and Jan Unkelbach and Samir Barakat and Brakenhoff, {Ruud H.} and Irene Nauta and Gazzani, {Silvia E.} and Giuseppina Calareso and Kathrin Scheckenbach and Frank Hoebers and Wesseling, {Frederik W.R.} and Simon Keek and Sebastian Sanduleanu and Leijenaar, {Ralph T.H.} and Vergeer, {Marije R.} and Leemans, {C. Ren{\'e}} and Terhaard, {Chris H.J.} and {van den Brekel}, {Michiel W.M.} and Olga Hamming-Vrieze and {van der Heijden}, {Martijn A.} and Elhalawani, {Hesham M.} and Fuller, {Clifton D.} and Matthias Guckenberger and Philippe Lambin",

note = "Funding Information: This project was supported by the Swiss National Science Foundation Sinergia grant (310030_173303) and Scientific Exchange grant (IZSEZ0_180524). The clinical study used as one of the cohorts was supported by a research grant from Merck (Schweiz) AG. This work was also supported by the Interreg grant EURADIOMICS and the Dutch Technology Foundation STW (grant n° 10696 DuCAT and n° P14-19 Radiomics STRaTegy), which is the applied science division of NWO, the Technology Program of the Ministry of Economic Affairs and the Manchester Cancer Research UK major centre grant. The authors also acknowledge financial support from the EU 7th framework program (ARTFORCE - n° 257144, REQUITE - n° 601826), CTMM-TraIT, EUROSTARS (E-DECIDE, DEEPMAM), Kankeronderzoekfonds Limburg from the Health Foundation Limburg, Alpe d{\textquoteright}HuZes-KWF (DESIGN), The Dutch Cancer Society, the European Program H2020-2015-17 (ImmunoSABR - n° 733008 and BD2Decide - PHC30-689715), the ERC advanced grant (ERC-ADG-2015, n° 694812 - Hypoximmuno), SME Phase 2 (EU proposal 673780 – RAIL). Dr. Elhalawani was supported in part by the philanthropic donations from the Family of Paul W. Beach to Dr. G. Brandon Gunn, MD. Drs. Elhalawani and Fuller receive funding and project-relevant salary support from NIH/NCI Head and Neck Specialized Programs of Research Excellence (SPORE) Developmental Research Program Award (P50 CA097007-10). This research is supported by the Andrew Sabin Family Foundation; Dr. Fuller is a Sabin Family Foundation Fellow. Dr. Fuller receives funding and project-relevant salary support from the National Institutes of Health (NIH), including: National Institute for Dental and Craniofacial Research Award (1R01DE025248-01/R56DE025248-01); National Cancer Institute (NCI) Early Phase Clinical Trials in Imaging and Image-Guided Interventions Program (1R01CA218148-01); National Science Foundation (NSF), Division of Mathematical Sciences; NIH Big Data to Knowledge (BD2K) Program of the National Cancer Institute Early Stage Development of Technologies in Biomedical Computing, Informatics, and Big Data Science Award (1R01CA214825-01); NIH/NCI Cancer Center Support Grant (CCSG) Pilot Research Program Award from the UT MD Anderson CCSG Radiation Oncology and Cancer Imaging Program (P30CA016672) and National Institute of Biomedical Imaging and Bioengineering (NIBIB) Research Education Program (R25EB025787). Dr. Fuller has received direct industry grant support and travel funding from Elekta AB. We thank Jessica van Rossum for language editing of this manuscript. Publisher Copyright: {\textcopyright} 2020, The Author(s).",

year = "2020",

month = dec,

day = "1",

doi = "10.1038/s41598-020-61297-4",

language = "English",

volume = "10",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Springer Nature",

number = "1",

}

Bogowicz, M, Jochems, A, Deist, TM, Tanadini-Lang, S, Huang, SH, Chan, B, Waldron, JN, Bratman, S, O’Sullivan, B, Riesterer, O, Studer, G, Unkelbach, J, Barakat, S, Brakenhoff, RH, Nauta, I, Gazzani, SE, Calareso, G, Scheckenbach, K, Hoebers, F, Wesseling, FWR, Keek, S, Sanduleanu, S, Leijenaar, RTH, Vergeer, MR, Leemans, CR, Terhaard, CHJ, van den Brekel, MWM, Hamming-Vrieze, O, van der Heijden, MA, Elhalawani, HM, Fuller, CD, Guckenberger, M & Lambin, P 2020, 'Privacy-preserving distributed learning of radiomics to predict overall survival and HPV status in head and neck cancer', Scientific Reports, vol. 10, no. 1, 4542. https://doi.org/10.1038/s41598-020-61297-4

TY - JOUR

T1 - Privacy-preserving distributed learning of radiomics to predict overall survival and HPV status in head and neck cancer

AU - Bogowicz, Marta

AU - Jochems, Arthur

AU - Deist, Timo M.

AU - Tanadini-Lang, Stephanie

AU - Huang, Shao Hui

AU - Chan, Biu

AU - Waldron, John N.

AU - Bratman, Scott

AU - O’Sullivan, Brian

AU - Riesterer, Oliver

AU - Studer, Gabriela

AU - Unkelbach, Jan

AU - Barakat, Samir

AU - Brakenhoff, Ruud H.

AU - Nauta, Irene

AU - Gazzani, Silvia E.

AU - Calareso, Giuseppina

AU - Scheckenbach, Kathrin

AU - Hoebers, Frank

AU - Wesseling, Frederik W.R.

AU - Keek, Simon

AU - Sanduleanu, Sebastian

AU - Leijenaar, Ralph T.H.

AU - Vergeer, Marije R.

AU - Leemans, C. René

AU - Terhaard, Chris H.J.

AU - van den Brekel, Michiel W.M.

AU - Hamming-Vrieze, Olga

AU - van der Heijden, Martijn A.

AU - Elhalawani, Hesham M.

AU - Fuller, Clifton D.

AU - Guckenberger, Matthias

AU - Lambin, Philippe

N1 - Funding Information: This project was supported by the Swiss National Science Foundation Sinergia grant (310030_173303) and Scientific Exchange grant (IZSEZ0_180524). The clinical study used as one of the cohorts was supported by a research grant from Merck (Schweiz) AG. This work was also supported by the Interreg grant EURADIOMICS and the Dutch Technology Foundation STW (grant n° 10696 DuCAT and n° P14-19 Radiomics STRaTegy), which is the applied science division of NWO, the Technology Program of the Ministry of Economic Affairs and the Manchester Cancer Research UK major centre grant. The authors also acknowledge financial support from the EU 7th framework program (ARTFORCE - n° 257144, REQUITE - n° 601826), CTMM-TraIT, EUROSTARS (E-DECIDE, DEEPMAM), Kankeronderzoekfonds Limburg from the Health Foundation Limburg, Alpe d’HuZes-KWF (DESIGN), The Dutch Cancer Society, the European Program H2020-2015-17 (ImmunoSABR - n° 733008 and BD2Decide - PHC30-689715), the ERC advanced grant (ERC-ADG-2015, n° 694812 - Hypoximmuno), SME Phase 2 (EU proposal 673780 – RAIL). Dr. Elhalawani was supported in part by the philanthropic donations from the Family of Paul W. Beach to Dr. G. Brandon Gunn, MD. Drs. Elhalawani and Fuller receive funding and project-relevant salary support from NIH/NCI Head and Neck Specialized Programs of Research Excellence (SPORE) Developmental Research Program Award (P50 CA097007-10). This research is supported by the Andrew Sabin Family Foundation; Dr. Fuller is a Sabin Family Foundation Fellow. Dr. Fuller receives funding and project-relevant salary support from the National Institutes of Health (NIH), including: National Institute for Dental and Craniofacial Research Award (1R01DE025248-01/R56DE025248-01); National Cancer Institute (NCI) Early Phase Clinical Trials in Imaging and Image-Guided Interventions Program (1R01CA218148-01); National Science Foundation (NSF), Division of Mathematical Sciences; NIH Big Data to Knowledge (BD2K) Program of the National Cancer Institute Early Stage Development of Technologies in Biomedical Computing, Informatics, and Big Data Science Award (1R01CA214825-01); NIH/NCI Cancer Center Support Grant (CCSG) Pilot Research Program Award from the UT MD Anderson CCSG Radiation Oncology and Cancer Imaging Program (P30CA016672) and National Institute of Biomedical Imaging and Bioengineering (NIBIB) Research Education Program (R25EB025787). Dr. Fuller has received direct industry grant support and travel funding from Elekta AB. We thank Jessica van Rossum for language editing of this manuscript. Publisher Copyright: © 2020, The Author(s).

PY - 2020/12/1

Y1 - 2020/12/1

AB - A major challenge in radiomics is assembling data from multiple centers. Sharing data between hospitals is restricted by legal and ethical regulations. Distributed learning is a technique, enabling training models on multicenter data without data leaving the hospitals (“privacy-preserving” distributed learning). This study tested feasibility of distributed learning of radiomics data for prediction of two year overall survival and HPV status in head and neck cancer (HNC) patients. Pretreatment CT images were collected from 1174 HNC patients in 6 different cohorts. 981 radiomic features were extracted using Z-Rad software implementation. Hierarchical clustering was performed to preselect features. Classification was done using logistic regression. In the validation dataset, the receiver operating characteristics (ROC) were compared between the models trained in the centralized and distributed manner. No difference in ROC was observed with respect to feature selection. The logistic regression coefficients were identical between the methods (absolute difference <10⁻⁷). In comparison of the full workflow (feature selection and classification), no significant difference in ROC was found between centralized and distributed models for both studied endpoints (DeLong p > 0.05). In conclusion, both feature selection and classification are feasible in a distributed manner using radiomics data, which opens new possibility for training more reliable radiomics models.

UR - http://www.scopus.com/inward/record.url?scp=85081744940&partnerID=8YFLogxK

U2 - https://doi.org/10.1038/s41598-020-61297-4

DO - https://doi.org/10.1038/s41598-020-61297-4

M3 - Article

C2 - 32161279

SN - 2045-2322

VL - 10

JO - Scientific Reports

JF - Scientific Reports

IS - 1

M1 - 4542

ER -
