Reproducible and clinically translatable deep neural networks for cervical screening

Syed Rakin Ahmed; Brian Befano; Andreanne Lemay; Didem Egemen; Ana Cecilia Rodriguez; Sandeep Angara; Kanan Desai; Jose Jeronimo; Sameer Antani; Nicole Campos; Federica Inturrisi; Rebecca Perkins; Aimee Kreimer; Nicolas Wentzensen; Rolando Herrero; Marta del Pino; Wim Quint; Silvia de Sanjose; Mark Schiffman; Jayashree Kalpathy-Cramer

doi:https://doi.org/10.1038/s41598-023-48721-1

Epidemiology and Data Science (VUmc)

Research output: Contribution to journal › Article › Academic › peer-review

Abstract

Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. In this work, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-geography, multi-institution, and multi-device dataset of 9462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance. The top performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristics (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our model also produced reliable and consistent predictions, achieving a strong quadratic weighted kappa (QWK) of 0.86 and a minimal %2-class disagreement (% 2-Cl. D.) of 0.69%, between image pairs across women. Our work is among the first efforts at designing a robust, repeatable, accurate and clinically translatable deep-learning model for cervical screening.
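The two repeatability figures quoted in the abstract can be illustrated with a small worked example. The following is a minimal Python sketch, not the authors' code. It computes Cohen's kappa with quadratic weights for paired ordinal predictions. It also computes a percentage under two assumptions that the abstract does not spell out: that classes are encoded as ordinal integers 0..K-1, and that "% 2-class disagreement" means the share of image pairs whose predicted classes lie two levels apart. The function names and the toy labels are hypothetical.

```python
from typing import Sequence


def quadratic_weighted_kappa(a: Sequence[int], b: Sequence[int], num_classes: int) -> float:
    """Cohen's kappa with quadratic weights for two paired lists of ordinal labels."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if num_classes < 2:
        raise ValueError("num_classes must be at least 2")
    n = len(a)
    # Observed joint counts: observed[i][j] = pairs labelled i in `a` and j in `b`.
    observed = [[0] * num_classes for _ in range(num_classes)]
    for x, y in zip(a, b):
        observed[x][y] += 1
    row = [sum(observed[i]) for i in range(num_classes)]
    col = [sum(observed[i][j] for i in range(num_classes)) for j in range(num_classes)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(num_classes):
        for j in range(num_classes):
            # Quadratic penalty: 0 on the diagonal, 1 for the two extreme classes.
            w = (i - j) ** 2 / (num_classes - 1) ** 2
            weighted_observed += w * observed[i][j]
            # Expected count if the two label lists were independent.
            weighted_expected += w * row[i] * col[j] / n
    if weighted_expected == 0.0:
        raise ValueError("kappa is undefined when every label falls in one class")
    return 1.0 - weighted_observed / weighted_expected


def two_class_disagreement_pct(a: Sequence[int], b: Sequence[int]) -> float:
    """Percentage of pairs whose labels are two levels apart (assumed definition)."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    extreme = sum(1 for x, y in zip(a, b) if abs(x - y) == 2)
    return 100.0 * extreme / len(a)


# Toy data: predictions for the first and second image of six hypothetical women,
# with three ordinal classes (0 = lowest, 2 = highest).
first_image = [0, 0, 1, 1, 2, 2]
second_image = [0, 0, 1, 2, 2, 0]
kappa = quadratic_weighted_kappa(first_image, second_image, num_classes=3)  # about 0.444
extreme_pct = two_class_disagreement_pct(first_image, second_image)  # about 16.7
```

In the toy data one pair is off by one level and one pair is off by two levels. The quadratic weights penalise the two-level pair four times as heavily, which is why a kappa of this kind is a natural fit for ordinal screening categories.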

Original language	English
Article number	21772
Journal	Scientific Reports
Volume	13
Issue number	1
DOIs	https://doi.org/10.1038/s41598-023-48721-1
Publication status	Published - 1 Dec 2023

Cite this

Ahmed, S. R., Befano, B., Lemay, A., Egemen, D., Rodriguez, A. C., Angara, S., Desai, K., Jeronimo, J., Antani, S., Campos, N., Inturrisi, F., Perkins, R., Kreimer, A., Wentzensen, N., Herrero, R., del Pino, M., Quint, W., de Sanjose, S., Schiffman, M., & Kalpathy-Cramer, J. (2023). Reproducible and clinically translatable deep neural networks for cervical screening. Scientific Reports, 13(1), Article 21772. https://doi.org/10.1038/s41598-023-48721-1

@article{2bcec475e45e4405a5bd0c047369bedd,

title = "Reproducible and clinically translatable deep neural networks for cervical screening",

abstract = "Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. In this work, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-geography, multi-institution, and multi-device dataset of 9462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance. The top performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristics (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our model also produced reliable and consistent predictions, achieving a strong quadratic weighted kappa (QWK) of 0.86 and a minimal %2-class disagreement (% 2-Cl. D.) of 0.69%, between image pairs across women. Our work is among the first efforts at designing a robust, repeatable, accurate and clinically translatable deep-learning model for cervical screening.",

author = "Ahmed, {Syed Rakin} and Brian Befano and Andreanne Lemay and Didem Egemen and Rodriguez, {Ana Cecilia} and Sandeep Angara and Kanan Desai and Jose Jeronimo and Sameer Antani and Nicole Campos and Federica Inturrisi and Rebecca Perkins and Aimee Kreimer and Nicolas Wentzensen and Rolando Herrero and {del Pino}, Marta and Wim Quint and {de Sanjose}, Silvia and Mark Schiffman and Jayashree Kalpathy-Cramer",

note = "Publisher Copyright: {\textcopyright} 2023, The Author(s).",

year = "2023",

month = dec,

day = "1",

doi = "10.1038/s41598-023-48721-1",

language = "English",

volume = "13",

journal = "Scientific Reports",

issn = "2045-2322",

publisher = "Springer Nature",

number = "1",

}

Ahmed, SR, Befano, B, Lemay, A, Egemen, D, Rodriguez, AC, Angara, S, Desai, K, Jeronimo, J, Antani, S, Campos, N, Inturrisi, F, Perkins, R, Kreimer, A, Wentzensen, N, Herrero, R, del Pino, M, Quint, W, de Sanjose, S, Schiffman, M & Kalpathy-Cramer, J 2023, 'Reproducible and clinically translatable deep neural networks for cervical screening', Scientific Reports, vol. 13, no. 1, 21772. https://doi.org/10.1038/s41598-023-48721-1

TY - JOUR

T1 - Reproducible and clinically translatable deep neural networks for cervical screening

AU - Ahmed, Syed Rakin

AU - Befano, Brian

AU - Lemay, Andreanne

AU - Egemen, Didem

AU - Rodriguez, Ana Cecilia

AU - Angara, Sandeep

AU - Desai, Kanan

AU - Jeronimo, Jose

AU - Antani, Sameer

AU - Campos, Nicole

AU - Inturrisi, Federica

AU - Perkins, Rebecca

AU - Kreimer, Aimee

AU - Wentzensen, Nicolas

AU - Herrero, Rolando

AU - del Pino, Marta

AU - Quint, Wim

AU - de Sanjose, Silvia

AU - Schiffman, Mark

AU - Kalpathy-Cramer, Jayashree

PY - 2023/12/1

Y1 - 2023/12/1

AB - Cervical cancer is a leading cause of cancer mortality, with approximately 90% of the 250,000 deaths per year occurring in low- and middle-income countries (LMIC). Secondary prevention with cervical screening involves detecting and treating precursor lesions; however, scaling screening efforts in LMIC has been hampered by infrastructure and cost constraints. Recent work has supported the development of an artificial intelligence (AI) pipeline on digital images of the cervix to achieve an accurate and reliable diagnosis of treatable precancerous lesions. In particular, WHO guidelines emphasize visual triage of women testing positive for human papillomavirus (HPV) as the primary screen, and AI could assist in this triage task. In this work, we implemented a comprehensive deep-learning model selection and optimization study on a large, collated, multi-geography, multi-institution, and multi-device dataset of 9462 women (17,013 images). We evaluated relative portability, repeatability, and classification performance. The top performing model, when combined with HPV type, achieved an area under the Receiver Operating Characteristics (ROC) curve (AUC) of 0.89 within our study population of interest, and a limited total extreme misclassification rate of 3.4%, on held-aside test sets. Our model also produced reliable and consistent predictions, achieving a strong quadratic weighted kappa (QWK) of 0.86 and a minimal %2-class disagreement (% 2-Cl. D.) of 0.69%, between image pairs across women. Our work is among the first efforts at designing a robust, repeatable, accurate and clinically translatable deep-learning model for cervical screening.

UR - http://www.scopus.com/inward/record.url?scp=85178965881&partnerID=8YFLogxK

U2 - https://doi.org/10.1038/s41598-023-48721-1

DO - 10.1038/s41598-023-48721-1

M3 - Article

C2 - 38066031

SN - 2045-2322

VL - 13

JO - Scientific Reports

JF - Scientific Reports

IS - 1

M1 - 21772

ER -