TY - JOUR
T1 - Development of prediction models to identify hotspots of schistosomiasis in endemic regions to guide mass drug administration
AU - Singer, Benjamin J.
AU - Coulibaly, Jean T.
AU - Park, Hailey J.
AU - Andrews, Jason R.
AU - Bogoch, Isaac I.
AU - Lo, Nathan C.
N1 - Publisher Copyright: © 2024 National Academy of Sciences. All rights reserved.
PY - 2024/1/9
Y1 - 2024/1/9
N2 - Schistosomiasis is a neglected tropical disease affecting over 150 million people. Hotspots of Schistosoma transmission, communities where infection prevalence does not decline adequately with mass drug administration, present a key challenge in eliminating schistosomiasis. Current approaches to identify hotspots require evaluation 2 to 5 years after a baseline survey and subsequent mass drug administration. Here, we develop statistical models to predict hotspots at baseline, prior to treatment, comparing three common hotspot definitions and using epidemiologic, survey-based, and remote sensing data. In a reanalysis of randomized trials in 589 communities in five endemic countries, a regression model predicts whether Schistosoma mansoni infection prevalence will exceed the WHO threshold of 10% in year 5 ("prevalence hotspot") with 86% sensitivity, 74% specificity, and 93% negative predictive value (NPV; assuming 30% hotspot prevalence), and a regression model for Schistosoma haematobium achieves 90% sensitivity, 90% specificity, and 96% NPV. A random forest model predicts whether S. mansoni moderate and heavy infection prevalence will exceed a public health goal of 1% in year 5 ("intensity hotspot") with 92% sensitivity, 79% specificity, and 96% NPV, and a boosted trees model for S. haematobium achieves 77% sensitivity, 95% specificity, and 91% NPV. Baseline prevalence is a top predictor in all models. Prediction is less accurate in countries not represented in training data and for a third hotspot definition based on relative prevalence reduction over time ("persistent hotspot"). These models may be a tool to prioritize high-risk communities for more frequent surveillance or intervention against schistosomiasis, but prediction of hotspots remains a challenge.
KW - hotspots
KW - machine learning
KW - neglected tropical diseases
KW - public health
KW - schistosomiasis
UR - http://www.scopus.com/inward/record.url?scp=85181627396&partnerID=8YFLogxK
U2 - https://doi.org/10.1073/pnas.2315463120
DO - https://doi.org/10.1073/pnas.2315463120
M3 - Article
C2 - 38181058
SN - 0027-8424
VL - 121
JO - PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
JF - PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
IS - 2
M1 - e2315463120
ER -