TY - JOUR
T1 - Predictive factors for allergy at 4–6 years of age based on machine learning
T2 - A pilot study
AU - Kamphorst, Kim
AU - Lopez-Rincon, Alejandro
AU - Vlieger, Arine M.
AU - Garssen, Johan
AU - van ’t Riet, Esther
AU - van Elburg, Ruurd M.
N1 - Funding Information: NA. This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Publisher Copyright: © 2022 The Authors
PY - 2023/3/1
Y1 - 2023/3/1
AB - Background: In Europe, allergic diseases are the most common chronic childhood illnesses and result from a complex interplay between genetic and environmental factors. A new approach for analyzing such complex data is to employ machine learning (ML) algorithms. Therefore, the aim of this pilot study was to find predictors of parent-reported allergy at 4–6 years of age by using feature selection in ML. Methods: Recursive ensemble feature selection (REFS) was used, with a 20% step reduction, eight different classifiers in the ensemble, and resampling to address the class imbalance. Thereafter, receiver operating characteristic (ROC) curves were calculated for five different classifiers that were not included in the original feature selection ensemble. Results: In total, 130 children (14 with and 116 without parent-reported allergy) and 248 features were included in the ML analyses. The REFS algorithm selected 20 features, and the Multi-layer Perceptron classifier in particular had an area under the curve (AUC) of 0.86 (SD 0.08). The features predictive of allergy were: tobacco exposure during pregnancy; atopic parents; gestational age; days of diarrhea, cough, rash, and fever during the first year of life; ever being exposed to antibiotics; and Resistin, IL-27, MMP9, CXCL8, CCL13, Vimentin, IL-4, CCL22, GAL1, IL-6, LIGHT, and GMCSF. Conclusions: This ML model shows that a combination of environmental exposures and cytokines can predict later allergy with an AUC of 0.86 despite the small sample size. The model still needs to be externally validated.
KW - AI
KW - Artificial intelligence
KW - Atopic disorders
KW - Cytokines
KW - Feature selection
KW - Prediction
UR - http://www.scopus.com/inward/record.url?scp=85144329049&partnerID=8YFLogxK
U2 - https://doi.org/10.1016/j.phanu.2022.100326
DO - https://doi.org/10.1016/j.phanu.2022.100326
M3 - Article
SN - 2213-4344
VL - 23
JO - PharmaNutrition
JF - PharmaNutrition
M1 - 100326
ER -