Purpose: Biomarkers that can accurately predict outcome in DLBCL patients are urgently needed. Radiomics features extracted from baseline [18F]-FDG PET/CT scans have shown promising results. This study aims to investigate which lesion selection approaches and feature selection methods result in the best prediction of progression after 2 years.

Methods: A total of 296 patients were included. In total, 485 radiomics features (n = 5 conventional PET, n = 22 morphology, n = 50 intensity, n = 408 texture) were extracted for all individual lesions and at patient level, where all lesions were aggregated into one VOI. Eighteen features quantifying dissemination were extracted at patient level. Several lesion selection approaches were tested (largest or hottest lesion, patient level [all lesions with/without dissemination features], maximum or median of all lesions) and compared to the predictive value of our previously published model. Several data reduction methods were applied (principal component analysis, recursive feature elimination (RFE), factor analysis, and univariate selection). The predictive value of all models was tested using fivefold cross-validation with 50 repeats, with and without oversampling, yielding the mean cross-validated AUC (CV-AUC). Additionally, the relative importance of individual radiomics features was determined.

Results: Models with conventional PET and dissemination features showed the highest predictive value (CV-AUC: 0.72–0.75). Dissemination features had the highest relative importance in these models. No lesion selection approach showed significantly higher predictive value compared to our previous model. Oversampling combined with RFE resulted in the highest CV-AUCs.

Conclusion: Regardless of the lesion selection approach or feature reduction method applied, patient-level conventional PET features and dissemination features have the highest predictive value.

Trial registration number and date: EudraCT: 2006-005174-42, 01-08-2008.
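The per-lesion selection approaches named in the Methods (largest lesion, hottest lesion, maximum or median of all lesions) can be illustrated with a minimal sketch. It is not the study's implementation: the feature names `volume_ml`, `suv_max`, and `entropy`, the example values, and the function `select_lesion_features` are hypothetical, and "largest" and "hottest" are assumed here to mean highest volume and highest SUVmax, respectively. The patient-level approach is not shown, because it requires re-extracting features from a single VOI that aggregates all lesions.

```python
from statistics import median

# Hypothetical per-lesion feature table for one patient (illustrative values only).
lesions = [
    {"volume_ml": 12.0, "suv_max": 8.5, "entropy": 4.1},
    {"volume_ml": 48.0, "suv_max": 6.2, "entropy": 5.0},
    {"volume_ml": 5.5, "suv_max": 14.3, "entropy": 3.2},
]


def select_lesion_features(lesions, approach):
    """Reduce a list of per-lesion feature dicts to one feature dict per patient."""
    if not lesions:
        raise ValueError("patient has no lesions")
    if approach == "largest":
        # Assumption: "largest" means the lesion with the highest volume.
        return dict(max(lesions, key=lambda lesion: lesion["volume_ml"]))
    if approach == "hottest":
        # Assumption: "hottest" means the lesion with the highest SUVmax.
        return dict(max(lesions, key=lambda lesion: lesion["suv_max"]))
    if approach in ("maximum", "median"):
        # Aggregate each feature independently across all lesions.
        reduce = max if approach == "maximum" else median
        return {
            name: reduce(lesion[name] for lesion in lesions)
            for name in lesions[0]
        }
    raise ValueError(f"unknown approach: {approach!r}")


for approach in ("largest", "hottest", "maximum", "median"):
    print(approach, select_lesion_features(lesions, approach))
```

Note that the "maximum" and "median" approaches can combine values from different lesions into one feature vector, whereas "largest" and "hottest" always describe a single lesion.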
- Lesion selection