Repeatability of 18F-FDG PET radiomic features: A phantom study to explore sensitivity to image reconstruction settings, noise, and delineation method

Elisabeth Pfaehler, Roelof J. Beukinga, Johan R. de Jong, Riemer H. J. A. Slart, Cornelis H. Slump, Rudi A. J. O. Dierckx, Ronald Boellaard

Research output: Contribution to journal › Article › Academic › peer-review

76 Citations (Scopus)


Background: 18F-fluoro-2-deoxy-D-glucose positron emission tomography (18F-FDG PET) radiomics has the potential to guide clinical decision making in cancer patients, but validation is required before radiomics can be implemented in the clinical setting. The aim of this study was to explore how feature space reduction and repeatability of 18F-FDG PET radiomic features are affected by various sources of variation, such as underlying data (e.g., object size and uptake), image reconstruction methods and settings, noise, discretization method, and delineation method.

Methods: The NEMA image quality phantom was scanned with various sphere-to-background ratios (SBR), simulating different activity uptakes, including spheres with low uptake, that is, an SBR smaller than 1. Furthermore, images of a phantom containing 3D-printed inserts reflecting realistic heterogeneous uptake patterns were acquired. Data were reconstructed using various matrix sizes, reconstruction algorithms, and scan durations (noise). For every specific reconstruction and noise level, ten statistically equal replicates were generated. The phantom inserts were delineated using CT-based and PET-based segmentation methods. A total of 246 radiomic features were extracted from each image dataset. Images were discretized with a fixed bin number (FBN) of 64 bins and with a fixed bin width (FBW) of 0.25 for the high-uptake data and 0.05 for the low-uptake data. In terms of feature reduction, we determined the impact of these factors on the composition of feature clusters, which were defined on the basis of Spearman's correlation matrices. To assess feature repeatability, the intraclass correlation coefficient was calculated over the ten replicates.

Results: In general, larger spheres with high uptake resulted in better repeatability than smaller spheres with low uptake. In terms of repeatability, features extracted from the heterogeneous phantom inserts were comparable to features extracted from the larger high-uptake spheres. For example, for an EARL-compliant reconstruction, larger and smaller high-uptake spheres yielded good repeatability for 32% and 30% of the features, respectively, while the heterogeneous inserts resulted in 34% repeatable features. For the low-uptake spheres, this was the case for 22% and 20% of the features for larger and smaller spheres, respectively. Images reconstructed with point-spread-function (PSF) modeling resulted in the highest repeatability when compared with OSEM or time-of-flight: 53%, 30%, and 32% of the features were repeatable, respectively (for unsmoothed data, discretized with FBN, 300 s scan duration). Reducing image noise (by increasing scan duration and by smoothing) and using CT-based segmentation for the low-uptake spheres improved repeatability. FBW discretization resulted in higher repeatability than FBN discretization, for example, 89% and 35% of the features, respectively (for the EARL-compliant reconstruction and larger high-uptake spheres).

Conclusion: Feature space reduction and repeatability of 18F-FDG PET radiomic features depended on all studied factors. The high sensitivity of PET radiomic features to image quality suggests that a high level of standardization in image acquisition and preprocessing is required before these features can be used as clinical imaging biomarkers.
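
The two discretization schemes and the repeatability measure named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The binning formulas follow a commonly used convention, the lower bound of 0.0 for FBW is an assumption, and the one-way random-effects form of the intraclass correlation coefficient is chosen here only for illustration, since the abstract does not state which ICC variant was used. All function names are hypothetical.

```python
import math
from typing import List, Sequence


def discretize_fbn(values: Sequence[float], n_bins: int = 64) -> List[int]:
    """Fixed bin number: rescale the ROI's own intensity range into n_bins bins."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1 for _ in values]  # flat ROI: every voxel lands in bin 1
    out = []
    for v in values:
        b = math.floor(n_bins * (v - lo) / (hi - lo)) + 1
        out.append(min(b, n_bins))  # the maximum value is placed in the last bin
    return out


def discretize_fbw(values: Sequence[float], bin_width: float = 0.25,
                   lower_bound: float = 0.0) -> List[int]:
    """Fixed bin width: bins have the same intensity width in every image.

    lower_bound = 0.0 is an assumption made for this sketch.
    """
    return [math.floor((v - lower_bound) / bin_width) + 1 for v in values]


def icc_one_way(table: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for single measurements (assumed variant).

    Rows are measured objects, columns are replicates of the same feature.
    Raises ZeroDivisionError if every entry in the table is identical.
    """
    n, k = len(table), len(table[0])
    grand = sum(sum(row) for row in table) / (n * k)
    row_means = [sum(row) / k for row in table]
    # Between-object and within-object mean squares.
    msb = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(table, row_means) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


if __name__ == "__main__":
    roi = [0.0, 1.0, 2.0, 4.0]  # made-up intensities, not study data
    print(discretize_fbn(roi, n_bins=4))
    print(discretize_fbw(roi, bin_width=0.25))
    print(icc_one_way([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]))
```

With FBN the bin edges depend on the minimum and maximum of each region, so replicate-to-replicate changes in those extremes shift every bin. With FBW the edges are fixed across images. This difference in how the bins are defined is one plausible reason the two schemes can behave differently under noise, although the abstract itself reports only the resulting repeatability percentages.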
Original language: English
Pages (from-to): 665-678
Journal: Medical Physics
Issue number: 2
Publication status: Published - Feb 2019


  • 18F-FDG PET/CT radiomic features
  • delineation
  • image reconstruction settings
