Abstract
BACKGROUND: Cell-free DNA (cfDNA) analysis holds great promise for non-invasive cancer screening, diagnosis, and monitoring. We hypothesized that mining the patterns of cfDNA shallow whole-genome sequencing datasets from patients with cancer could improve cancer detection.

METHODS: By applying unsupervised clustering and supervised machine learning on large cfDNA shallow whole-genome sequencing datasets from healthy individuals (n = 367) and patients with different hematological (n = 238) and solid malignancies (n = 320), we identified cfDNA signatures that enabled cancer detection and typing.

RESULTS: Unsupervised clustering revealed cancer type-specific sub-grouping. Classification using a supervised machine learning model yielded accuracies of 96% and 65% in discriminating hematological and solid malignancies from healthy controls, respectively. The accuracy of disease type prediction was 85% and 70% for the hematological and solid cancers, respectively. The potential utility of managing a specific cancer was demonstrated by classifying benign from invasive and borderline adnexal masses with an area under the curve of 0.87 and 0.74, respectively.

CONCLUSIONS: This approach provides a generic analytical strategy for non-invasive pan-cancer detection and cancer type prediction.
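
The accuracy and area-under-the-curve (AUC) figures reported in the abstract are standard classifier metrics. As a minimal sketch of how such numbers are computed, and not a reproduction of the authors' pipeline, the Python below evaluates both on a small set of made-up labels and scores. The function names `accuracy` and `roc_auc`, the 0.5 decision threshold, and the toy data are all illustrative assumptions that do not come from the study.

```python
def accuracy(labels, predictions):
    """Fraction of samples whose predicted label equals the true label."""
    if not labels or len(labels) != len(predictions):
        raise ValueError("labels and predictions must be non-empty and equal in length")
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = case, 0 = control).

    Computed as the probability that a randomly chosen case receives a
    higher score than a randomly chosen control, with ties counted as half.
    """
    if len(labels) != len(scores):
        raise ValueError("labels and scores must be equal in length")
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("both classes must be present")
    wins = 0.0
    for c in cases:
        for h in controls:
            if c > h:
                wins += 1.0
            elif c == h:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data, NOT taken from the study:
# three controls (0) and three cases (1) with made-up classifier scores.
toy_labels = [0, 0, 0, 1, 1, 1]
toy_scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]

# Assumed decision threshold of 0.5 to turn scores into class calls.
toy_predictions = [1 if s >= 0.5 else 0 for s in toy_scores]

print(f"accuracy = {accuracy(toy_labels, toy_predictions):.3f}")
print(f"AUC      = {roc_auc(toy_labels, toy_scores):.3f}")
```

Accuracy depends on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is one reason a study may report both.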
Original language | English |
---|---|
Pages (from-to) | 1164-1176 |
Number of pages | 13 |
Journal | Clinical Chemistry |
Volume | 68 |
Issue number | 9 |
DOIs | |
Publication status | Published - 1 Sept 2022 |
Keywords
- cfDNA
- ctDNA
- hematological malignancies
- liquid biopsy
- machine learning
- ovarian tumors
- solid tumors