Analysis of breast cancer related gene expression using natural splines and the Cox proportional hazard model to identify prognostic associations

Bas Kreike, Guus Hart, Harry Bartelink, Marc J. van de Vijver

Research output: Contribution to journal (article, academic, peer-reviewed)

19 Citations (Scopus)

Abstract

Many studies correlating gene expression data to clinical parameters assume a linear increase or decrease of the clinical parameter under investigation with the expression of a gene. We have studied genes encoding important breast cancer-related proteins using a model for survival-type data that is based on natural splines and the Cox proportional hazard model, thereby removing the linearity assumption. Expression data of 16 genes were studied in relation to metastasis-free probability in a cohort of 295 consecutive breast cancer patients treated at The Netherlands Cancer Institute. The independent predictive power for disease outcome of the 16 individual genes was tested in a multivariable model with known clinical and pathological risk factors. There is a linear relationship between increasing expression and a higher or lower hazard for distant metastasis for ESR1, ERBB4, VEGF, CCNE2, EZH2, and UPA; for ERBB2, ERBB3, CCND1, CCNE1, EED, CXCR4, CCR7, SDF1, and PAI1 there is no clear increase or decrease; and for EGFR there seems to be a non-linear relation. Multivariable analysis showed that the 70-gene prognosis profile outperforms all the other variables in the model (hazard ratio 5.4, 95% CI 2.5-11.7; P = 0.000018). EGFR expression seems to have a non-linear relation with disease outcome, indicating that both lower and higher expression of EGFR are associated with worse outcome than intermediate expression levels; the other genes show either no relation or a linear one.
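
The modelling idea in the abstract, replacing a single linear expression term in a Cox model with a natural spline of the expression value, can be illustrated with a minimal sketch. This is not the authors' implementation: the knot positions, the toy data, and the function names `natural_spline_basis` and `cox_partial_loglik` are all assumptions made for illustration, and the sketch only evaluates the partial log-likelihood rather than fitting the model.

```python
import math

def natural_spline_basis(x, knots):
    """Natural cubic spline basis at x, without an intercept column.

    Truncated-power construction: the curve is cubic between knots and
    constrained to be linear beyond the two boundary knots.
    """
    def pos3(u):
        return u ** 3 if u > 0 else 0.0

    last = knots[-1]

    def d(k):
        return (pos3(x - knots[k]) - pos3(x - last)) / (last - knots[k])

    n = len(knots)  # needs at least 3 distinct, increasing knots
    return [x] + [d(k) - d(n - 2) for k in range(n - 2)]


def cox_partial_loglik(beta, rows, times, events):
    """Cox partial log-likelihood (Breslow form) for covariate rows."""
    eta = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    total = 0.0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if e_i:  # only observed events contribute a term
            # Risk set: everyone still under observation at time t_i.
            risk = sum(math.exp(eta[j]) for j, t_j in enumerate(times) if t_j >= t_i)
            total += eta[i] - math.log(risk)
    return total


# Hypothetical toy data, not taken from the study.
expression = [-0.8, 0.1, 0.9, -0.2, 0.4, 1.2]  # e.g. log expression ratios
times = [5, 8, 3, 9, 6, 2]                     # follow-up time
events = [1, 0, 1, 1, 0, 1]                    # 1 = distant metastasis, 0 = censored
knots = [-1.0, -0.3, 0.3, 1.0]                 # assumed knot positions

rows = [natural_spline_basis(x, knots) for x in expression]
null_loglik = cox_partial_loglik([0.0, 0.0, 0.0], rows, times, events)
trial_loglik = cox_partial_loglik([0.5, -0.2, 0.1], rows, times, events)
print(null_loglik, trial_loglik)
```

In a real analysis the coefficients would be chosen to maximise this partial log-likelihood, typically with an established survival-analysis package, and the fitted log hazard would be plotted against expression. A roughly straight curve corresponds to the linear relations reported for genes such as ESR1, while a U-shaped curve corresponds to the pattern reported for EGFR.
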

Original language: English
Pages (from-to): 711-720
Journal: Breast cancer research and treatment
Volume: 122
Issue number: 3
Publication status: Published - 2010
