Identifying cardiovascular risk factor-related dietary patterns with reduced rank regression and random forest in the EPIC-NL cohort

Sander Biesbroek, Daphne L. Van Der A., Marinka C.C. Brosens, Joline W.J. Beulens, W. M. Monique Verschuren, Yvonne T. Van Der Schouw, Jolanda M.A. Boer

Research output: Contribution to journal › Article › Academic › peer-review

31 Citations (Scopus)


Background: Several methods are used to determine dietary patterns. Hybrid methods incorporate information on nutrient intake or biological factors to extract patterns relevant to disease etiology. Objective: We explore differences between patterns derived with 2 hybrid methods and those obtained with a posteriori methods, and compare associations of these patterns with coronary artery disease (CAD) and stroke risk. Design: Food-frequency questionnaires were used to estimate dietary intake in 34,644 participants of the European Prospective Investigation into Cancer-Netherlands at baseline (1993-1997). Follow-up was complete until 31 December 2007. Hybrid methods to determine dietary patterns were reduced rank regression (RRR) and random forest with classification tree analysis (RF-CTA). Included risk factors were body mass index, total:high-density lipoprotein cholesterol ratio, and systolic blood pressure. Results were compared with those from principal component analysis (PCA) and k-means cluster analysis (KCA), respectively. Results: Both RRR and PCA derived a "Western," a "prudent," and a "traditional" pattern. All RRR patterns were significantly associated with CAD risk [highest vs. lowest quartile of factor score; HR: 1.45 (95% CI: 1.25, 1.69), 0.86 (0.74, 0.99), and 1.25 (1.07, 1.47), respectively]. Only the prudent RRR factor was statistically significantly associated with stroke (HR: 0.76; 95% CI: 0.59, 0.97). Of the PCA patterns, only the traditional pattern was associated with CAD (HR: 1.29; 95% CI: 1.11, 1.50). RF-CTA derived 7 dietary patterns that could be categorized as "Western-like," "prudent-like," and "traditional-like." KCA established a prudent and a Western cluster. Compared with the RF-CTA "prudent-like 1" pattern, only the "traditional-like 1" pattern was associated with CAD (HR: 1.36; 95% CI: 1.12, 1.65). None of the RF-CTA groups were associated with stroke. Compared with the Western KCA cluster, the prudent cluster was not associated with CAD or stroke. Conclusion: Including risk factors in RRR and RF-CTA resulted in small differences in the food groups contributing to similar patterns, which in general showed stronger associations with CAD than those derived with PCA and KCA, respectively.
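
The contrast between a hybrid method (RRR) and an a posteriori method (PCA) described above can be illustrated with a minimal sketch. It is not the authors' implementation: the data are synthetic, the function names (`standardize`, `rrr`, `pca`, `response_variance_explained`) are hypothetical, and the RRR variant shown is the simple identity-weighted form, computed as a singular value decomposition of the ordinary least squares fitted values. Here the columns of `X` stand in for food-group intakes and the columns of `Y` for the three risk factors.

```python
import numpy as np

def standardize(A):
    """Center each column and scale it to unit variance."""
    return (A - A.mean(axis=0)) / A.std(axis=0)

def rrr(X, Y, n_factors):
    """Identity-weighted RRR: factors of X that best explain variation in Y."""
    Xs, Ys = standardize(X), standardize(Y)
    # Full-rank least squares fit of the responses on the predictors.
    B_ols, *_ = np.linalg.lstsq(Xs, Ys, rcond=None)
    # Principal directions of the fitted responses define the factors.
    _, _, Vt = np.linalg.svd(Xs @ B_ols, full_matrices=False)
    weights = B_ols @ Vt[:n_factors].T   # predictor weights per factor
    return weights, Xs @ weights         # weights and factor scores

def pca(X, n_factors):
    """PCA: factors that explain variation in X itself, ignoring Y."""
    Xs = standardize(X)
    _, _, Vt = np.linalg.svd(Xs, full_matrices=False)
    loadings = Vt[:n_factors].T
    return loadings, Xs @ loadings

def response_variance_explained(scores, Y):
    """Share of total standardized response variance explained by the scores."""
    Ys = standardize(Y)
    coef, *_ = np.linalg.lstsq(scores, Ys, rcond=None)
    resid = Ys - scores @ coef
    return 1.0 - (resid ** 2).sum() / (Ys ** 2).sum()

# Synthetic stand-in data (assumption): 500 people, 8 food groups, 3 risk factors.
rng = np.random.default_rng(0)
n, p, q = 500, 8, 3
X = rng.normal(size=(n, p))
Y = X @ rng.normal(size=(p, q)) + rng.normal(scale=2.0, size=(n, q))

rrr_weights, rrr_scores = rrr(X, Y, n_factors=1)
pca_loadings, pca_scores = pca(X, n_factors=1)
r2_rrr = response_variance_explained(rrr_scores, Y)
r2_pca = response_variance_explained(pca_scores, Y)
print(f"response variance explained: RRR={r2_rrr:.3f}, PCA={r2_pca:.3f}")
```

In this formulation the first RRR factor explains at least as much of the response variance as the first PCA factor, because RRR chooses its weights with the responses in view while PCA does not. The number of RRR factors cannot exceed the number of responses, which is consistent with three risk factors yielding three RRR patterns.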

Original language: English
Pages (from-to): 146-154
Number of pages: 9
Journal: American Journal of Clinical Nutrition
Issue number: 1
Publication status: Published - 1 Jul 2015


  • Cardiovascular diseases
  • Dietary patterns
  • Principal component analysis
  • Random forest
  • Reduced rank regression
