TY - JOUR
T1 - Representation of exposures in regression analysis and interpretation of regression coefficients: basic concepts and pitfalls
AU - Leffondré, Karen
AU - Jager, Kitty J.
AU - Boucquemont, Julie
AU - Stel, Vianda S.
AU - Heinze, Georg
PY - 2014
Y1 - 2014
N2 - Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates
AB - Regression models are being used to quantify the effect of an exposure on an outcome, while adjusting for potential confounders. While the type of regression model to be used is determined by the nature of the outcome variable, e.g. linear regression has to be applied for continuous outcome variables, all regression models can handle any kind of exposure variables. However, some fundamentals of representation of the exposure in a regression model and also some potential pitfalls have to be kept in mind in order to obtain meaningful interpretation of results. The objective of this educational paper was to illustrate these fundamentals and pitfalls, using various multiple regression models applied to data from a hypothetical cohort of 3000 patients with chronic kidney disease. In particular, we illustrate how to represent different types of exposure variables (binary, categorical with two or more categories and continuous), and how to interpret the regression coefficients in linear, logistic and Cox models. We also discuss the linearity assumption in these models, and show how wrongly assuming linearity may produce biased results and how flexible modelling using spline functions may provide better estimates
U2 - https://doi.org/10.1093/ndt/gft500
DO - https://doi.org/10.1093/ndt/gft500
M3 - Article
C2 - 24366898
SN - 0931-0509
VL - 29
SP - 1806
EP - 1814
JO - Nephrology, dialysis, transplantation
JF - Nephrology, dialysis, transplantation
IS - 10
ER -