TY - JOUR
T1 - Exploring interaction effects in small samples increases rates of false-positive and false-negative findings
T2 - Results from a systematic review and simulation study
AU - Schmidt, Amand F.
AU - Groenwold, Rolf H. H.
AU - Knol, Mirjam J.
AU - Hoes, Arno W.
AU - Nielen, Mirjam
AU - Roes, Kit C. B.
AU - de Boer, Anthonius
AU - Klungel, Olaf H.
PY - 2014
Y1 - 2014
N2 - Objective: To give a comprehensive comparison of the performance of commonly applied interaction tests. Methods: A literature review and a simulation study were performed, evaluating interaction tests on the odds ratio (OR) or the risk difference (RD) scale: Cochran Q (Q), Breslow-Day (BD), Tarone, unconditional score, likelihood ratio (LR), Wald, and relative excess risk due to interaction (RERI)-based tests. Results: Review results agreed with results from our simulation study, which showed that on the OR scale, in small samples (e.g., number of subjects ≤ 250), the type 1 error rate of the LR test was 0.10, whereas the BD and Tarone tests had type 1 error rates around 0.05. On the RD scale, the LR and RERI tests had type 1 error rates around 0.05. On both scales, tests did not differ regarding power. When exposure prevented the outcome, RERI-based tests were relatively underpowered (e.g., N = 100; RERI power = 5% vs. Wald power = 18%). With increasing sample size, this difference decreased. Conclusion: In small samples, interaction tests differed. On the OR scale, the Tarone and BD tests are recommended. On the RD scale, the LR and RERI-based tests performed best. However, RERI-based tests are underpowered compared with other tests when exposure prevents the outcome and sample size is limited. © 2014 The Authors. Published by Elsevier Inc. All rights reserved.
UR - https://www.scopus.com/inward/record.uri?partnerID=HzOxMe3b&scp=84902545752&origin=inward
UR - https://www.ncbi.nlm.nih.gov/pubmed/24768005
U2 - https://doi.org/10.1016/j.jclinepi.2014.02.008
DO - https://doi.org/10.1016/j.jclinepi.2014.02.008
M3 - Article
C2 - 24768005
SN - 0895-4356
VL - 67
SP - 821
EP - 829
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
IS - 7
ER -