TY - JOUR
T1 - Scoring based on item response theory did not alter the measurement ability of EORTC QLQ-C30 scales
AU - Petersen, Morten Aa
AU - Groenvold, Mogens
AU - Aaronson, Neil
AU - Brenne, Elisabeth
AU - Fayers, Peter
AU - Nielsen, Julie Damgaard
AU - Sprangers, Mirjam
AU - Bjorner, Jakob B.
PY - 2005
Y1 - 2005
N2 - Background and Objectives: Most health-related quality-of-life questionnaires include multi-item scales. Scale scores are usually estimated as simple sums of the item scores. However, scoring procedures utilizing more information from the items might improve measurement abilities, and thereby reduce the needed sample sizes. We investigated whether item response theory (IRT)-based scoring improved the measurement abilities of the EORTC QLQ-C30 physical functioning, emotional functioning, and fatigue scales. Methods: Using a database of 13,010 subjects we estimated the relative validities of IRT scoring compared to sum scoring of the scales. Results: The mean relative validities were 1.04 (physical), 1.03 (emotional), and 0.97 (fatigue). None of these were significantly larger than 1. Thus, no gain in measurement abilities using IRT scoring was found for these scales. Possible explanations include that the items in the scales are not constructed for IRT scoring and that the scales are relatively short. Conclusion: IRT scoring of the three longest EORTC QLQ-C30 scales did not improve measurement abilities compared to the traditional sum scoring of the scales. (c) 2005 Elsevier Inc. All rights reserved
AB - Background and Objectives: Most health-related quality-of-life questionnaires include multi-item scales. Scale scores are usually estimated as simple sums of the item scores. However, scoring procedures utilizing more information from the items might improve measurement abilities, and thereby reduce the needed sample sizes. We investigated whether item response theory (IRT)-based scoring improved the measurement abilities of the EORTC QLQ-C30 physical functioning, emotional functioning, and fatigue scales. Methods: Using a database of 13,010 subjects we estimated the relative validities of IRT scoring compared to sum scoring of the scales. Results: The mean relative validities were 1.04 (physical), 1.03 (emotional), and 0.97 (fatigue). None of these were significantly larger than 1. Thus, no gain in measurement abilities using IRT scoring was found for these scales. Possible explanations include that the items in the scales are not constructed for IRT scoring and that the scales are relatively short. Conclusion: IRT scoring of the three longest EORTC QLQ-C30 scales did not improve measurement abilities compared to the traditional sum scoring of the scales. (c) 2005 Elsevier Inc. All rights reserved
U2 - https://doi.org/10.1016/j.jclinepi.2005.02.008
DO - https://doi.org/10.1016/j.jclinepi.2005.02.008
M3 - Article
C2 - 16085193
SN - 0895-4356
VL - 58
SP - 902
EP - 908
JO - Journal of Clinical Epidemiology
JF - Journal of Clinical Epidemiology
IS - 9
ER -