Objective: In randomized controlled trials (RCTs), outcome variables are of-ten patient-reported outcomes measured with questionnaires. Ideally, all available item information is used for score construction, which requires an item response theory (JRT) measurement model. However, in practice, the classical test theory measurement model (sum scores) is mostly used, and differences between response patterns leading to the same sum score are ignored. The enhanced differentiation between scores with IRT enables more precise estimation of individual trajectories over time and group effects. The objective of this study was to show the advantages of using IRT scores instead of sum scores when analyzing RCTs. Study Design and Setting: Two studies are presented, a real-life RCT, and a simulation study. Both IRT and sum scores are used to measure the construct and are subsequently used as outcomes for effect calculation. Results: The bias in RCT results is conditional on the measurement model that was used to construct the scores. A bias in estimated trend of around one standard deviation was found when sum scores were used, where IRT showed negligible bias. Conclusion: Accurate statistical inferences are made from an RCT study when using IRT to estimate construct measurements. The use of sum scores leads to incorrect RCT results. (C) 2016 Elsevier Inc. All rights reserved.