Background: The Four-Dimensional Symptom Questionnaire (4DSQ) is a self-report questionnaire designed to measure distress, depression, anxiety, and somatization. Before scale scores are computed from the item scores, the three highest response alternatives ('Regularly', 'Often', and 'Very often or constantly present') are usually collapsed into one category to reduce the influence of extreme responding on item and scale scores. In this study, we evaluate the usefulness of this transformation for the distress scale against several criteria. Methods: Using the Graded Response Model, we investigated the effect of this transformation on model fit, local measurement precision, and various indicators of the scale's validity, to assess whether the current practice of recoding should be advocated. Specifically, we examined the effect on the distress scale's convergent validity (operationalized by the General Health Questionnaire and the Maastricht Questionnaire), divergent validity (operationalized by the Neuroticism scale of the NEO-FFI), and predictive validity (operationalized as obtrusion with daily chores and activities, the Biographical Problem list, and the Utrecht Burnout Scale). Results: Recoding led to (i) better model fit, as indicated by lower mean probabilities of exact test statistics assessing item fit; (ii) small (< .02) losses in the sizes of various validity coefficients; and (iii) a decrease in measurement precision (difference in standard errors = .10 to .25) for medium and high levels of distress. Conclusions: For clinical applications and applications in longitudinal research, the current practice of recoding should be avoided because it decreases measurement precision for medium and high levels of distress. Whether this advice also holds for the three other domains of the 4DSQ remains to be investigated.
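To make the recoding described in the Background concrete, the following is a minimal sketch, not the authors' implementation. It assumes the five response alternatives are coded 0 to 4, so that the three highest alternatives ('Regularly', 'Often', 'Very often or constantly present') are codes 2, 3, and 4 and collapse to 2. The function names and the numeric coding are illustrative assumptions.

```python
from typing import Iterable

# Assumption (not stated in the text): responses are coded 0-4, with
# 2 = 'Regularly', 3 = 'Often', 4 = 'Very often or constantly present'.
MAX_RAW = 4
COLLAPSED_MAX = 2  # the three highest alternatives share this code


def check_raw(raw: int) -> int:
    """Return the raw item score unchanged, rejecting out-of-range codes."""
    if raw not in range(MAX_RAW + 1):
        raise ValueError(f"raw item score must be 0..{MAX_RAW}, got {raw!r}")
    return raw


def recode_item(raw: int) -> int:
    """Collapse the three highest response alternatives into one category."""
    return min(check_raw(raw), COLLAPSED_MAX)


def scale_score(raw_items: Iterable[int], recode: bool = True) -> int:
    """Sum item scores, with or without the collapsing step."""
    if recode:
        return sum(recode_item(x) for x in raw_items)
    return sum(check_raw(x) for x in raw_items)
```

With this coding, respondents who answer 'Regularly' and 'Very often or constantly present' on an item receive the same recoded item score, which illustrates how the transformation can discard information at higher levels of distress.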