An important aspect of hearing is the degree to which listeners have to deploy effort to understand speech. One promising measure of listening effort is task-evoked pupil dilation. Here, we use functional magnetic resonance imaging (fMRI) to identify the neural correlates of pupil dilation during comprehension of degraded spoken sentences in 17 normal-hearing listeners. Subjects listened to sentences degraded in three different ways: the target female speech was masked by fluctuating noise, masked by speech from a single male speaker, or noise-vocoded. The degree of degradation was individually adapted such that 50% or 84% of the sentences were intelligible. Control conditions included clear speech in quiet, and silent trials. The peak pupil dilation was larger for the 50% than for the 84% intelligibility condition; it was largest for speech masked by the single-talker masker, followed by speech masked by fluctuating noise, and smallest for noise-vocoded speech. Activation in the bilateral superior temporal gyrus (STG) showed the same pattern, with the most extensive activation for speech masked by the single-talker masker. Larger peak pupil dilation was associated with more activation in the bilateral STG, the bilateral ventral and dorsal anterior cingulate cortex, and several frontal brain areas. A subset of the temporal region sensitive to pupil dilation was also sensitive to speech intelligibility and degradation type. These results show that pupil dilation during speech perception in challenging conditions reflects both auditory and cognitive processes that are recruited to cope with degraded speech and the need to segregate target speech from interfering sounds. © 2014 Elsevier Inc.