Purpose
During resections of brain tumors, neurosurgeons have to weigh the risk of residual tumor against the risk of damage to brain functions. Different perspectives on these risks result in practice variation. We present statistical methods to localize differences in extent of resection between institutions, which should make it possible to reveal brain regions affected by such practice variation.

Methods
Synthetic data were generated by simulating spheres for brain, tumors, resection cavities, and an effect region in which the likelihood of surgical avoidance could be varied between institutions. Three statistical methods were investigated: a non-parametric permutation-based approach, Fisher’s exact test, and a fully Bayesian Markov chain Monte Carlo (MCMC) model. For all three methods, the false discovery rate (FDR) was determined as a function of the cut-off value for the q-value or the highest density interval, and receiver operating characteristic and precision-recall curves were created. Sensitivity to variations in the parameters of the synthetic model was investigated. Finally, all three methods were applied to retrospectively collected data of 77 brain tumor resections in two academic hospitals.

Results
Fisher’s method provided an accurate estimate of the observed FDR in the synthetic data, whereas the permutation approach was too liberal and underestimated the FDR. Area under the curve (AUC) values were similar for the Fisher and Bayes methods, and superior to those of the permutation approach. Fisher’s method deteriorated and became too liberal for reduced tumor size, a smaller effect region, a lower overall extent of resection, fewer patients per cohort, and a smaller discrepancy in surgical avoidance probabilities between the different surgical practices. In the retrospective patient data, all three methods identified a similar effect region, with a lower estimated FDR for Fisher’s method than for the permutation method.

Conclusions
Differences in surgical practice may be detected using voxel statistics. Fisher’s test provides a fast method to localize differences but may underestimate the true FDR. Bayesian MCMC is more flexible and easily extendable, and leads to similar results, but at increased computational cost.
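
As an illustration of the voxel-wise Fisher approach summarized above, the following minimal Python sketch tests, for each voxel, whether the proportion of resected tumor differs between two institutions, and then converts the p-values to q-values. It is a sketch under stated assumptions and not the implementation used in this work: the Benjamini-Hochberg procedure is assumed for the q-values, and the function names, voxel labels, and counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows: institution A, institution B.
    Columns: tumor voxel resected, tumor voxel left as residual.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x resected cases in institution A
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def bh_qvalues(pvals):
    """Benjamini-Hochberg q-values (assumed FDR procedure for this sketch)."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    q = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone q-values
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        q[i] = running_min
    return q


# Hypothetical counts per voxel:
# (resected at A, residual at A, resected at B, residual at B)
voxel_tables = {
    "voxel_1": (18, 2, 17, 3),   # similar practice at both institutions
    "voxel_2": (19, 1, 6, 14),   # institution B tends to avoid this voxel
    "voxel_3": (10, 10, 11, 9),  # similar practice at both institutions
}

names = list(voxel_tables)
pvals = [fisher_exact_two_sided(*voxel_tables[n]) for n in names]
qvals = dict(zip(names, bh_qvalues(pvals)))

for name in names:
    print(name, round(qvals[name], 4))
```

In a real analysis, the table for each voxel would be built from registered tumor and resection cavity segmentations, and only voxels with sufficient tumor coverage at both institutions would be tested. Voxels whose q-value falls below the chosen cut-off would then form the candidate effect region.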