Translating the success of deep learning-based computer-assisted classification into clinical adoption hinges on the ability to explain what caused a prediction. Post-hoc interpretability approaches, especially counterfactual techniques, have shown both technical and psychological potential. Nevertheless, the currently dominant approaches rely on heuristic, unvalidated methodology. As a result, they potentially operate the underlying networks outside their validated domain, casting doubt on the predictor's abilities instead of generating knowledge and trust. In this work, we investigate this out-of-distribution problem for medical image pathology classifiers and propose marginalization techniques and evaluation procedures to overcome it. Furthermore, we propose a complete domain-aware pipeline for radiology environments. Its validity is demonstrated on one synthetic and two publicly available image datasets. Specifically, we evaluate on the CBIS-DDSM/DDSM mammography collection and the ChestX-ray14 radiographs. Our solution shows, both quantitatively and qualitatively, a significant reduction of localization ambiguity and results that are conveyed more clearly.