DIFLF: A domain-invariant features learning framework for single-source domain generalization in mammogram classification

Wanfang Xie; Zhenyu Liu; Litao Zhao; Meiyun Wang; Jie Tian; Jiangang Liu

doi:10.1016/j.cmpb.2025.108592

Comput Methods Programs Biomed. 2025 Jan 6:261:108592. doi: 10.1016/j.cmpb.2025.108592. Online ahead of print.

Authors

Wanfang Xie¹, Zhenyu Liu², Litao Zhao¹, Meiyun Wang³, Jie Tian⁴, Jiangang Liu⁵

Affiliations

¹ School of Engineering Medicine, Beihang University, Beijing 100191, PR China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology of the People's Republic of China, Beijing 100191, PR China.
² CAS Key Laboratory of Molecular Imaging, Institute of Automation, Beijing 100190, PR China; University of Chinese Academy of Sciences, Beijing 100080, PR China.
³ Department of Medical Imaging, Henan Provincial People's Hospital & People's Hospital of Zhengzhou University, Zhengzhou 450003, PR China. Electronic address: [email protected].
⁴ School of Engineering Medicine, Beihang University, Beijing 100191, PR China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology of the People's Republic of China, Beijing 100191, PR China. Electronic address: [email protected].
⁵ School of Engineering Medicine, Beihang University, Beijing 100191, PR China; Key Laboratory of Big Data-Based Precision Medicine (Beihang University), Ministry of Industry and Information Technology of the People's Republic of China, Beijing 100191, PR China; Beijing Engineering Research Center of Cardiovascular Wisdom Diagnosis and Treatment, Beijing 100029, PR China. Electronic address: [email protected].

PMID: 39813937
DOI: 10.1016/j.cmpb.2025.108592

Abstract

Background and objective: Single-source domain generalization (SSDG) aims to generalize a deep learning (DL) model trained on a single source dataset to multiple unseen datasets. This matters for clinical applications of DL-based models in breast cancer screening, where a model is commonly developed at one institution and then tested at others. A key challenge of SSDG is alleviating domain shifts using a dataset from only one domain.

Methods: The present study proposes a domain-invariant features learning framework (DIFLF) for single-source domain generalization. Specifically, DIFLF comprises a style-augmentation module (SAM) and a content-style disentanglement module (CSDM). SAM includes two different color jitter transforms, which transform each mammogram in the source domain into two synthesized mammograms with new styles. This increases the feature diversity of the source domain and reduces overfitting of the trained model. CSDM includes three feature disentanglement units, which extract domain-invariant content (DIC) features by disentangling them from domain-specific style (DSS) features, reducing the influence of domain shifts that result from different feature distributions. Our code is openly available on GitHub (https://github.com/85675/DIFLF).
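
As an illustration of the style-augmentation idea only, the following minimal sketch applies two different color jitter settings to one grayscale image, producing two restyled views of the same content. It is not the authors' implementation (see their repository for that): the function names `color_jitter` and `style_augment`, the restriction to brightness and contrast, and the jitter ranges 0.1 and 0.4 are all assumptions made for this example.

```python
import random


def color_jitter(image, brightness, contrast, rng):
    """Randomly change brightness and contrast of a grayscale image.

    image: list of rows of floats in [0, 1].
    brightness, contrast: maximum relative change (0.2 means a factor
    drawn uniformly from [0.8, 1.2]). Returns a new image; the input
    is left unchanged.
    """
    b = rng.uniform(1 - brightness, 1 + brightness)
    c = rng.uniform(1 - contrast, 1 + contrast)

    # Brightness: scale every pixel intensity by the same factor.
    scaled = [[p * b for p in row] for row in image]

    # Contrast: stretch or compress intensities around the image mean.
    pixels = [p for row in scaled for p in row]
    mean = sum(pixels) / len(pixels)
    return [
        [min(1.0, max(0.0, (p - mean) * c + mean)) for p in row]
        for row in scaled
    ]


def style_augment(image, seed=None):
    """Return two restyled views of one image (hypothetical settings)."""
    rng = random.Random(seed)
    mild = color_jitter(image, brightness=0.1, contrast=0.1, rng=rng)
    strong = color_jitter(image, brightness=0.4, contrast=0.4, rng=rng)
    return mild, strong


if __name__ == "__main__":
    mammogram = [[0.1, 0.5], [0.9, 0.3]]  # toy 2x2 stand-in for a mammogram
    view_a, view_b = style_augment(mammogram, seed=0)
    print(view_a)
    print(view_b)
```

Both views keep the spatial layout (the content) of the input and differ only in intensity statistics (the style), which is the property a content-style disentanglement step can then exploit.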

Results: DIFLF is trained on a private dataset (PRI1) and tested first on another private dataset (PRI2), whose feature distribution is similar to that of PRI1, and then on two public datasets (INbreast and MIAS), whose feature distributions differ greatly from that of PRI1. In these experiments, DIFLF performs well in classifying mammograms across the unseen target datasets: accuracy and AUC are 0.917 and 0.928 on PRI2, 0.882 and 0.893 on INbreast, and 0.767 and 0.710 on MIAS, respectively.
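
For readers less familiar with the two reported metrics, the sketch below shows one common way to compute them. It assumes binary labels (0 or 1), one predicted score per mammogram, and a decision threshold of 0.5; none of these details are stated in the abstract, and the function names are chosen for this example only.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases whose thresholded score matches the label."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive case scores
    higher than a randomly chosen negative case; ties count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    labels = [1, 0, 1, 0]            # toy ground truth
    scores = [0.9, 0.4, 0.35, 0.2]   # toy model outputs
    print(accuracy(labels, scores), auc(labels, scores))
```

Accuracy depends on the chosen threshold, whereas AUC depends only on how the scores rank positive cases against negative ones, so the two values need not move together across datasets.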

Conclusions: DIFLF can alleviate the influence of domain shifts using only one source dataset. Moreover, it achieves strong mammogram classification performance even on unseen datasets whose feature distributions differ greatly from that of the training dataset.

Keywords: Breast cancer; Content-style disentanglement module; Deep learning; Domain generalization; Mammogram; Style-augmentation module.