An explainable longitudinal multi-modal fusion model for predicting neoadjuvant therapy response in women with breast cancer

Nat Commun. 2024 Nov 7;15(1):9613. doi: 10.1038/s41467-024-53450-8.

Abstract

Multi-modal image analysis using deep learning (DL) lays the foundation for neoadjuvant treatment (NAT) response monitoring. However, existing methods prioritize extracting multi-modal features to enhance predictive performance, with limited consideration of real-world clinical applicability, particularly in longitudinal NAT scenarios with multi-modal data. Here, we propose the Multi-modal Response Prediction (MRP) system, designed to mimic real-world physician assessments of NAT responses in breast cancer. To enhance feasibility, MRP integrates cross-modal knowledge mining and temporal information embedding strategies to handle missing modalities and remain less affected by different NAT settings. We validated MRP through multi-center studies and multinational reader studies. MRP exhibited robustness comparable to that of breast radiologists and outperformed humans in predicting pathological complete response in the Pre-NAT phase (ΔAUROC 14% and 10% on in-house and external datasets, respectively). Furthermore, we assessed MRP's clinical utility in terms of its impact on treatment decision-making. MRP may have profound implications for enrolment into NAT trials and for determining the extent of surgery.
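The ΔAUROC values quoted in the abstract are differences in area under the ROC curve between the model and human readers. As a minimal sketch of how such a difference can be computed, the snippet below uses the rank-based (Mann-Whitney) formulation of AUROC. The function names `auroc` and `delta_auroc` and the toy labels and scores are hypothetical illustrations; they are not data or code from the study, and the statistical procedure the authors actually used is not described in this record.

```python
def auroc(labels, scores):
    """AUROC via the rank (Mann-Whitney U) formulation.

    labels: iterable of 0/1 outcomes (1 = positive case).
    scores: iterable of predicted scores, higher = more likely positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def delta_auroc(labels, model_scores, reader_scores):
    """Difference in AUROC between a model and a reader on the same cases."""
    return auroc(labels, model_scores) - auroc(labels, reader_scores)


# Hypothetical toy data for illustration only (not from the paper).
toy_labels = [1, 1, 1, 0, 0, 0]
toy_model = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
toy_reader = [0.7, 0.4, 0.3, 0.6, 0.5, 0.2]

if __name__ == "__main__":
    print(f"model AUROC:  {auroc(toy_labels, toy_model):.3f}")
    print(f"reader AUROC: {auroc(toy_labels, toy_reader):.3f}")
    print(f"delta AUROC:  {delta_auroc(toy_labels, toy_model, toy_reader):.3f}")
```

A point estimate like this says nothing about uncertainty; a real comparison would normally also report a confidence interval or a paired significance test on the same cases.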

Publication types

  • Multicenter Study

MeSH terms

  • Adult
  • Aged
  • Breast Neoplasms* / drug therapy
  • Breast Neoplasms* / pathology
  • Breast Neoplasms* / surgery
  • Breast Neoplasms* / therapy
  • Deep Learning*
  • Female
  • Humans
  • Longitudinal Studies
  • Middle Aged
  • Multimodal Imaging / methods
  • Neoadjuvant Therapy* / methods
  • Treatment Outcome