Machine learning (ML) models often fail with data that deviates from their training distribution. This is a significant concern for ML-enabled devices as data drift may lead to unexpected performance. This work introduces a new framework for out of distribution (OOD) detection and data drift monitoring that combines ML and geometric methods with statistical process control (SPC). We investigated different design choices, including methods for extracting feature representations and drift quantification for OOD detection in individual images and as an approach for input data monitoring. We evaluated the framework for both identifying OOD images and demonstrating the ability to detect shifts in data streams over time. We demonstrated a proof-of-concept via the following tasks: 1) differentiating axial vs. non-axial CT images, 2) differentiating CXR vs. other radiographic imaging modalities, and 3) differentiating adult CXR vs. pediatric CXR. For the identification of individual OOD images, our framework achieved high sensitivity in detecting OOD inputs: 0.980 in CT, 0.984 in CXR, and 0.854 in pediatric CXR. Our framework is also adept at monitoring data streams and identifying the time a drift occurred. In our simulations tracking drift over time, it effectively detected a shift from CXR to non-CXR instantly, a transition from axial to non-axial CT within few days, and a drift from adult to pediatric CXRs within a day-all while maintaining a low false positive rate. Through additional experiments, we demonstrate the framework is modality-agnostic and independent from the underlying model structure, making it highly customizable for specific applications and broadly applicable across different imaging modalities and deployed ML models.
Keywords: Data drift monitoring; Medical imaging; Out of distribution detection; Statistical process control.
© 2024. This is a U.S. Government work and not under copyright protection in the US; foreign copyright protection may apply.