Assessing large multimodal models for one-shot learning and interpretability in biomedical image classification

bioRxiv [Preprint]. 2024 Oct 8:2023.12.31.573796. doi: 10.1101/2023.12.31.573796.

Abstract

Image classification plays a pivotal role in analyzing biomedical images, serving as a cornerstone for both biological research and clinical diagnostics. We demonstrate that large multimodal models (LMMs), such as GPT-4, excel at one-shot learning, generalization, interpretability, and text-driven image classification across diverse biomedical tasks. These tasks include the classification of tissues, cell types, cellular states, and disease status. LMMs stand out from traditional single-modal classification approaches, which often require large training datasets and offer limited interpretability.

Publication types

  • Preprint