Deep learning-based multimodal learning is advancing rapidly and is widely used in artificial intelligence-generated content, for example in image-text conversion and image-text generation. Electronic health records are the digital information, such as numbers, charts, and text, that medical staff generate through information systems in the course of medical activities. Deep learning-based multimodal fusion of electronic health records can help medical staff comprehensively analyze the large volume of multimodal medical data generated during diagnosis and treatment, thereby enabling accurate diagnosis and timely intervention for patients. In this article, we first introduce the methods and development trends of deep learning-based multimodal data fusion. Second, we summarize and compare studies that fuse structured electronic medical records with other medical data such as images and text, focusing on the clinical application scenarios, sample sizes, and fusion methods involved. From this analysis of the literature, we identify two deep learning approaches to fusing medical data of different modalities: first, selecting an appropriate pre-trained model for each data modality to obtain feature representations, which are then fused; and second, fusing on the basis of the attention mechanism. Finally, we discuss the difficulties encountered in multimodal medical data fusion and its development directions, including modeling methods and the evaluation and application of models. Through this review, we hope to provide a reference for building models that can make comprehensive use of medical data from various modalities.
Keywords: Deep learning; Electronic health records; Medical artificial intelligence; Multimodal fusion; Multimodal medical data.