Integrating Biological and Radiological Data in a Structured Repository: a Data Model Applied to the COSMOS Case Study

J Digit Imaging. 2022 Aug;35(4):970-982. doi: 10.1007/s10278-022-00615-w. Epub 2022 Mar 16.

Abstract

Integrating information from biological samples with digital data, such as medical images, has gained prominence with the advent of precision medicine. Research in this field faces an ever-increasing amount of data to manage and, as a consequence, the need to structure these data in a functional and standardized fashion to promote and facilitate cooperation among institutions. Inspired by the Minimum Information About BIobank data Sharing (MIABIS), we propose an extended data model that aims to standardize data collections in which both biological and digital samples are involved. The proposed model places strong emphasis on the cause-effect relationships among factors, as these are frequently encountered in clinical workflows. To test the data model in a realistic context, we consider the Continuous Observation of SMOking Subjects (COSMOS) dataset as a case study, consisting of 10 consecutive years of lung cancer screening and follow-up on more than 5000 subjects. The structure of the COSMOS database, implemented to facilitate data retrieval, is therefore presented along with a description of the data that we hope to share in a public repository for lung cancer screening research.

Keywords: Lung cancer screening; Radiology workflow; Standardization; Structured reporting.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Databases, Factual
  • Early Detection of Cancer*
  • Humans
  • Information Storage and Retrieval
  • Lung Neoplasms* / diagnostic imaging
  • Smoking