In clinical research, the analysis of patient cohorts is a widely used method for investigating healthcare questions. The ability to automatically extract large-scale patient cohorts from hospital systems is vital to unlocking the potential of real-world clinical data and answering pivotal medical questions through retrospective research studies. However, existing medical data are often dispersed across various systems and databases, which hinders systematic access and interoperability. Even when the data are readily accessible, clinical researchers need to sift through electronic medical records, confirm ethical approval, verify the status of patient consent, check the availability of imaging data, and filter the data based on disease-specific image biomarkers. We present Cohort Builder, a software pipeline designed to facilitate the creation of patient cohorts with predefined baseline characteristics from real-world ophthalmic imaging data and electronic medical records. The approach is applicable beyond ophthalmology to other medical domains with similar requirements, such as neurology, cardiology, and orthopedics.
Keywords: Biomedical Information Retrieval; Clinical Research; Data Pipeline; Ophthalmology; Real-World Data.