Significance: Optimal meibography utilization and interpretation are hindered by poor lid presentation, blurry images, image artifacts, and the challenges of applying clinical grading scales. These results, from the largest image dataset analyzed to date, demonstrate the development of algorithms that provide standardized, real-time inference addressing all of these limitations.
Purpose: This study aimed to develop and validate an algorithmic pipeline to automate and standardize meibomian gland absence assessment and interpretation.
Methods: A total of 143,476 images were collected from sites across North America. Ophthalmologist and optometrist experts established ground-truth image quality and quantification (i.e., degree of gland absence). Annotated images were allocated into training, validation, and test sets. Convolutional neural networks within Google Cloud VertexAI trained three locally deployable or edge-based predictive models: image quality detection, over-flip detection, and gland absence detection. The algorithms were combined into an algorithmic pipeline onboard a LipiScan Dynamic Meibomian Imager to provide real-time clinical inference for new images. Performance metrics were generated for each algorithm in the pipeline onboard the LipiScan from naive image test sets.
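As an illustration only, the chaining of the three models into one pipeline might look like the minimal Python sketch below. The abstract does not describe the gating order, class labels, or interfaces, so everything here is an assumption: the names `run_pipeline` and `PipelineResult`, the labels `"acceptable"` and `"not_over_flipped"`, and the choice to stop early when an image fails a check are all hypothetical, and each model is represented by a plain callable rather than a deployed network.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical interface: each model maps raw image bytes to a class label.
Classifier = Callable[[bytes], str]


@dataclass
class PipelineResult:
    image_quality: str
    over_flip: Optional[str] = None      # None if the stage was not reached
    gland_absence: Optional[str] = None  # None if the stage was not reached


def run_pipeline(
    image: bytes,
    quality_model: Classifier,
    over_flip_model: Classifier,
    gland_absence_model: Classifier,
) -> PipelineResult:
    """Run the three models in sequence, stopping at the first failed check.

    The early-exit gating and the label strings are assumptions made for
    this sketch; they are not taken from the study.
    """
    quality = quality_model(image)
    if quality != "acceptable":
        # Assumed behavior: a poor-quality image is not graded further.
        return PipelineResult(image_quality=quality)

    flip = over_flip_model(image)
    if flip != "not_over_flipped":
        # Assumed behavior: an over-flipped lid is not graded further.
        return PipelineResult(image_quality=quality, over_flip=flip)

    grade = gland_absence_model(image)
    return PipelineResult(image_quality=quality, over_flip=flip, gland_absence=grade)
```

In this sketch, a result whose `gland_absence` field is `None` signals that the image should be retaken before grading.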
Results: Individual model performance metrics included the following: weighted average precision (image quality detection: 0.81, over-flip detection: 0.88, gland absence detection: 0.84), weighted average recall (image quality detection: 0.80, over-flip detection: 0.87, gland absence detection: 0.80), weighted average F1 score (image quality detection: 0.80, over-flip detection: 0.87, gland absence detection: 0.81), overall accuracy (image quality detection: 0.80, over-flip detection: 0.87, gland absence detection: 0.80), Cohen κ (image quality detection: 0.60, over-flip detection: 0.62, gland absence detection: 0.71), Kendall τb (image quality detection: 0.61, p<0.001; over-flip detection: 0.63, p<0.001; gland absence detection: 0.67, p<0.001), and Matthews correlation coefficient (image quality detection: 0.61, over-flip detection: 0.63, gland absence detection: 0.62). Area under the precision-recall curve (image quality detection: 0.87, over-flip detection: 0.92, gland absence detection: 0.89) and area under the receiver operating characteristic curve (image quality detection: 0.88, over-flip detection: 0.91, gland absence detection: 0.93) were calculated across a common set of thresholds ranging from 0 to 1.
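For readers less familiar with these agreement statistics, the following Python sketch shows the standard definitions of support-weighted precision, recall, and F1, Cohen κ, and the multiclass Matthews correlation coefficient, computed from paired expert and model labels. It is a dependency-free illustration under stated assumptions, not the study's evaluation code: the function names are hypothetical and the toy labels in the usage note are invented for demonstration.

```python
from collections import Counter
from math import sqrt


def weighted_prf(y_true, y_pred):
    """Support-weighted average precision, recall, and F1 across classes."""
    n = len(y_true)
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    precision = recall = f1 = 0.0
    for label, support in true_freq.items():
        p_k = hits[label] / pred_freq[label] if pred_freq[label] else 0.0
        r_k = hits[label] / support
        f_k = 2 * p_k * r_k / (p_k + r_k) if (p_k + r_k) else 0.0
        weight = support / n  # weight each class by its share of true labels
        precision += weight * p_k
        recall += weight * r_k
        f1 += weight * f_k
    return precision, recall, f1


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    p_observed = sum(t == p for t, p in zip(y_true, y_pred)) / n
    p_expected = sum(true_freq[k] * pred_freq[k] for k in true_freq) / (n * n)
    if p_expected == 1.0:
        return 0.0  # degenerate case: a single class on both sides
    return (p_observed - p_expected) / (1 - p_expected)


def matthews_corrcoef(y_true, y_pred):
    """Multiclass Matthews correlation coefficient."""
    n = len(y_true)
    true_freq, pred_freq = Counter(y_true), Counter(y_pred)
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    numerator = correct * n - sum(true_freq[k] * pred_freq[k] for k in true_freq)
    denominator = sqrt(n * n - sum(v * v for v in pred_freq.values())) * sqrt(
        n * n - sum(v * v for v in true_freq.values())
    )
    return numerator / denominator if denominator else 0.0
```

With the invented toy labels `y_true = [0, 0, 1, 1]` and `y_pred = [0, 0, 1, 0]`, `cohen_kappa` returns 0.5 and `matthews_corrcoef` returns about 0.577. Support-weighted recall is algebraically identical to overall accuracy, which is consistent with the matching recall and accuracy values reported for each model.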
Conclusions: Comparison of each model's predictions with the expert-panel ground truth demonstrated strong association and moderate to substantial agreement. The findings and performance metrics show that the pipeline of algorithms provides standardized, real-time prediction of meibomian gland absence.
Copyright © 2025 The Author(s). Published by Wolters Kluwer Health, Inc. on behalf of the American Academy of Optometry.