Glioblastoma ('GBM') is the most aggressive type of primary malignant adult brain tumor, with very heterogeneous radiographic, histologic, and molecular profiles. A growing body of advanced computational analyses is being conducted to further understand the biology and variation in glioblastoma. To address the intrinsic heterogeneity among different computational studies, reference standards have been established to facilitate both radiographic and molecular analyses, e.g., anatomical atlases for image registration and housekeeping genes, respectively. However, there is an apparent lack of reference standards in the domain of digital pathology, where each independent study uses an arbitrarily chosen slide from its evaluation dataset for normalization purposes. In this study, we introduce a novel stain normalization approach based on a composite reference slide composed of information from a large population of anatomically annotated hematoxylin and eosin ('H&E') whole-slide images from the Ivy Glioblastoma Atlas Project ('IvyGAP'). Two board-certified neuropathologists manually reviewed and selected annotations in 509 slides, according to the World Health Organization definitions. We computed summary statistics from each of these approved annotations and weighted them based on their percent contribution to the overall slide ('PCOS'), to form a global histogram and stain vectors. Quantitative evaluation of pre- and post-normalization stain density statistics for each annotated region with PCOS > 0.05% yielded a significant (largest p = 0.001, two-sided Wilcoxon rank sum test) reduction in intensity variation for both 'H' and 'E'. Subject to further large-scale evaluation, our findings support the proposed approach as a potentially robust population-based reference for stain normalization.
Keywords: Brain tumor; Computational pathology; Digital pathology; Glioblastoma; Histology; Pre-processing; Stain normalization.
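
The PCOS weighting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `composite_histogram`, its arguments, and the specific scheme (a PCOS-weighted average of per-region histograms, each first normalized to unit sum) are assumptions made here for illustration only.

```python
import numpy as np


def composite_histogram(region_histograms, pcos_weights):
    """Combine per-region histograms into one weighted reference histogram.

    Hypothetical helper: the weighting scheme is an assumption, not the
    published method. Each row of `region_histograms` holds the bin counts
    of one annotated region; `pcos_weights` holds that region's percent
    contribution to the overall slide (PCOS).
    """
    hists = np.asarray(region_histograms, dtype=float)
    weights = np.asarray(pcos_weights, dtype=float)

    if hists.ndim != 2 or weights.ndim != 1 or hists.shape[0] != weights.shape[0]:
        raise ValueError("expected one PCOS weight per histogram row")

    # Normalize each region's histogram to unit sum, so that the PCOS weight
    # (rather than the raw pixel count) controls its influence.
    row_sums = hists.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every region histogram must contain at least one count")
    hists = hists / row_sums

    # Rescale the weights to sum to 1, then take the weighted average.
    total = weights.sum()
    if total <= 0:
        raise ValueError("PCOS weights must sum to a positive value")
    return (weights / total) @ hists


# Two toy regions with 4 bins each, weighted 75% and 25%.
reference = composite_histogram([[1, 1, 0, 0], [0, 0, 2, 2]], [75.0, 25.0])
# reference is [0.375, 0.375, 0.125, 0.125]
```

Normalizing each region before weighting is one possible design choice; weighting the raw counts directly would let regions with more pixels contribute more than their PCOS alone implies.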