The human breast is composed of diverse cell types. Studies have delineated mammary epithelial cells, but the other cell types in the breast have scarcely been characterized. To gain insight into the cellular composition of the tissue, we performed droplet-mediated RNA sequencing of 3193 single cells isolated from postmenopausal breast tissue without enriching for epithelial cells. Unbiased clustering analysis identified 10 distinct cell clusters, seven of which were nonepithelial and devoid of cytokeratin expression. The remaining three cell clusters expressed cytokeratins (CKs), representing breast epithelial cells: Cluster 2 and Cluster 7 cells expressed luminal and basal CKs, respectively, whereas Cluster 9 cells expressed both luminal and basal CKs, as well as other CKs of unknown specificity. To assess which cell types potentially contribute to breast cancer, we used the differential gene expression signature of each cell cluster to derive gene set variation analysis (GSVA) scores and classified breast tumors in The Cancer Genome Atlas (TCGA) dataset (n = 1100) by assigning each tumor the number of its highest-scoring cell cluster. The results showed that five clusters (Clusters 2, 3, 7, 8, and 9) collectively accounted for >85% of breast tumors. Notably, Cluster 2 (luminal epithelial) and Cluster 3 (fibroblast) tumors were equally prevalent in the luminal breast cancer subtypes, whereas Cluster 7 (basal epithelial) and Cluster 9 (other epithelial) tumors were present primarily in the triple-negative breast cancer (TNBC) subtype. Cluster 8 (immune) tumors were present in all subtypes, indicating that immune cells may contribute to breast cancer regardless of subtype. Cluster 9 tumors were significantly associated with poor patient survival in TNBC, suggesting that this epithelial cell type may give rise to an aggressive TNBC subset.
Keywords: GSVA; TCGA breast cancer dataset; breast cancer; cluster analysis; cytokeratin expression; mammary epithelial cells; mammary fibroblasts; normal breast; single-cell RNA sequencing; triple-negative breast cancer.
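
The tumor classification step described in the abstract (each tumor is labeled with the cell cluster whose signature gives the highest GSVA score) can be illustrated with a minimal sketch. It assumes the per-tumor, per-cluster GSVA scores have already been computed elsewhere; the function names `assign_tumors_to_clusters` and `cluster_fractions` and the toy score matrix are hypothetical and are not taken from the study.

```python
from collections import Counter

import numpy as np


def assign_tumors_to_clusters(scores, cluster_ids):
    """Label each tumor with the cluster that has the highest score.

    scores: array-like of shape (n_tumors, n_clusters) holding precomputed
            GSVA-style enrichment scores (assumed input, not computed here).
    cluster_ids: cluster numbers, one per column of `scores`.
    """
    scores = np.asarray(scores, dtype=float)
    if scores.ndim != 2 or scores.shape[1] != len(cluster_ids):
        raise ValueError("scores must be 2-D with one column per cluster id")
    # Column index of the row-wise maximum; ties go to the first maximum.
    best_columns = np.argmax(scores, axis=1)
    return [cluster_ids[i] for i in best_columns]


def cluster_fractions(assignments):
    """Return the fraction of tumors assigned to each cluster."""
    counts = Counter(assignments)
    total = len(assignments)
    return {cluster: n / total for cluster, n in counts.items()}


# Toy data for illustration only: 4 hypothetical tumors scored against
# 3 hypothetical cluster signatures (columns correspond to Clusters 2, 3, 9).
toy_cluster_ids = [2, 3, 9]
toy_scores = [
    [0.42, -0.10, 0.05],
    [-0.30, 0.25, 0.10],
    [0.01, 0.02, 0.60],
    [0.50, 0.10, -0.20],
]

toy_assignments = assign_tumors_to_clusters(toy_scores, toy_cluster_ids)
toy_fractions = cluster_fractions(toy_assignments)
print(toy_assignments)
print(toy_fractions)
```

Summing the fractions over a chosen set of clusters gives the share of tumors that set accounts for, which is the kind of quantity reported for Clusters 2, 3, 7, 8, and 9.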