Machine learning based lineage tree reconstruction improved with knowledge of higher level relationships between cells and genomic barcodes

NAR Genom Bioinform. 2023 Aug 21;5(3):lqad077. doi: 10.1093/nargab/lqad077. eCollection 2023 Sep.

Abstract

Tracking cells as they divide and progress through differentiation is a fundamental step in understanding many biological processes, such as the development of organisms and the progression of diseases. In this study, we investigate a machine learning approach to reconstructing lineage trees in experimental systems based on mutating synthetic genomic barcodes. We refine a previously proposed methodology by embedding information about higher-level relationships between cells and single-cell barcode values into a feature space. We test the performance of the algorithm on shallow trees (up to 100 cells) and deep trees (up to 10 000 cells). Our proposed algorithm can improve tree reconstruction accuracy compared with reconstructions based on a maximum parsimony method, but at the cost of longer computation time.
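
The abstract does not specify which features are used, so the following is only an illustrative sketch of what embedding pairwise and higher-level barcode relationships into a feature space could look like. It is not the authors' algorithm. The toy barcodes, the function names `hamming` and `pair_features`, and the "profile distance" feature are all assumptions introduced here: each pair of cells is described by its direct barcode distance and by how differently the two cells relate to every other cell.

```python
from itertools import combinations


def hamming(a, b):
    """Number of barcode positions at which two cells differ."""
    return sum(x != y for x, y in zip(a, b))


def pair_features(barcodes):
    """Map each cell pair to (direct distance, profile distance).

    The profile distance compares how the two cells relate to every
    *other* cell. It is a hypothetical stand-in for "higher-level"
    relationship information, not the feature set used in the paper.
    """
    names = sorted(barcodes)
    dist = {(p, q): hamming(barcodes[p], barcodes[q])
            for p in names for q in names}
    features = {}
    for p, q in combinations(names, 2):
        others = [r for r in names if r not in (p, q)]
        profile = sum(abs(dist[p, r] - dist[q, r]) for r in others)
        features[p, q] = (dist[p, q], profile)
    return features


# Toy data (assumed): four cells, five barcode sites, "0" = unmutated site.
cells = {
    "c1": "A0B00",
    "c2": "A0B0C",
    "c3": "0D00E",
    "c4": "0D0FE",
}

features = pair_features(cells)

# The pair with the smallest feature tuple is the most likely sister pair
# under this toy scoring; a trained classifier would replace this step.
closest = min(features, key=features.get)
```

In a learning-based setting, feature vectors like these would be passed to a trained model that scores how closely related two cells are, and the tree would be assembled from those scores. The `min` call above is only a placeholder for that step.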