Microsatellites are repeats of 1- to 6-bp units, and approximately 10 million microsatellites have been identified across the human genome. Microsatellites are vulnerable to DNA mismatch errors and have thus been used to detect cancers with mismatch repair deficiency. To reveal the mutational landscape of microsatellite repeat regions at the genome level, we analyzed approximately 20.1 billion microsatellites in 2717 whole genomes of pan-cancer samples across 21 tissue types. First, we developed a new insertion and deletion caller (MIMcall) that takes into consideration the error patterns of different types of microsatellites. Among the 2717 pan-cancer samples, our analysis identified 31 samples, including colorectal, uterine, and stomach cancers, with a high proportion of mutated microsatellites (≥0.03), which we defined as microsatellite instability (MSI) cancers at the genome-wide level. Second, we found 20 highly mutated microsatellites that can be used to detect MSI cancers with high sensitivity. Third, we found that replication timing and DNA shape were significantly associated with the mutation rates of microsatellites. Last, analysis of mutations in mismatch repair genes showed that somatic SNVs and short indels had larger functional impacts than germline mutations and structural variations. Our analysis provides a comprehensive picture of mutations in microsatellite regions, reveals possible causes of these mutations, and provides a useful marker set for MSI detection.
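The genome-wide MSI definition above reduces to a simple threshold on the proportion of mutated microsatellites per sample. The following is a minimal sketch of that rule only, not the MIMcall implementation; the function names and the per-sample counts in the usage note are hypothetical, and the only value taken from the text is the 0.03 cutoff.

```python
# Illustrative sketch of the genome-wide MSI threshold rule.
# Only the 0.03 cutoff comes from the text; names and inputs are assumptions.

MSI_THRESHOLD = 0.03  # proportion of mutated microsatellites (>= 0.03 => MSI)


def mutated_proportion(n_mutated: int, n_analyzed: int) -> float:
    """Return the fraction of analyzed microsatellites that carry a mutation."""
    if n_analyzed <= 0:
        raise ValueError("n_analyzed must be positive")
    if not 0 <= n_mutated <= n_analyzed:
        raise ValueError("n_mutated must be between 0 and n_analyzed")
    return n_mutated / n_analyzed


def is_msi(n_mutated: int, n_analyzed: int,
           threshold: float = MSI_THRESHOLD) -> bool:
    """Classify a sample as MSI when its mutated proportion meets the cutoff."""
    return mutated_proportion(n_mutated, n_analyzed) >= threshold
```

For example, with hypothetical counts, a sample with 400 mutated microsatellites out of 10,000 analyzed would be classified as MSI under this rule, whereas a sample with 100 out of 10,000 would not.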
© 2020 Fujimoto et al.; Published by Cold Spring Harbor Laboratory Press.