Background: The 2019 novel coronavirus (2019-nCoV or SARS-CoV-2) has spread more rapidly than any other betacoronavirus, including SARS-CoV and MERS-CoV. However, the mechanisms responsible for infection and the molecular evolution of this virus remain unclear.
Methods: We collected and analyzed 120 genomic sequences of 2019-nCoV, including 11 novel genomes from patients in China. Through comprehensive analysis of the available 2019-nCoV genome sequences, we tracked multiple heritable SNPs and determined the evolution of 2019-nCoV relative to other coronaviruses.
Results: Systematic analysis of the 120 genomic sequences of 2019-nCoV revealed co-circulation of two genetic subgroups with distinct SNP markers, which can be used to trace the spread of 2019-nCoV to different regions and countries. Although 2019-nCoV and human and bat SARS-CoV share high homology in overall genome structure, they evolved into two distinct groups with different receptor entry specificities, potentially through recombination in the receptor binding regions. In addition, 2019-nCoV has a unique four-amino-acid insertion between the S1 and S2 domains of the spike protein, which creates a potential furin or TMPRSS2 cleavage site.
Conclusions: Our study provides comprehensive insights into the evolution and spread of 2019-nCoV. Our results suggest that 2019-nCoV may have increased its infectivity through recombination in the receptor binding domain and insertion of a cleavage site.
One sentence summary: Novel 2019-nCoV sequences reveal the evolution and receptor specificity of betacoronaviruses and suggest possible mechanisms of enhanced infectivity.