SARS-CoV-2 virus pathogenicity and transmissibility are correlated with the mutations acquired over time, giving rise to variants of concern (VOCs). Mutations can significantly influence the genetic make-up of the virus. Herein, we analyzed the SARS-CoV-2 genomes and sub-genomic nucleotide composition in relation to the mutation rate. Nucleotide percentage distributions of 1397 in-house-sequenced SARS-CoV-2 genomes were enumerated, and comparative analyses (i) within the VOCs and of (ii) recovered and mortality patients were performed. Fisher's test was carried out to highlight the significant mutations, followed by RNA secondary structure prediction and protein modeling for their functional impacts. Subsequently, a uniform dinucleotide composition of AT and GC was found across study cohorts. Notably, the N gene was observed to have a high GC percentage coupled with a relatively higher mutation rate. Functional analysis demonstrated the N gene mutations, C29144T and G29332T, to induce structural changes at the RNA level. Protein secondary structure prediction with N gene missense mutations revealed a differential composition of alpha helices, beta sheets, and coils, whereas the tertiary structure displayed no significant changes. Additionally, the N gene CTD region displayed no mutations. The analysis highlighted the importance of N protein in viral evolution with CTD as a possible target for antiviral drugs.
Keywords: RNA secondary structure; VOCs; molecular modeling; mutation analysis; nucleotide diversity.