Functional prediction and comparative population analysis of variants in genes for proteases and innate immunity related to SARS-CoV-2 infection

Infect Genet Evol. 2020 Oct:84:104498. doi: 10.1016/j.meegid.2020.104498. Epub 2020 Aug 7.

Abstract

The novel coronavirus SARS-CoV-2 can infect humans and cause a new disease, COVID-19. To understand the host genetic component of COVID-19, we focused on variants in genes encoding proteases and genes involved in innate immunity that could be important for susceptibility and resistance to SARS-CoV-2 infection. Analysis of sequence data for the coding regions of the FURIN, PLG, PRSS1, TMPRSS11a, MBL2 and OAS1 genes in 143 unrelated individuals from the Serbian population identified 22 variants with a potential functional effect. In silico analyses (PolyPhen-2, SIFT, MutPred2 and Swiss-Pdb Viewer) predicted that 10 variants could affect protein structure and/or function. These protein-altering variants (p.Gly146Ser in FURIN; p.Arg261His and p.Ala494Val in PLG; p.Asn54Lys in PRSS1; p.Arg52Cys, p.Gly54Asp and p.Gly57Glu in MBL2; p.Arg47Gln, p.Ile99Val and p.Arg130His in OAS1) may have predictive value for inter-individual differences in the response to SARS-CoV-2 infection. Next, we performed a comparative population analysis of the same variants using data extracted from the 1000 Genomes Project. Population genetic variability was assessed using delta MAF (the difference in minor allele frequency) and Fst statistics. Our study identified 7 variants in the PLG, TMPRSS11a, MBL2 and OAS1 genes with noticeable divergence in allele frequencies between populations worldwide. Three of them, all in the MBL2 gene, were predicted to be damaging, making them the most promising population-specific markers related to SARS-CoV-2 infection. Comparing allele frequencies between the Serbian population and others, we found that genetic divergence at the selected loci was highest with African populations, followed by East Asian, Central and South American, and South Asian populations. Among European populations, the highest divergence was observed with the Italian population.
In conclusion, we identified 4 variants in genes encoding proteases (FURIN, PLG and PRSS1) and 6 in genes involved in innate immunity (MBL2 and OAS1) that might be relevant to the host response to SARS-CoV-2 infection.
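
The abstract names delta MAF and Fst as its measures of population divergence but does not say which Fst estimator was used. The following is a minimal sketch, assuming delta MAF is the absolute difference in the frequency of the same allele between two populations and assuming Wright's heterozygosity-based Fst for a single biallelic locus with equally weighted populations. The function names and the frequencies in the usage lines are hypothetical and are not taken from the study.

```python
def delta_maf(p1, p2):
    """Absolute difference in the frequency of the same allele
    between two populations (assumed definition of delta MAF)."""
    return abs(p1 - p2)


def wright_fst(freqs):
    """Wright's Fst for one biallelic locus.

    freqs: frequencies of the same allele in each population.
    Populations are weighted equally (an assumption; sample-size
    weighted estimators would give different values).
    """
    if not freqs:
        raise ValueError("at least one population frequency is required")
    p_bar = sum(freqs) / len(freqs)
    # Expected heterozygosity in the pooled (total) population.
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        # Allele is fixed or absent everywhere: no variation to partition.
        return 0.0
    # Mean expected heterozygosity within populations.
    h_s = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies in two populations (illustration only).
pop_a, pop_b = 0.1, 0.5
print(delta_maf(pop_a, pop_b))
print(wright_fst([pop_a, pop_b]))
```

Under these assumptions, identical frequencies give an Fst of 0 and complete fixation of alternative alleles gives 1, so larger values of either statistic indicate stronger divergence at the locus.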

Keywords: Allele frequencies; COVID-19; Functional prediction; Gene variants; Host genomics; Population genomics; SARS-CoV-2; Susceptibility and resistance.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alleles
  • Angiotensin-Converting Enzyme 2
  • Betacoronavirus / immunology
  • Betacoronavirus / pathogenicity
  • COVID-19
  • Coronavirus Infections / genetics*
  • Coronavirus Infections / immunology
  • Disease Resistance / genetics*
  • Eye Proteins / genetics
  • Eye Proteins / immunology
  • Furin / genetics
  • Furin / immunology
  • Gene Frequency
  • Genetic Predisposition to Disease*
  • Genetic Variation
  • Genome, Human
  • Host-Pathogen Interactions / genetics*
  • Host-Pathogen Interactions / immunology
  • Humans
  • Immunity, Innate
  • Mannose-Binding Lectin / genetics
  • Mannose-Binding Lectin / immunology
  • Membrane Glycoproteins / genetics
  • Membrane Glycoproteins / immunology
  • Metagenomics*
  • Pandemics
  • Peptidyl-Dipeptidase A / genetics*
  • Peptidyl-Dipeptidase A / immunology
  • Plasminogen / genetics
  • Plasminogen / immunology
  • Pneumonia, Viral / genetics*
  • Pneumonia, Viral / immunology
  • Protein Binding
  • SARS-CoV-2
  • Spike Glycoprotein, Coronavirus / genetics*
  • Spike Glycoprotein, Coronavirus / immunology
  • Trypsin / genetics
  • Trypsin / immunology

Substances

  • Eye Proteins
  • GPR143 protein, human
  • MBL2 protein, human
  • Mannose-Binding Lectin
  • Membrane Glycoproteins
  • Spike Glycoprotein, Coronavirus
  • spike protein, SARS-CoV-2
  • Plasminogen
  • Peptidyl-Dipeptidase A
  • ACE2 protein, human
  • Angiotensin-Converting Enzyme 2
  • PRSS1 protein, human
  • Trypsin
  • FURIN protein, human
  • Furin