Background: Germline variants of ten keratin genes (K1, K2, K5, K6A, K6B, K9, K10, K14, K16, and K17) have been reported to cause different types of genodermatoses with an autosomal dominant mode of inheritance. Most variants of these ten keratin genes are missense variants. Unlike pathogenic and likely pathogenic variants, novel missense variants and variants of uncertain significance (VUS) are difficult to interpret clinically, which poses a major challenge for clinicians and medical geneticists. Functional characterization is the only way to establish the clinical association of novel missense variants or VUS, but it is time-consuming, costly, and dependent on the availability of patient samples. Existing databases report the pathogenic variants of the keratin genes, but do not address the systematic effects of these variants on keratin protein structure or on genotype-phenotype correlation.
Results: To address this need, we developed a comprehensive database, KVarPredDB, which contains information on all ten keratin genes associated with genodermatoses. We integrated and curated 400 reported pathogenic missense variants as well as 4629 missense VUS. KVarPredDB predicts the pathogenicity of novel missense variants and helps assess the severity of the disease phenotype based on four criteria: first, the difference in physico-chemical properties between the wild-type and substituted amino acids; second, the loss of inter- or intra-chain interactions; third, the evolutionary conservation of the wild-type amino acid; and fourth, the effect of the substituted amino acid on the heptad repeat. Molecular docking simulations based on resolved crystal structures were used to predict stability changes and to estimate binding energies, so that the wild-type protein could be compared with the mutated one. We use this information to determine the structural and functional impact of novel missense variants on the keratin coiled-coil heterodimer. KVarPredDB was built with the integrative web application development framework SSM (SpringBoot, Spring MVC, MyBatis) and implemented using Java, Bootstrap, React-mutation-mapper, MySQL, and Tomcat. The website can be accessed at http://bioinfo.zju.edu.cn/KVarPredDB. The genomic variants and analysis results are freely available under a Creative Commons license.
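As a rough illustration of how the first and fourth criteria in the Results could be evaluated for a single substitution, the Python sketch below flags a change of side-chain class and locates the residue within the heptad repeat. It is not the KVarPredDB implementation: the names `AA_CLASS`, `heptad_position`, `flag_variant`, and `VariantFlags` are hypothetical, the four-class amino acid grouping is a coarse textbook-style simplification, and the heptad register (the index of a residue known to occupy position a) is assumed to be supplied by the caller.

```python
from dataclasses import dataclass

# Coarse side-chain classes, chosen for illustration only (an assumption,
# not the property scale used by KVarPredDB).
AA_CLASS = {
    **dict.fromkeys("GAVLIMFWP", "nonpolar"),
    **dict.fromkeys("STCYNQ", "polar"),
    **dict.fromkeys("KRH", "basic"),
    **dict.fromkeys("DE", "acidic"),
}

HEPTAD = "abcdefg"  # positions a and d usually form the hydrophobic core


def heptad_position(residue_index: int, first_a_index: int) -> str:
    """Return the heptad letter of a residue, given the index of any
    residue known to sit at position 'a' (the register is an input)."""
    return HEPTAD[(residue_index - first_a_index) % 7]


@dataclass
class VariantFlags:
    property_change: bool  # criterion 1: side-chain class differs
    heptad_position: str   # criterion 4: where the residue sits in the repeat
    core_position: bool    # True when the residue is at position a or d


def flag_variant(wild: str, mutant: str,
                 residue_index: int, first_a_index: int) -> VariantFlags:
    """Flag a missense substitution using the two simplified criteria."""
    pos = heptad_position(residue_index, first_a_index)
    return VariantFlags(
        property_change=AA_CLASS[wild] != AA_CLASS[mutant],
        heptad_position=pos,
        core_position=pos in "ad",
    )


# Made-up indices: an Arg-to-Cys substitution ten residues after an 'a' position.
flags = flag_variant("R", "C", residue_index=13, first_a_index=3)
```

In this sketch a substitution is only flagged, not scored; combining the flags with the interaction and conservation criteria into a pathogenicity call would require the rules used by the database itself, which are not specified here.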
Conclusions: KVarPredDB provides an intuitive, user-friendly interface together with a computational analysis of each missense variant of the keratin genes associated with genodermatoses.
Keywords: Database; Genodermatoses; Keratin genes; Missense variants; Novel variants; Pathogenicity.