Separating gene clustering in the rare mucopolysaccharidosis disease

J Appl Genet. 2022 May;63(2):361-368. doi: 10.1007/s13353-022-00691-2. Epub 2022 Mar 24.

Abstract

Rare disease datasets are typically structured such that a small number of patients (cases) are represented by multidimensional feature vectors. In this report, we considered a rare disease, mucopolysaccharidosis (MPS). This disease is divided into 11 types and subtypes, depending on the genetic defect, the type of deficient enzyme, and the nature of the accumulated glycosaminoglycan(s). Of these, 7 are considered potentially neuronopathic and 4 non-neuronopathic; for the former group, predicting the course of the disease is crucial for patient treatment and management. Here, we used transcriptomic data available for one patient from each MPS type/subtype. Our approach to gene grouping was based on minimizing the perceptron criterion function, which is a convex and piecewise linear (CPL) function. This approach allows complexes of linear classifiers to be designed from small samples of multivariate vectors. As a result, neuronopathic and non-neuronopathic forms of MPS could be distinguished through bioinformatic analysis of gene expression patterns, even though each MPS type was represented by only one patient. This approach could potentially also be used to assess other features of patients suffering from rare diseases for which a large body of data (such as transcriptomic data) is available from only one or a few representatives.
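To make the idea of a convex and piecewise linear criterion concrete, the following is a minimal sketch, not the authors' implementation. It assumes a common form of the perceptron criterion, a sum of penalties max(0, 1 - s * <w, y>) over augmented vectors y = (x, 1) with labels s = +1 or -1, and minimizes it by plain batch subgradient descent, which may differ from the procedure used in the paper. The function names, the step size, and the two-dimensional toy vectors are all hypothetical and are not MPS data.

```python
from typing import List, Sequence

Vector = List[float]


def augment(x: Sequence[float]) -> Vector:
    """Append a constant 1 so the bias is learned as the last weight."""
    return list(x) + [1.0]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(ai * bi for ai, bi in zip(a, b))


def perceptron_criterion(w: Sequence[float],
                         samples: Sequence[Sequence[float]],
                         labels: Sequence[int]) -> float:
    """Sum of CPL penalties max(0, 1 - s * <w, y>) over all samples."""
    return sum(max(0.0, 1.0 - s * dot(w, augment(x)))
               for x, s in zip(samples, labels))


def minimize_criterion(samples: Sequence[Sequence[float]],
                       labels: Sequence[int],
                       step: float = 0.1,
                       epochs: int = 1000) -> Vector:
    """Batch subgradient descent on the criterion (an assumed optimizer)."""
    dim = len(samples[0]) + 1
    w = [0.0] * dim
    for _ in range(epochs):
        grad = [0.0] * dim
        for x, s in zip(samples, labels):
            y = augment(x)
            # Only samples inside the margin contribute to the subgradient.
            if s * dot(w, y) < 1.0:
                for i in range(dim):
                    grad[i] -= s * y[i]
        if all(g == 0.0 for g in grad):
            break  # every sample is on the correct side with margin >= 1
        w = [wi - step * gi for wi, gi in zip(w, grad)]
    return w


# Hypothetical toy vectors: +1 and -1 stand in for two groups to separate.
samples = [(2.0, 1.0), (3.0, 2.0), (-1.0, -1.0), (-2.0, -0.5)]
labels = [1, 1, -1, -1]

w = minimize_criterion(samples, labels)
predictions = [1 if dot(w, augment(x)) > 0 else -1 for x in samples]
```

Because each penalty is the maximum of two linear functions of w, the summed criterion is convex and piecewise linear, and for linearly separable samples its minimum value is zero.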

Keywords: Complexes of linear classifiers; Data mining; Gene clustering; Perceptron criterion function; Rare diseases.

MeSH terms

  • Cluster Analysis
  • Humans
  • Mucopolysaccharidoses* / genetics
  • Rare Diseases*
  • Transcriptome / genetics