AngClust: Angle Feature-Based Clustering for Short Time Series Gene Expression Profiles

IEEE/ACM Trans Comput Biol Bioinform. 2023 Mar-Apr;20(2):1574-1580. doi: 10.1109/TCBB.2022.3192306. Epub 2023 Apr 3.

Abstract

When clustering gene expression profiles, genes in the same cluster are expected to have high correlation coefficients, and gene ontology (GO) enrichment analysis is expected to be significant for most clusters. However, existing short-term gene expression clustering algorithms have limitations. To address these limitations, we proposed a novel clustering process based on angular features for short-term gene expression. Our method (named AngClust) uses angular features to indicate the change of trend in gene expression levels between two neighboring time points. The changes of angles across multiple time points reflect the change of trend of the overall expression levels. Such changes are used to measure whether the expression trends of different genes are similar. To obtain functionally significant clusters from the clustering results, we evaluated the number of genes per cluster, the average correlation coefficient, and fluctuation, along with their correlation with GO term enrichment. On a dataset of yeast gene expression, AngClust outperforms two other measures, Euclidean distance (ED) and dynamic time warping of correlation (DTW). The proportion of AngClust clusters enriched for GO and pathway terms is higher than or equal to that of STEM and TMixClust on human, mouse, and yeast time series of gene expression.
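
The abstract does not give the exact formulas, so the following is only an illustrative sketch of what an angle-based trend comparison could look like, not the published AngClust procedure. It assumes that the angle of each segment between neighboring time points is the arctangent of its slope, that the "change of angle" is the difference between consecutive segment angles, and that two genes are compared by the Euclidean distance between their angle-change vectors. The function names (`angle_features`, `angle_changes`, `trend_distance`) and the default of unit time spacing are hypothetical choices made for this example.

```python
import math


def angle_features(expr, times=None):
    """Angle (in radians) of the line segment between each pair of
    neighboring time points. Assumption: angle = atan2(delta expr, delta time)."""
    if times is None:
        times = list(range(len(expr)))  # assumption: equally spaced samples
    return [
        math.atan2(expr[i + 1] - expr[i], times[i + 1] - times[i])
        for i in range(len(expr) - 1)
    ]


def angle_changes(expr, times=None):
    """Difference between consecutive segment angles, i.e. how the
    trend turns at each interior time point."""
    angles = angle_features(expr, times)
    return [angles[i + 1] - angles[i] for i in range(len(angles) - 1)]


def trend_distance(gene_a, gene_b, times=None):
    """Hypothetical dissimilarity: Euclidean distance between the
    angle-change vectors of two expression profiles of equal length."""
    changes_a = angle_changes(gene_a, times)
    changes_b = angle_changes(gene_b, times)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(changes_a, changes_b)))


# Two profiles with the same shape but different baselines,
# and one profile that peaks and then falls.
rising = [0.0, 1.0, 2.0]
rising_shifted = [5.0, 6.0, 7.0]
peak = [0.0, 1.0, 0.0]

print(trend_distance(rising, rising_shifted))  # identical trend shapes
print(trend_distance(rising, peak))            # different trend shapes
```

Under these assumptions the features ignore a constant offset in expression level, because only differences between neighboring time points enter the angles. They do depend on the scale of the expression values, so profiles would likely need to be normalized before comparison.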

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Animals
  • Cluster Analysis
  • Humans
  • Mice
  • Saccharomyces cerevisiae* / genetics
  • Time Factors
  • Transcriptome*