Systematic comparative study of computational methods for T-cell receptor sequencing data analysis

Brief Bioinform. 2019 Jan 18;20(1):222-234. doi: 10.1093/bib/bbx111.

Abstract

High-throughput sequencing technologies have opened up the possibility of in-depth evaluation of T-cell receptor (TCR) repertoires. Such studies are highly relevant for gaining insights into human adaptive immunity and for deciphering the composition and diversity of antigen receptors in physiological and disease conditions. The major objectives of TCR sequencing data analysis are the identification of V, D and J gene segments, the extraction of complementarity-determining region 3 (CDR3) sequences and clonality analysis. With the advancement of sequencing technologies, new TCR analysis approaches and programs have been developed. However, there is still a lack of systematic comparative studies to assist in the selection of an optimal analysis approach. Here, we present a detailed comparison of 10 state-of-the-art TCR analysis tools on samples of different complexities, taking into account aspects such as clonotype detection [unique V(D)J combination], CDR3 identification and accuracy of error correction. We used our own in silico and experimental data sets with known clonalities, enabling the identification of potential tool biases. We also established a new strategy, named clonal plane, which allows the clonality of multiple samples to be quantified and compared. Our results provide new insights into the effect of method selection on analysis results and will assist users in the selection of an appropriate analysis method.
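To make the notions of clonotype detection and clonality concrete, the following minimal Python sketch groups annotated reads by their V(D)J combination and summarizes the resulting distribution with a single number. It is an illustration only: the record layout (keys `v`, `d`, `j`), the function names and the toy reads are assumptions, and the measure used (one minus normalized Shannon entropy) is a generic clonality index, not the clonal plane strategy introduced in the paper and not the method of any of the compared tools.

```python
from collections import Counter
from math import log


def clonotype_counts(reads):
    """Count reads per clonotype, keyed by the (V, D, J) gene assignment.

    `reads` is a hypothetical list of dicts with keys "v", "d" and "j",
    as might be produced by any V(D)J annotation step.
    """
    return Counter((r["v"], r["d"], r["j"]) for r in reads)


def shannon_clonality(counts):
    """Return 1 - H / log(N) for a mapping of clonotype -> read count.

    H is the Shannon entropy of the clonotype frequencies and N the number
    of distinct clonotypes. The value is 0 for a perfectly even repertoire
    and 1 for a monoclonal one. This is a generic index chosen for the
    sketch, not the paper's clonal plane.
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads to analyse")
    n = len(counts)
    if n == 1:
        return 1.0  # a single clonotype is maximally clonal by convention
    entropy = -sum((c / total) * log(c / total) for c in counts.values())
    return 1.0 - entropy / log(n)


# Toy, made-up annotations: three reads of one clonotype, one of another.
toy_reads = [
    {"v": "TRBV20-1", "d": "TRBD1", "j": "TRBJ2-1"},
    {"v": "TRBV20-1", "d": "TRBD1", "j": "TRBJ2-1"},
    {"v": "TRBV20-1", "d": "TRBD1", "j": "TRBJ2-1"},
    {"v": "TRBV7-9", "d": "TRBD2", "j": "TRBJ2-7"},
]

toy_counts = clonotype_counts(toy_reads)
toy_clonality = shannon_clonality(toy_counts)
```

Because the index depends only on the clonotype counts, running the same sample through different annotation tools and comparing the resulting values is one simple way to see how tool choice could shift a clonality estimate.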

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Base Sequence
  • Computational Biology / methods
  • Computer Simulation
  • Databases, Genetic / statistics & numerical data
  • HeLa Cells
  • High-Throughput Nucleotide Sequencing / statistics & numerical data
  • Humans
  • Jurkat Cells
  • Receptors, Antigen, T-Cell / genetics*
  • Sequence Analysis / statistics & numerical data
  • T-Lymphocytes / immunology

Substances

  • Receptors, Antigen, T-Cell