arcasHLA: high-resolution HLA typing from RNAseq

Bioinformatics. 2020 Jan 1;36(1):33-40. doi: 10.1093/bioinformatics/btz474.

Abstract

Motivation: The human leukocyte antigen (HLA) locus plays a critical role in tissue compatibility and regulates the host response to many diseases, including cancers and autoimmune disorders. Recent improvements in the quality and accessibility of next-generation sequencing have made HLA typing from standard short-read data practical. However, this task remains challenging given the high level of polymorphism and homology between HLA genes. HLA typing from RNA sequencing is further complicated by post-transcriptional modifications and bias due to amplification.

Results: Here, we present arcasHLA: a fast and accurate in silico tool that infers HLA genotypes from RNA-sequencing data. Our tool outperforms established tools on the gold-standard benchmark dataset for HLA typing in terms of both accuracy and speed, with an accuracy rate of 100% at two-field resolution for Class I genes, and over 99.7% for Class II. Furthermore, we evaluate the performance of our tool on a new biological dataset of 447 single-end total RNA samples from nasopharyngeal swabs, and establish the applicability of arcasHLA in metatranscriptome studies.
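As an illustration of what accuracy "at two-field resolution" can mean, the following minimal Python sketch truncates allele names to their first two colon-separated fields (for example, A*02:01:01:01 becomes A*02:01) and counts the fraction of reference alleles that are matched. This is not arcasHLA's code or the paper's evaluation script: the function names, the nested-dictionary input layout, and the order-independent per-allele scoring rule are all assumptions made for the example, and allele names are assumed to be well formed.

```python
from collections import Counter


def to_two_field(allele: str) -> str:
    """Truncate an HLA allele name to two fields, e.g. A*02:01:01:01 -> A*02:01.

    Assumption: names follow the GENE*field:field[:field[:field]] pattern.
    Anything after the second field (including expression suffixes) is dropped.
    """
    gene, _, fields = allele.partition("*")
    return gene + "*" + ":".join(fields.split(":")[:2])


def allele_matches(predicted, truth) -> int:
    """Count reference alleles matched by a prediction, ignoring allele order."""
    pred = Counter(to_two_field(a) for a in predicted)
    true = Counter(to_two_field(a) for a in truth)
    # Multiset intersection: each predicted allele can match one reference allele.
    return sum((pred & true).values())


def accuracy(calls, reference) -> float:
    """Fraction of reference alleles called correctly at two-field resolution.

    Both arguments are hypothetical nested dicts: sample -> gene -> list of alleles.
    A sample or gene missing from `calls` counts as zero matches.
    """
    correct = total = 0
    for sample, genes in reference.items():
        for gene, truth in genes.items():
            predicted = calls.get(sample, {}).get(gene, [])
            correct += allele_matches(predicted, truth)
            total += len(truth)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Made-up alleles for illustration only, not data from the study.
    calls = {"s1": {"A": ["A*02:01:01:01", "A*24:02:01"],
                    "DRB1": ["DRB1*15:01:01", "DRB1*04:01:01"]}}
    reference = {"s1": {"A": ["A*24:02", "A*02:01"],
                        "DRB1": ["DRB1*15:01", "DRB1*04:05"]}}
    print(accuracy(calls, reference))
```

In this toy input, both HLA-A alleles match after truncation while only one of the two DRB1 alleles does, so three of four reference alleles are counted as correct. Other scoring conventions exist (for example, per-genotype rather than per-allele), and the choice affects the reported rate.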

Availability and implementation: arcasHLA is available at https://github.com/RabadanLab/arcasHLA.

Supplementary information: Supplementary data are available at Bioinformatics online.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Alleles
  • HLA Antigens* / genetics
  • High-Throughput Nucleotide Sequencing
  • Histocompatibility Antigens Class I* / classification
  • Histocompatibility Testing / methods
  • Humans
  • Sequence Analysis, RNA* / methods

Substances

  • HLA Antigens
  • Histocompatibility Antigens Class I