HomBlocks: A multiple-alignment construction pipeline for organelle phylogenomics based on locally collinear block searching

Genomics. 2018 Jan;110(1):18-22. doi: 10.1016/j.ygeno.2017.08.001. Epub 2017 Aug 3.

Abstract

Organelle phylogenomic analysis requires precisely constructed multi-gene alignment matrices concatenated from pre-aligned single-gene datasets. For non-bioinformaticians, it can take days to weeks to manually create high-quality multi-gene alignments comprising tens or hundreds of homologous genes. Here, we describe a new and highly efficient pipeline, HomBlocks, which uses a homologous block searching method to construct multiple sequence alignments. This approach automatically recognizes locally collinear blocks among organelle genomes and extracts phylogenetically informative regions to construct a multiple sequence alignment in a few hours. In addition, HomBlocks supports organelle genomes without annotation and adapts to different taxon datasets, thereby enabling the inclusion of as many common genes as possible. Topology comparison of trees built from conventional multi-gene alignments and from HomBlocks alignments, across different taxon categories, shows that HomBlocks achieves the same efficiency as the traditional method. The availability of HomBlocks makes organelle phylogenetic analyses more accessible to non-bioinformaticians, thereby promising a better understanding of phylogenetic relationships at the organelle genome level.
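The concatenation step that underlies any multi-gene or multi-block alignment matrix can be illustrated with a minimal sketch. This is not the HomBlocks implementation (which is written in Perl); it assumes the blocks have already been identified and aligned, and it shows only how such pre-aligned blocks might be joined into one matrix, padding taxa that are absent from a block with gaps. The names `concatenate_blocks` and `to_fasta` and the toy sequences are hypothetical, chosen for illustration.

```python
from typing import Dict, List

# Assumption: each block is a mapping of taxon name -> aligned sequence,
# with all sequences in one block having the same length.
Alignment = Dict[str, str]


def concatenate_blocks(blocks: List[Alignment], gap: str = "-") -> Alignment:
    """Concatenate pre-aligned blocks into a single alignment matrix.

    Taxa missing from a block are padded with gap characters so that
    every row of the result has the same length.
    """
    # Collect every taxon seen in any block, keeping first-seen order.
    taxa: List[str] = []
    for block in blocks:
        for name in block:
            if name not in taxa:
                taxa.append(name)

    rows: Dict[str, List[str]] = {name: [] for name in taxa}
    for index, block in enumerate(blocks):
        # A block is only usable if all of its rows have equal length.
        lengths = {len(seq) for seq in block.values()}
        if len(lengths) != 1:
            raise ValueError(
                f"block {index} is not aligned: lengths {sorted(lengths)}"
            )
        width = lengths.pop()
        for name in taxa:
            rows[name].append(block.get(name, gap * width))
    return {name: "".join(parts) for name, parts in rows.items()}


def to_fasta(alignment: Alignment) -> str:
    """Serialize an alignment as FASTA text, one sequence per record."""
    return "".join(f">{name}\n{seq}\n" for name, seq in alignment.items())


if __name__ == "__main__":
    # Toy data: the second block lacks taxonB, so it is gap-padded.
    blocks = [
        {"taxonA": "ATG-CA", "taxonB": "ATGGCA", "taxonC": "ATGACA"},
        {"taxonA": "TTAG", "taxonC": "TTGG"},
    ]
    print(to_fasta(concatenate_blocks(blocks)), end="")
```

Gap-padding absent taxa is one common convention for keeping the matrix rectangular. A pipeline that retains only regions shared by all genomes would drop such blocks instead.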

Availability and implementation: HomBlocks is implemented in Perl and runs on Unix-like operating systems, including Linux and macOS. The Perl source code is freely available for download from https://github.com/fenghen360/HomBlocks.git, and documentation and tutorials are available at https://github.com/fenghen360/HomBlocks.

Contact: [email protected] or [email protected].

Keywords: Alignment construction; Efficient pipeline; Locally collinear blocks; Organelle phylogenomics.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Genomics / methods*
  • Organelles / genetics*
  • Phylogeny*
  • Sequence Alignment / methods*
  • Software*