EEfinder, a general purpose tool for identification of bacterial and viral endogenized elements in eukaryotic genomes

Comput Struct Biotechnol J. 2024 Oct 18:23:3662-3668. doi: 10.1016/j.csbj.2024.10.012. eCollection 2024 Dec.

Abstract

Horizontal gene transfer is the transmission of genetic material between species that have no parental relationship. It has been characterized among several major branches of life, including prokaryotes, viruses and eukaryotes. The characterization of endogenous elements derived from viruses or bacteria provides a snapshot of past host-pathogen interactions and coevolution, as well as reference information for removing false positive results from metagenomic studies. Currently, there is a lack of general purpose, standardized tools for endogenous element screening, which limits reproducibility and hinders comparative analysis between studies. Here we describe EEfinder, a new general purpose tool for the identification and classification of endogenous elements derived from viruses or bacteria found in eukaryotic genomes. The tool was developed to include six common steps performed in this type of analysis: data cleaning, similarity search through sequence alignment, filtering of candidate elements, taxonomy assignment, merging of truncated elements and extraction of flanking regions. We evaluated the sensitivity of EEfinder through comparative analysis using data from the literature. EEfinder automatically detected 97% of the endogenous viral elements (EVEs) reported in published results obtained by manual curation, and it recovered an almost exact full integration of a Wolbachia genome previously described using wet-lab experiments. Therefore, EEfinder can effectively and systematically identify endogenous elements of bacterial or viral origin integrated in eukaryotic genomes. EEfinder is publicly available at https://github.com/WallauBioinfo/EEfinder.
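
To make the last two steps of the list above more concrete (merging of truncated elements and extraction of flanking regions), the following is a minimal Python sketch. It is not EEfinder's code and makes no claim about how the tool implements these steps. The `Hit` record, the `merge_hits` and `flank_coords` functions, and the `max_gap` and `flank` parameters with their default values are all hypothetical names chosen for illustration. The sketch assumes that filtered alignment hits are available as 0-based, half-open intervals on a contig, and that two hits are candidates for merging when they share contig, strand and taxonomic label and lie within `max_gap` bases of each other.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical record for one filtered alignment hit on a host contig."""
    contig: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"
    taxon: str   # illustrative taxonomy label, e.g. a viral family


def merge_hits(hits, max_gap=100):
    """Merge hits on the same contig, strand and taxon that lie within
    max_gap bases of each other (assumed criterion, for illustration)."""
    merged = []
    ordered = sorted(hits, key=lambda h: (h.contig, h.strand, h.taxon, h.start))
    for hit in ordered:
        last = merged[-1] if merged else None
        same_group = (
            last is not None
            and (last.contig, last.strand, last.taxon)
            == (hit.contig, hit.strand, hit.taxon)
        )
        if same_group and hit.start - last.end <= max_gap:
            # Extend the previous element to cover the current hit.
            merged[-1] = Hit(last.contig, last.start,
                             max(last.end, hit.end), last.strand, last.taxon)
        else:
            merged.append(hit)
    return merged


def flank_coords(hit, contig_len, flank=1000):
    """Return (left, right) flank intervals, clipped to the contig bounds."""
    left = (max(0, hit.start - flank), hit.start)
    right = (hit.end, min(contig_len, hit.end + flank))
    return left, right
```

With this sketch, two hits on the same strand separated by 50 bases would be reported as a single element under the default `max_gap`, while a hit on the opposite strand would stay separate. The flank intervals can then be used to slice the contig sequence for downstream inspection of the integration site.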

Keywords: Bacteria; Comparative genomics; Genome evolution; Integration; Symbionts; Virus.

Associated data

  • figshare/10.6084/m9.figshare.25864525.v2