The increasing number and diversity of genetically modified organisms (GMOs) on the food and feed market call for the development of advanced methods for their detection and identification. Next generation sequencing (NGS) can address this need. However, the efficiency of NGS-based strategies depends on the availability of bioinformatic methods for finding the sequences of the transgenic insert and its junction regions, which remains challenging. To facilitate this task, we have developed Nexplorer, a sequence-based database in which annotated sequences of GM events are stored in a structured, searchable and extractable format. As a proof of concept, we have developed a methodology that uses the database to analyse sequencing data from DNA walking libraries of samples containing GMOs. The efficiency of the method was tested on datasets representing various scenarios that can be encountered in routine GMO analysis. Database-guided analysis yielded detailed and reliable information with limited hands-on time. Because the database allows efficient analysis of NGS data, it paves the way for the use of NGS to aid routine detection and identification of GMOs.
Keywords: Data analysis workflow; GMO database; GMO detection; GMO identification; NGS.
© 2022 The Authors.