Copy number variation (CNV) accounts for a large proportion of genomic structural variation and is associated with the pathogenesis of, and susceptibility to, some human diseases, playing an important role in their development and progression. Next-generation sequencing (NGS) technology provides strong support for the design of CNV detection algorithms. Although many methods have been developed to detect CNVs from NGS data, detecting CNVs in samples with low purity and low coverage is still considered a difficult problem. In this paper, a new computational method, CNV-FB, is proposed to detect CNVs from NGS data. The core idea of CNV-FB is to randomly sample the read depth values of genome fragments, perform outlier detection on each sample individually, and then combine the results into a final outlier score. CNV-FB was applied to simulated and real data and compared with five other methods of the same type. The results show that CNV-FB achieves better detection performance than the other methods. CNV-FB may therefore be an effective algorithm for detecting genomic mutations.
Keywords: Copy number variations; feature bagging; median absolute deviation; next-generation sequencing; outlier score.
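
The sample-score-combine idea described in the abstract can be illustrated with a minimal Python sketch. This is not the CNV-FB implementation. It assumes a one-dimensional array of per-bin read depth values, draws random subsamples of the bins, scores every bin by its deviation from the subsample median in units of the scaled median absolute deviation (MAD), and averages the per-round scores. The function names `mad_scores` and `bagged_outlier_scores`, the parameters `n_rounds` and `sample_frac`, and the choice of averaging as the combination rule are all hypothetical and chosen only for illustration.

```python
import numpy as np


def mad_scores(values, reference):
    """Score each value by its distance from the reference median, in MAD units."""
    med = np.median(reference)
    mad = np.median(np.abs(reference - med))
    # 1.4826 makes the MAD comparable to a standard deviation for normal data.
    # Fall back to a unit scale if the MAD is zero (assumption for this sketch).
    scale = 1.4826 * mad if mad > 0 else 1.0
    return np.abs(values - med) / scale


def bagged_outlier_scores(read_depth, n_rounds=50, sample_frac=0.5, seed=0):
    """Average MAD-based outlier scores over random subsamples of the bins."""
    rd = np.asarray(read_depth, dtype=float)
    rng = np.random.default_rng(seed)
    n_sample = min(rd.size, max(2, int(sample_frac * rd.size)))
    total = np.zeros(rd.size)
    for _ in range(n_rounds):
        # Each round estimates the baseline from a different random subsample.
        idx = rng.choice(rd.size, size=n_sample, replace=False)
        total += mad_scores(rd, rd[idx])
    # Combine the per-round scores into one final score per bin (here: the mean).
    return total / n_rounds


if __name__ == "__main__":
    # Toy data: 100 bins near depth 30, with bins 40-44 doubled to mimic a gain.
    depth = np.array([30 + (i % 5) - 2 for i in range(100)], dtype=float)
    depth[40:45] = 60.0
    scores = bagged_outlier_scores(depth)
    print(np.argsort(scores)[-5:])  # indices of the five highest-scoring bins
```

Bins with a high combined score are candidates for copy number gain or loss. A real pipeline would also need read depth normalization, segmentation, and a principled threshold, none of which are shown here.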