Structural variation (SV) is an important component of the diversity of the human genome. Many studies have shown that SV has a significant impact on human disease and is strongly associated with the development of cancer. In recent years, the Hi-C sequencing technique has been shown to be useful for detecting large-scale SVs, and several methods have been proposed for identifying SVs from Hi-C data. However, due to the complexity of the 3D genome structure, accurately identifying SVs from the Hi-C contact matrix remains a challenging task. Here, we present HiSVision, a method for identifying large-scale SVs from Hi-C data using a detection transformer framework. Inspired by object detection networks, we transform the Hi-C contact matrix into images, then identify candidate SV regions in the images with a detection transformer, and finally filter SVs based on features around the breakpoints. Experimental results show that HiSVision outperforms existing methods in terms of precision and F1 score on cancer cell lines and simulated datasets. The source code and data are available from https://github.com/dcy99/HiSVision.
Keywords: Detection transformer; Hi-C; Object detection; Structural variation.
© 2024. International Association of Scientists in the Interdisciplinary Areas.