Circular RNAs (circRNAs) are a large family of RNAs found widely in cells and tissues. High-throughput sequencing makes it possible to profile circRNAs across different samples, particularly in various diseases. However, there is still no comprehensive database for exploring cancer-specific circRNAs. We collected 228 total RNA or polyA(-) RNA-seq samples from both cancer and normal cell lines, and identified 272 152 cancer-specific circRNAs. A total of 950 962 circRNAs were identified in normal samples only, and 170 909 circRNAs were identified in both tumor and normal samples, which could further be used as a non-tumor background. From these data we constructed a cancer-specific circRNA database (CSCD, http://gb.whu.edu.cn/CSCD). To understand the functional effects of circRNAs, we predicted the microRNA response element sites and RNA binding protein sites for each circRNA. We further predicted potential open reading frames to highlight translatable circRNAs. To understand the association between linear splicing and back-splicing, we also predicted the splicing events in the linear transcripts of each circRNA. We believe that CSCD, as the first comprehensive cancer-specific circRNA database, could contribute significantly to research on the function and regulation of cancer-associated circRNAs.
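The three groups reported above (cancer-specific, normal-only, and shared circRNAs) amount to a set partition over the circRNAs detected in cancer and normal samples. The following is a minimal sketch of that idea only, not the pipeline used to build CSCD. The function name `partition_circrnas`, the identifier format, and the example identifiers are hypothetical and chosen purely for illustration.

```python
def partition_circrnas(cancer_ids, normal_ids):
    """Split circRNA identifiers into cancer-specific, normal-only, and shared sets.

    Assumption: each circRNA is represented by a hashable identifier,
    such as a back-splice junction string. This format is hypothetical.
    """
    cancer = set(cancer_ids)
    normal = set(normal_ids)
    return {
        "cancer_specific": cancer - normal,  # detected in cancer samples only
        "normal_only": normal - cancer,      # detected in normal samples only
        "shared": cancer & normal,           # detected in both
    }


# Made-up identifiers (chrom:start-end:strand), for illustration only.
cancer_samples = ["chr1:100-500:+", "chr2:300-900:-", "chr3:50-400:+"]
normal_samples = ["chr2:300-900:-", "chr4:10-200:+"]

groups = partition_circrnas(cancer_samples, normal_samples)
for name, ids in groups.items():
    print(name, sorted(ids))
```

Under this sketch, the normal-only and shared sets together would play the role of the non-tumor background mentioned above.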
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.