Multi-cancer early detection remains a key challenge in cell-free DNA (cfDNA)-based liquid biopsy. Here, we perform cfDNA whole-genome sequencing to generate two test datasets covering 2125 patient samples of 9 cancer types and 1241 normal control samples, and also a reference dataset for background variant filtering based on 20,529 low-depth healthy samples. An external cfDNA dataset consisting of 208 cancer and 214 normal control samples is used for additional evaluation. Accuracy for cancer detection and tissue-of-origin localization is achieved using our algorithm, which incorporates cancer type-specific profiles of mutation distribution and chromatin organization in tumor tissues as model references. Our integrative model detects early-stage cancers, including those of pancreatic origin, with high sensitivity that is comparable to that of late-stage detection. Model interpretation reveals the contribution of cancer type-specific genomic and epigenomic features. Our methodologies may lay the groundwork for accurate cfDNA-based cancer diagnosis, especially at early stages.
© 2023. The Author(s).