Bayesian networks elucidate complex genomic landscapes in cancer

Nicos Angelopoulos; Aikaterini Chatzipli; Jyoti Nangalia; Francesco Maura; Peter J Campbell

doi:10.1038/s42003-022-03243-w

Commun Biol. 2022 Apr 4;5(1):306. doi: 10.1038/s42003-022-03243-w.

Authors

Nicos Angelopoulos¹,², Aikaterini Chatzipli³, Jyoti Nangalia³, Francesco Maura⁴, Peter J Campbell³

Affiliations

¹ The Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.
² Systems Immunity Research Institute, Medical School, Cardiff University, Cardiff, CF14 4XN, UK.
³ The Cancer, Ageing and Somatic Mutation Programme, Wellcome Sanger Institute, Hinxton, Cambridgeshire, CB10 1SA, UK.
⁴ Myeloma Program, Sylvester Comprehensive Cancer Center, University of Miami, Miami, FL, USA.

Abstract

Bayesian networks (BNs) are disciplined, explainable Artificial Intelligence models that can describe structured joint probability spaces. In the context of understanding complex relations between a number of variables in biological settings, they can be constructed from observed data and can provide a guiding, graphical tool in exploring such relations. Here we propose BNs for elucidating the relations between driver events in large cancer genomic datasets. We present a methodology that is specifically tailored to biologists and clinicians, as they are the main producers of such datasets. We achieve this by using an optimal BN learning algorithm based on well-established likelihood functions and by utilising just two tuning parameters, both of which are easy to set and have intuitive readings. To enhance value to clinicians, we introduce (a) the use of heatmaps for families in each network, and (b) visualising pairwise co-occurrence statistics on the network. For binary data, an optional step of fitting logic gates can be employed. We show how our methodology enhances pairwise testing and how biologists and clinicians can use BNs for discussing the main relations among driver events in large genomic cohorts. We demonstrate the utility of our methodology by applying it to five cancer datasets, revealing complex genomic landscapes. Our networks identify central patterns in all datasets, including a central 4-way mutual exclusivity between HDR, t(4,14), t(11,14) and t(14,16) in myeloma, and a 3-way mutual exclusivity of three major players: CALR, JAK2 and MPL, in myeloproliferative neoplasms. These analyses demonstrate that our methodology can play a central role in the study of large genomic cancer datasets.
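
The pairwise co-occurrence statistics mentioned in the abstract can be illustrated with a minimal sketch for two binary driver events. This is not the authors' software: the function names, the 0.5 continuity correction and the choice of a one-sided Fisher exact test are assumptions made here for illustration only.

```python
from math import comb, log


def contingency(a, b):
    """Return (n11, n10, n01, n00) for two equal-length binary sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = len(a) - n11 - n10 - n01
    return n11, n10, n01, n00


def log_odds_ratio(n11, n10, n01, n00):
    """Log odds ratio with 0.5 added to every cell (an assumed correction).

    Negative values suggest mutual exclusivity, positive values co-occurrence.
    """
    return log(((n11 + 0.5) * (n00 + 0.5)) / ((n10 + 0.5) * (n01 + 0.5)))


def fisher_exclusivity_p(n11, n10, n01, n00):
    """One-sided Fisher exact p-value for seeing n11 or fewer co-occurrences."""
    row1 = n11 + n10              # samples carrying event A
    col1 = n11 + n01              # samples carrying event B
    n = n11 + n10 + n01 + n00     # cohort size
    lowest = max(0, row1 + col1 - n)
    total = comb(n, col1)
    # Sum hypergeometric probabilities over the lower tail.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(lowest, n11 + 1))
    return tail / total


# Hypothetical toy cohort of eight samples with two events that never co-occur.
event_a = [1, 1, 1, 0, 0, 0, 0, 0]
event_b = [0, 0, 0, 1, 1, 1, 0, 0]
table = contingency(event_a, event_b)
lor = log_odds_ratio(*table)
p_value = fisher_exclusivity_p(*table)
```

In a network display, a statistic such as `lor` could be mapped to the colour or width of a line drawn between two nodes; how the paper itself encodes these values is not specified in this record.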

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Algorithms
Artificial Intelligence*
Bayes Theorem
Genomics
Humans
Neoplasms* / genetics

Grants and funding

WT_/Wellcome Trust/United Kingdom