Amyotrophic lateral sclerosis (ALS) is a devastating neurodegenerative disease with no effective treatments. Numerous RNA-binding proteins (RBPs) have been shown to be altered in ALS: mutations in 11 RBPs cause familial forms of the disease, and 6 more RBPs show abnormal expression or distribution in ALS without any known mutations. RBP dysregulation is widely accepted as a contributing factor in ALS pathobiology. There are at least 1542 RBPs in the human genome; therefore, other, as yet unidentified RBPs may also be linked to the pathogenesis of ALS. We used IBM Watson® to sift through all RBPs in the genome and identify new RBPs linked to ALS (ALS-RBPs). IBM Watson extracts features from published literature to compute semantic similarities and identify new connections between entities of interest. Here, IBM Watson analyzed all published abstracts of previously known ALS-RBPs and applied that text-based knowledge to all RBPs in the genome, ranking them by semantic similarity to the known set. We then validated the ten RBPs ranked highest by Watson at the protein and RNA levels in tissues from ALS patients and non-neurological disease controls, as well as in patient-derived induced pluripotent stem cells. Five RBPs previously unlinked to ALS (hnRNPU, Syncrip, RBMS3, Caprin-1 and NUPL2) showed significant alterations in ALS compared to controls. Overall, we successfully used IBM Watson to help identify additional RBPs altered in ALS, highlighting the potential of artificial intelligence tools to accelerate scientific discovery in ALS and possibly other complex neurological disorders.
Keywords: Amyotrophic lateral sclerosis; Artificial intelligence; Motor neuron; Protein aggregation; RNA-binding protein.
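As an illustration of the ranking step described in the abstract (scoring candidate RBPs by the semantic similarity of their literature to that of the known ALS-RBPs), the following minimal sketch may help. It is not IBM Watson's method, which the abstract does not detail; it assumes a plain TF-IDF bag-of-words model and cosine similarity to the centroid of the known set. All function names and the toy texts are hypothetical placeholders, not real abstracts or results.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Map each document name to a sparse {term: tf-idf weight} vector."""
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(docs)
    doc_freq = Counter()
    for term_counts in counts.values():
        doc_freq.update(term_counts.keys())
    # Smoothed inverse document frequency (an assumed weighting choice).
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1.0 for t, d in doc_freq.items()}
    return {
        name: {t: tf * idf[t] for t, tf in term_counts.items()}
        for name, term_counts in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def centroid(vectors):
    """Average a non-empty list of sparse vectors."""
    if not vectors:
        raise ValueError("at least one known entity is required")
    total = Counter()
    for vec in vectors:
        for term, weight in vec.items():
            total[term] += weight
    return {term: weight / len(vectors) for term, weight in total.items()}


def rank_candidates(known, candidates):
    """Rank candidates by similarity to the centroid of the known set.

    `known` and `candidates` map entity names to literature text and are
    assumed to have disjoint names.
    """
    vecs = tfidf_vectors({**known, **candidates})
    reference = centroid([vecs[name] for name in known])
    scores = {name: cosine(vecs[name], reference) for name in candidates}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy placeholder texts (hypothetical, for illustration only).
known = {
    "known_1": "rna binding protein aggregation in motor neurons",
    "known_2": "rna splicing and stress granules in motor neurons",
}
candidates = {
    "cand_a": "rna binding and stress granules in neurons",
    "cand_b": "cell wall synthesis in yeast",
}
ranked = rank_candidates(known, candidates)
```

In this toy input, `cand_a` shares most of its vocabulary with the known set and therefore ranks above `cand_b`. A real pipeline would operate on full abstracts and would likely use richer semantic features than raw term overlap.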