Motivation: The increase in publication rates makes it challenging for an individual researcher to stay abreast of all relevant research in order to find novel research hypotheses. Literature-based discovery methods make use of knowledge graphs built using text mining and can infer future associations between biomedical concepts that will likely occur in new publications. These predictions are a valuable resource for researchers to explore a research topic. Current methods for prediction are based on the local structure of the knowledge graph. A method that uses global knowledge from across the knowledge graph needs to be developed in order to make knowledge discovery a frequently used tool by researchers.
Results: We propose an approach based on the singular value decomposition (SVD) that is able to combine data from across the knowledge graph through a reduced representation. Using cooccurrence data extracted from published literature, we show that SVD performs better than the leading methods for scoring discoveries. We also show the diminishing predictive power of knowledge discovery as we compare our predictions with real associations that appear further into the future. Finally, we examine the strengths and weaknesses of the SVD approach against another well-performing system using several predicted associations.
Availability and implementation: All code and results files for this analysis can be accessed at https://github.com/jakelever/knowledgediscovery.
Contact: [email protected].
Supplementary information: Supplementary data are available at Bioinformatics online.
© The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: [email protected]