Objectives: In recent years, the rise of big data and artificial intelligence has driven rapid growth in the number of databases and web services available for biomedical research. cBioPortal is one of the most widely used platforms for accessing cancer genomic and clinical data. The primary objective of this study was to develop a tool that simplifies programmatic interaction with cBioPortal's web service.
Materials and methods: We developed the pyBioPortal Python package, which leverages the cBioPortal REST API to access genomic and clinical data. The retrieved data is returned as a Pandas DataFrame, a format widely used for data analysis in Python.
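The general pattern described here, querying a REST endpoint and returning the JSON response as a Pandas DataFrame, can be illustrated with a minimal sketch. This is not pyBioPortal's own interface: the function names `records_to_frame` and `fetch_studies` are hypothetical, the `/studies` path on the public cBioPortal instance is an assumption about the public REST API, and the sample records are made up for illustration only.

```python
import json
from urllib.request import Request, urlopen

import pandas as pd

# Assumed base URL of the public cBioPortal REST API (not taken from the paper).
BASE_URL = "https://www.cbioportal.org/api"


def records_to_frame(records):
    """Flatten a list of JSON objects into a DataFrame.

    Nested objects become dot-separated columns (e.g. "cancerType.name").
    """
    return pd.json_normalize(records)


def fetch_studies(base_url=BASE_URL, timeout=30):
    """Hypothetical helper: request the study list and return it as a DataFrame.

    Assumes the endpoint responds with a JSON array of objects.
    """
    request = Request(f"{base_url}/studies", headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return records_to_frame(json.load(response))


# Offline illustration with made-up records shaped like a JSON API response.
sample_records = [
    {"studyId": "study_a", "name": "Study A", "cancerType": {"name": "Breast"}},
    {"studyId": "study_b", "name": "Study B", "cancerType": {"name": "Lung"}},
]
sample_frame = records_to_frame(sample_records)
```

Keeping the network call separate from the JSON-to-DataFrame conversion means the conversion step can be checked without network access. Calling `fetch_studies()` would issue a live request, so it is left out of the sketch's top-level code.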
Results: pyBioPortal offers an efficient interface between the user and the cBioPortal database. The data is provided in formats suited to further analysis and visualization, which streamlines analysis workflows and improves reproducibility.
Discussion: The development of pyBioPortal addresses the challenge of accessing and processing large volumes of biomedical data. By simplifying interaction with the cBioPortal API and providing data in Pandas DataFrame format, pyBioPortal allows users to focus on analysis rather than on data extraction.
Conclusion: This tool facilitates the retrieval of heterogeneous biological and clinical data in a standardized format, making it more accessible for analysis and enhancing the reproducibility of results in cancer informatics. Distributed as an open-source project, pyBioPortal is available to the broader bioinformatics community, promoting collaboration and advancing research in cancer genomics.
Keywords: Python; bioinformatics; cBioPortal; cancer research.
© The Author(s) 2025. Published by Oxford University Press on behalf of the American Medical Informatics Association.