Privacy-Preserving Artificial Intelligence Techniques in Biomedicine

Methods Inf Med. 2022 Jun;61(S 01):e12-e27. doi: 10.1055/s-0041-1740630. Epub 2022 Jan 21.

Abstract

Background: Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems.

Objectives: However, training an AI model on sensitive data raises concerns about the privacy of individual participants. For example, summary statistics of a genome-wide association study can be used to determine the presence or absence of an individual in a given dataset. This considerable privacy risk has led to restrictions on access to genomic and other biomedical data, which is detrimental to collaborative research and impedes scientific progress. Hence, there has been a substantial effort to develop AI methods that can learn from sensitive data while protecting individuals' privacy.

Method: This paper provides a structured overview of recent advances in privacy-preserving AI techniques in biomedicine. It places the most important state-of-the-art approaches within a unified taxonomy and discusses their strengths, limitations, and open problems.

Conclusion: As the most promising direction, we suggest combining federated machine learning, as a more scalable approach, with additional privacy-preserving techniques. This would merge their advantages and provide privacy guarantees in a distributed way for biomedical applications. Nonetheless, more research is necessary, as hybrid approaches pose new challenges such as additional network or computation overhead.

MeSH terms

  • Artificial Intelligence
  • Decision Support Systems, Clinical*
  • Genome-Wide Association Study
  • Humans
  • Machine Learning
  • Privacy*

Grants and funding

The FeatureCloud project has received funding from the European Union's Horizon 2020 research and innovation program under grant agreement No. 826078. This publication reflects only the authors' view and the European Commission is not responsible for any use that may be made of the information it contains. The work of J.B. and T.K. was also supported by the Horizon 2020 project REPO-TRIAL (No. 777111). M.L., T.K., and J.B. have further been supported by BMBF project Sys_CARE (01ZX1908A). M.L. and J.B. were also supported by BMBF project SyMBoD (01ZX1910D). J.B.'s contribution was also supported by his VILLUM Young Investigator grant (nr. 13154).