Scientific discovery in the age of artificial intelligence

Hanchen Wang; Tianfan Fu; Yuanqi Du; Wenhao Gao; Kexin Huang; Ziming Liu; Payal Chandak; Shengchao Liu; Peter Van Katwyk; Andreea Deac; Anima Anandkumar; Karianne Bergen; Carla P Gomes; Shirley Ho; Pushmeet Kohli; Joan Lasenby; Jure Leskovec; Tie-Yan Liu; Arjun Manrai; Debora Marks; Bharath Ramsundar; Le Song; Jimeng Sun; Jian Tang; Petar Veličković; Max Welling; Linfeng Zhang; Connor W Coley; Yoshua Bengio; Marinka Zitnik

doi:10.1038/s41586-023-06221-2

Nature. 2023 Aug;620(7972):47-60. doi: 10.1038/s41586-023-06221-2. Epub 2023 Aug 2.

Authors

Hanchen Wang^#¹,²,³,⁴, Tianfan Fu^#⁵, Yuanqi Du^#⁶, Wenhao Gao⁷, Kexin Huang⁴, Ziming Liu⁸, Payal Chandak⁹, Shengchao Liu¹⁰,¹¹, Peter Van Katwyk¹²,¹³, Andreea Deac¹⁰,¹¹, Anima Anandkumar²,¹⁴, Karianne Bergen¹²,¹³, Carla P Gomes⁶, Shirley Ho¹⁵,¹⁶,¹⁷,¹⁸, Pushmeet Kohli¹⁹, Joan Lasenby¹, Jure Leskovec⁴, Tie-Yan Liu²⁰, Arjun Manrai²¹, Debora Marks²²,²³, Bharath Ramsundar²⁴, Le Song²⁵,²⁶, Jimeng Sun²⁷, Jian Tang¹⁰,²⁸,²⁹, Petar Veličković¹⁹,³⁰, Max Welling³¹,³², Linfeng Zhang³³,³⁴, Connor W Coley⁷,³⁵, Yoshua Bengio¹⁰,¹¹, Marinka Zitnik³⁶,³⁷,³⁸,³⁹

Affiliations

¹ Department of Engineering, University of Cambridge, Cambridge, UK.
² Department of Computing and Mathematical Sciences, California Institute of Technology, Pasadena, CA, USA.
³ Department of Research and Early Development, Genentech Inc, South San Francisco, CA, USA.
⁴ Department of Computer Science, Stanford University, Stanford, CA, USA.
⁵ Department of Computational Science and Engineering, Georgia Institute of Technology, Atlanta, GA, USA.
⁶ Department of Computer Science, Cornell University, Ithaca, NY, USA.
⁷ Department of Chemical Engineering, Massachusetts Institute of Technology, Cambridge, MA, USA.
⁸ Department of Physics, Massachusetts Institute of Technology, Cambridge, MA, USA.
⁹ Harvard-MIT Program in Health Sciences and Technology, Cambridge, MA, USA.
¹⁰ Mila - Quebec AI Institute, Montreal, Quebec, Canada.
¹¹ Université de Montréal, Montreal, Quebec, Canada.
¹² Department of Earth, Environmental and Planetary Sciences, Brown University, Providence, RI, USA.
¹³ Data Science Institute, Brown University, Providence, RI, USA.
¹⁴ NVIDIA, Santa Clara, CA, USA.
¹⁵ Center for Computational Astrophysics, Flatiron Institute, New York, NY, USA.
¹⁶ Department of Astrophysical Sciences, Princeton University, Princeton, NJ, USA.
¹⁷ Department of Physics, Carnegie Mellon University, Pittsburgh, PA, USA.
¹⁸ Department of Physics and Center for Data Science, New York University, New York, NY, USA.
¹⁹ Google DeepMind, London, UK.
²⁰ Microsoft Research, Beijing, China.
²¹ Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
²² Department of Systems Biology, Harvard Medical School, Boston, MA, USA.
²³ Broad Institute of MIT and Harvard, Cambridge, MA, USA.
²⁴ Deep Forest Sciences, Palo Alto, CA, USA.
²⁵ BioMap, Beijing, China.
²⁶ Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, United Arab Emirates.
²⁷ University of Illinois at Urbana-Champaign, Champaign, IL, USA.
²⁸ HEC Montréal, Montreal, Quebec, Canada.
²⁹ CIFAR AI Chair, Toronto, Ontario, Canada.
³⁰ Department of Computer Science and Technology, University of Cambridge, Cambridge, UK.
³¹ University of Amsterdam, Amsterdam, Netherlands.
³² Microsoft Research Amsterdam, Amsterdam, Netherlands.
³³ DP Technology, Beijing, China.
³⁴ AI for Science Institute, Beijing, China.
³⁵ Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, USA.
³⁶ Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA.
³⁷ Broad Institute of MIT and Harvard, Cambridge, MA, USA.
³⁸ Harvard Data Science Initiative, Cambridge, MA, USA.
³⁹ Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University, Cambridge, MA, USA.

^# Contributed equally.

PMID: 37532811
DOI: 10.1038/s41586-023-06221-2

Abstract

Artificial intelligence (AI) is being increasingly integrated into scientific discovery to augment and accelerate research, helping scientists to generate hypotheses, design experiments, collect and interpret large datasets, and gain insights that might not have been possible using traditional scientific methods alone. Here we examine breakthroughs over the past decade that include self-supervised learning, which allows models to be trained on vast amounts of unlabelled data, and geometric deep learning, which leverages knowledge about the structure of scientific data to enhance model accuracy and efficiency. Generative AI methods can create designs, such as small-molecule drugs and proteins, by analysing diverse data modalities, including images and sequences. We discuss how these methods can help scientists throughout the scientific process and the central issues that remain despite such advances. Both developers and users of AI tools need a better understanding of when such approaches need improvement, and challenges posed by poor data quality and stewardship remain. These issues cut across scientific disciplines and require developing foundational algorithmic approaches that can contribute to scientific understanding or acquire it autonomously, making them critical areas of focus for AI innovation.

Publication types

Review

MeSH terms

Artificial Intelligence* / standards
Artificial Intelligence* / trends
Datasets as Topic
Deep Learning
Research Design* / standards
Research Design* / trends
Unsupervised Machine Learning