Spurious normativity enhances learning of compliance and enforcement behavior in artificial agents

Raphael Köster; Dylan Hadfield-Menell; Richard Everett; Laura Weidinger; Gillian K Hadfield; Joel Z Leibo

doi:10.1073/pnas.2106028118

Proc Natl Acad Sci U S A. 2022 Jan 18;119(3):e2106028118. doi: 10.1073/pnas.2106028118.

Authors

Raphael Köster¹, Dylan Hadfield-Menell²,³, Richard Everett⁴, Laura Weidinger⁴, Gillian K Hadfield³,⁵,⁶,⁷,⁸,⁹, Joel Z Leibo¹

Affiliations

¹ DeepMind, London EC4A 3TW, United Kingdom.
² Computer Science and Artificial Intelligence Laboratory, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA 02139.
³ Center for Human-Compatible AI, University of California, Berkeley, CA 94720.
⁴ DeepMind, London EC4A 3TW, United Kingdom.
⁵ Faculty of Law, University of Toronto, Toronto, ON M5S 3E6, Canada.
⁶ Rotman School of Management, University of Toronto, Toronto, ON M5S 3E6, Canada.
⁷ Schwartz Reisman Institute for Technology and Society, University of Toronto, Toronto, ON M5G 1L7, Canada.
⁸ Vector Institute for Artificial Intelligence, Toronto, ON M5G 1M1, Canada.
⁹ OpenAI, San Francisco, CA 94110.

Abstract

How do societies learn and maintain social norms? Here we use multiagent reinforcement learning to investigate the learning dynamics of enforcement and compliance behaviors. Artificial agents populate a foraging environment and need to learn to avoid a poisonous berry. Agents learn to avoid eating poisonous berries better when doing so is taboo, meaning the behavior is punished by other agents. The taboo helps overcome a credit assignment problem in discovering delayed health effects. Critically, introducing an additional taboo, which results in punishment for eating a harmless berry, further improves overall returns. This "silly rule" counterintuitively has a positive effect because it gives agents more practice in learning rule enforcement. By probing what individual agents have learned, we demonstrate that normative behavior relies on a sequence of learned skills. Learning rule compliance builds upon prior learning of rule enforcement by other agents. Our results highlight the benefit of employing a multiagent reinforcement learning computational model focused on learning to implement complex actions.
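The credit assignment problem mentioned in the abstract can be illustrated with a toy sketch. This is not the authors' environment or agent architecture: the two-action setup, the function name `estimate_values`, and every number (delay, poison cost, punishment size) are assumptions chosen for illustration only. A naive learner credits each reward to the action taken on the same step. When the poison's cost arrives many steps later, it is spread evenly across both actions and the learner sees no difference between berries; when an immediate punishment (the taboo) is added, the difference becomes visible. The "silly rule" effect is not modeled here.

```python
import random
from collections import deque

# All constants below are illustrative assumptions, not values from the paper.
FOOD_REWARD = 1.0   # immediate reward for eating any berry
POISON_COST = 2.0   # delayed health cost of the poisonous berry
PUNISHMENT = 1.0    # immediate cost imposed by other agents under a taboo


def estimate_values(taboo: bool, delay: int = 20, steps: int = 20000,
                    seed: int = 0) -> dict:
    """Estimate per-action value by crediting each reward to the same-step action."""
    rng = random.Random(seed)
    actions = ["safe_berry", "poison_berry"]
    value = {a: 0.0 for a in actions}
    count = {a: 0 for a in actions}
    pending = deque()  # time steps at which a delayed poison cost lands

    for t in range(steps):
        action = rng.choice(actions)  # uniform random exploration
        reward = FOOD_REWARD
        if action == "poison_berry":
            pending.append(t + delay)   # health cost arrives later
            if taboo:
                reward -= PUNISHMENT    # punishment arrives immediately
        # A delayed cost lands now, whatever the current action happens to be.
        while pending and pending[0] <= t:
            pending.popleft()
            reward -= POISON_COST
        # Running-mean update for the action taken on this step.
        count[action] += 1
        value[action] += (reward - value[action]) / count[action]
    return value


if __name__ == "__main__":
    for taboo in (False, True):
        v = estimate_values(taboo=taboo)
        gap = v["safe_berry"] - v["poison_berry"]
        print(f"taboo={taboo}: estimated advantage of safe berry = {gap:.2f}")
```

In this sketch the estimated advantage of the safe berry stays near zero without the taboo and rises to roughly the size of the punishment with it, which is the sense in which an immediate social sanction can stand in for a delayed consequence.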

Keywords: cultural evolution; multiagent reinforcement learning; norms; social norms; third-party punishment.

MeSH terms

Environment
Humans
Learning*
Reinforcement, Psychology*
Social Norms*