Rethink reporting of evaluation results in AI.
Burnell R, Schellaert W, Burden J, Ullman TD, Martinez-Plumed F, Tenenbaum JB, Rutar D, Cheke LG, Sohl-Dickstein J, Mitchell M, Kiela D, Shanahan M, Voorhees EM, Cohn AG, Leibo JZ, Hernandez-Orallo J.
Science. 2023 Apr 14;380(6641):136-138. doi: 10.1126/science.adf6369. Epub 2023 Apr 13.
PMID: 37053341
Free article.