Showing 1–1 of 1 results for author: Knap, O

  1. arXiv:2407.11068  [pdf, other]

    cs.AI cs.CL

    Show, Don't Tell: Evaluating Large Language Models Beyond Textual Understanding with ChildPlay

    Authors: Gonçalo Hora de Carvalho, Oscar Knap, Robert Pollice

    Abstract: We explore the hypothesis that LLMs, such as GPT-3.5 and GPT-4, possess broader cognitive functions, particularly in non-linguistic domains. Our approach extends beyond standard linguistic benchmarks by incorporating games like Tic-Tac-Toe, Connect Four, and Battleship, encoded via ASCII, to assess strategic thinking and decision-making. To evaluate the models' ability to generalize beyond their t…

    Submitted 18 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.