Showing 1–1 of 1 results for author: Tailor, P

  1. arXiv:2407.12872  [pdf, other]

    cs.CL cs.LG

    Evaluating Large Language Models with fmeval

    Authors: Pola Schwöbel, Luca Franceschi, Muhammad Bilal Zafar, Keerthan Vasist, Aman Malhotra, Tomer Shenhar, Pinal Tailor, Pinar Yilmaz, Michael Diamond, Michele Donini

    Abstract: fmeval is an open source library to evaluate large language models (LLMs) in a range of tasks. It helps practitioners evaluate their model for task performance and along multiple responsible AI dimensions. This paper presents the library and exposes its underlying design principles: simplicity, coverage, extensibility and performance. We then present how these were implemented in the scientific an…

    Submitted 15 July, 2024; originally announced July 2024.