-
Dataset and Lessons Learned from the 2024 SaTML LLM Capture-the-Flag Competition
Authors:
Edoardo Debenedetti,
Javier Rando,
Daniel Paleka,
Silaghi Fineas Florin,
Dragos Albastroiu,
Niv Cohen,
Yuval Lemberg,
Reshmi Ghosh,
Rui Wen,
Ahmed Salem,
Giovanni Cherubin,
Santiago Zanella-Beguelin,
Robin Schmid,
Victor Klemm,
Takahiro Miki,
Chenhao Li,
Stefan Kraft,
Mario Fritz,
Florian Tramèr,
Sahar Abdelnabi,
Lea Schönherr
Abstract:
Large language model systems face important security risks from maliciously crafted messages that aim to overwrite the system's original instructions or leak private data. To study this problem, we organized a capture-the-flag competition at IEEE SaTML 2024, where the flag is a secret string in the LLM system prompt. The competition was organized in two phases. In the first phase, teams developed defenses to prevent the model from leaking the secret. During the second phase, teams were challenged to extract the secrets hidden for defenses proposed by the other teams. This report summarizes the main insights from the competition. Notably, we found that all defenses were bypassed at least once, highlighting the difficulty of designing a successful defense and the necessity for additional research to protect LLM systems. To foster future research in this direction, we compiled a dataset with over 137k multi-turn attack chats and open-sourced the platform.
Submitted 12 June, 2024;
originally announced June 2024.
-
Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models
Authors:
Sarah J. Zhang,
Samuel Florin,
Ariel N. Lee,
Eamon Niknafs,
Andrei Marginean,
Annie Wang,
Keith Tyser,
Zad Chin,
Yann Hicke,
Nikhil Singh,
Madeleine Udell,
Yoon Kim,
Tonio Buonassisi,
Armando Solar-Lezama,
Iddo Drori
Abstract:
We curate a comprehensive dataset of 4,550 questions and solutions from problem sets, midterm exams, and final exams across all MIT Mathematics and Electrical Engineering and Computer Science (EECS) courses required for obtaining a degree. We evaluate the ability of large language models to fulfill the graduation requirements for any MIT major in Mathematics and EECS. Our results demonstrate that GPT-3.5 successfully solves a third of the entire MIT curriculum, while GPT-4, with prompt engineering, achieves a perfect solve rate on a test set excluding questions based on images. We fine-tune an open-source large language model on this dataset. We employ GPT-4 to automatically grade model responses, providing a detailed performance breakdown by course, question, and answer type. By embedding questions in a low-dimensional space, we explore the relationships between questions, topics, and classes and discover which questions and classes are required for solving other questions and classes through few-shot learning. Our analysis offers valuable insights into course prerequisites and curriculum design, highlighting language models' potential for learning and improving Mathematics and EECS education.
Submitted 24 June, 2023; v1 submitted 15 June, 2023;
originally announced June 2023.
-
On the binary adder channel with complete feedback, with an application to quantitative group testing
Authors:
Samuel H. Florin,
Matthew H. Ho,
Zilin Jiang
Abstract:
We determine the exact value of the optimal symmetric rate point $(r, r)$ in the Dueck zero-error capacity region of the binary adder channel with complete feedback. We prove that the average zero-error capacity is $r = h(1/2-\delta) \approx 0.78974$, where $h(\cdot)$ is the binary entropy function and $\delta = 1/(2\log_2(2+\sqrt{3}))$. Our motivation is a problem in quantitative group testing. Given a set of $n$ elements, two of which are defective, the quantitative group testing problem asks for the identification of these two defectives through a series of tests. Each test gives the number of defectives contained in the tested subset, and the outcomes of previous tests are assumed known at the time of designing the current test. We establish that the minimum number of tests is asymptotic to $(\log_2 n) / r$ as $n \to \infty$.
Submitted 28 December, 2021; v1 submitted 25 January, 2021;
originally announced January 2021.
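The closed-form rate quoted in the binary adder channel abstract above can be checked numerically. The following is a minimal sketch, not code from the paper: the function names (`binary_entropy`, `symmetric_rate`, `asymptotic_tests`) are hypothetical, and `asymptotic_tests` evaluates only the leading-order term $(\log_2 n)/r$, which is an asymptotic statement and not an exact test count for finite $n$.

```python
import math


def binary_entropy(p: float) -> float:
    """Binary entropy h(p) in bits, with h(0) = h(1) = 0 by convention."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def symmetric_rate() -> float:
    """Evaluate r = h(1/2 - delta) with delta = 1 / (2 * log2(2 + sqrt(3)))."""
    delta = 1 / (2 * math.log2(2 + math.sqrt(3)))
    return binary_entropy(0.5 - delta)


def asymptotic_tests(n: int) -> float:
    """Leading-order term (log2 n) / r for the number of tests.

    This is only the asymptotic expression from the abstract; it ignores
    lower-order terms and is not an exact count for any finite n.
    """
    return math.log2(n) / symmetric_rate()


if __name__ == "__main__":
    r = symmetric_rate()
    print(f"r = {r:.5f}")
    print(f"n = 10**6: leading-order estimate {asymptotic_tests(10**6):.1f} tests")
```

Running the script prints the numerical value of $r$, which can be compared against the rounded figure stated in the abstract.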