Showing 1–1 of 1 results for author: Skapars, A

  1. arXiv:2407.11059

    cs.CR cs.AI cs.CL cs.LG

    Was it Slander? Towards Exact Inversion of Generative Language Models

    Authors: Adrians Skapars, Edoardo Manino, Youcheng Sun, Lucas C. Cordeiro

    Abstract: Training large language models (LLMs) requires a substantial investment of time and money. To get a good return on investment, the developers spend considerable effort ensuring that the model never produces harmful and offensive outputs. However, bad-faith actors may still try to slander the reputation of an LLM by publicly reporting a forged output. In this paper, we show that defending against s…

    Submitted 10 July, 2024; originally announced July 2024.

    Comments: 4 pages, 3 figures