Showing 1–1 of 1 results for author: Aung, J

  1. arXiv:2407.11789  [pdf, other]

    cs.CL cs.AI cs.CY

    Large Language Models as Misleading Assistants in Conversation

    Authors: Betty Li Hou, Kejian Shi, Jason Phang, James Aung, Steven Adler, Rosie Campbell

    Abstract: Large Language Models (LLMs) are able to provide assistance on a wide range of information-seeking tasks. However, model outputs may be misleading, whether unintentionally or in cases of intentional deception. We investigate the ability of LLMs to be deceptive in the context of providing assistance on a reading comprehension task, using LLMs as proxies for human users. We compare outcomes of (1) w…

    Submitted 16 July, 2024; originally announced July 2024.

    Comments: Next Generation of AI Safety Workshop, 41st International Conference on Machine Learning (ICML 2024)