Showing 1–8 of 8 results for author: Poelitz, C

Searching in archive cs.
  1. arXiv:2408.08781  [pdf, other]

    cs.AI cs.CL

    Evaluating the Evaluator: Measuring LLMs' Adherence to Task Evaluation Instructions

    Authors: Bhuvanashree Murugadoss, Christian Poelitz, Ian Drosos, Vu Le, Nick McKenna, Carina Suzana Negreanu, Chris Parnin, Advait Sarkar

    Abstract: LLMs-as-a-judge is a recently popularized method which replaces human judgements in task evaluation (Zheng et al. 2024) with automatic evaluation using LLMs. Due to widespread use of RLHF (Reinforcement Learning from Human Feedback), state-of-the-art LLMs like GPT4 and Llama3 are expected to have strong alignment with human preferences when prompted for a quality judgement, such as the coherence o…

    Submitted 16 August, 2024; originally announced August 2024.

  2. arXiv:2406.12692  [pdf, other]

    cs.CL cs.AI cs.DB cs.HC

    MAGIC: Generating Self-Correction Guideline for In-Context Text-to-SQL

    Authors: Arian Askari, Christian Poelitz, Xinye Tang

    Abstract: Self-correction in text-to-SQL is the process of prompting large language model (LLM) to revise its previously incorrectly generated SQL, and commonly relies on manually crafted self-correction guidelines by human experts that are not only labor-intensive to produce but also limited by the human ability in identifying all potential error patterns in LLM responses. We introduce MAGIC, a novel multi…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 20 pages, 17 figures

  3. arXiv:2402.11734  [pdf, other]

    cs.PL cs.AI cs.SE

    Solving Data-centric Tasks using Large Language Models

    Authors: Shraddha Barke, Christian Poelitz, Carina Suzana Negreanu, Benjamin Zorn, José Cambronero, Andrew D. Gordon, Vu Le, Elnaz Nouri, Nadia Polikarpova, Advait Sarkar, Brian Slininger, Neil Toronto, Jack Williams

    Abstract: Large language models (LLMs) are rapidly replacing help forums like StackOverflow, and are especially helpful for non-professional programmers and end users. These users are often interested in data-centric tasks, such as spreadsheet manipulation and data wrangling, which are hard to solve if the intent is only communicated using a natural-language description, without including the data. But how…

    Submitted 24 March, 2024; v1 submitted 18 February, 2024; originally announced February 2024.

    Comments: Paper accepted to NAACL 2024 (Findings)

  4. arXiv:2310.14495  [pdf, other]

    cs.CL cs.AI

    InstructExcel: A Benchmark for Natural Language Instruction in Excel

    Authors: Justin Payan, Swaroop Mishra, Mukul Singh, Carina Negreanu, Christian Poelitz, Chitta Baral, Subhro Roy, Rasika Chakravarthy, Benjamin Van Durme, Elnaz Nouri

    Abstract: With the evolution of Large Language Models (LLMs) we can solve increasingly more complex NLP tasks across various domains, including spreadsheets. This work investigates whether LLMs can generate code (Excel OfficeScripts, a TypeScript API for executing many tasks in Excel) that solves Excel specific tasks provided via natural language user instructions. To do so we introduce a new large-scale be…

    Submitted 22 October, 2023; originally announced October 2023.

    Comments: Findings of EMNLP 2023, 18 pages

  5. arXiv:2208.06213  [pdf, other]

    cs.HC cs.AI cs.PL

    What is it like to program with artificial intelligence?

    Authors: Advait Sarkar, Andrew D. Gordon, Carina Negreanu, Christian Poelitz, Sruti Srinivasa Ragavan, Ben Zorn

    Abstract: Large language models, such as OpenAI's codex and Deepmind's AlphaCode, can generate code to solve a variety of problems expressed in natural language. This technology has already been commercialised in at least one widely-used programming editor extension: GitHub Copilot. In this paper, we explore how programming with large language models (LLM-assisted programming) is similar to, and differs f…

    Submitted 17 October, 2022; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: Proceedings of the 33rd Annual Conference of the Psychology of Programming Interest Group (PPIG 2022)

    ACM Class: D.2.3; D.2.6; I.2.5; I.2.7; H.5.2

  6. arXiv:1706.10231  [pdf, ps, other]

    cs.IR cs.LG

    Improving Session Recommendation with Recurrent Neural Networks by Exploiting Dwell Time

    Authors: Alexander Dallmann, Alexander Grimm, Christian Pölitz, Daniel Zoller, Andreas Hotho

    Abstract: Recently, Recurrent Neural Networks (RNNs) have been applied to the task of session-based recommendation. These approaches use RNNs to predict the next item in a user session based on the previously visited items. While some approaches consider additional item properties, we argue that item dwell time can be used as an implicit measure of user interest to improve session-based item recommendat…

    Submitted 30 June, 2017; originally announced June 2017.

    Comments: 6 pages, 3 figures, submission to DLRS workshop

  7. arXiv:1705.07425  [pdf, other]

    cs.CL cs.LG

    Learning Semantic Relatedness From Human Feedback Using Metric Learning

    Authors: Thomas Niebler, Martin Becker, Christian Pölitz, Andreas Hotho

    Abstract: Assessing the degree of semantic relatedness between words is an important task with a variety of semantic applications, such as ontology learning for the Semantic Web, semantic search or query expansion. To accomplish this in an automated fashion, many relatedness measures have been proposed. However, most of these metrics only encode information contained in the underlying corpus and thus do not…

    Submitted 24 May, 2017; v1 submitted 21 May, 2017; originally announced May 2017.

    Comments: Under review at ISWC 2017

  8. arXiv:1606.05110  [pdf, other]

    stat.ML cs.CY

    Machine Learning meets Data-Driven Journalism: Boosting International Understanding and Transparency in News Coverage

    Authors: Elena Erdmann, Karin Boczek, Lars Koppers, Gerret von Nordheim, Christian Pölitz, Alejandro Molina, Katharina Morik, Henrik Müller, Jörg Rahnenführer, Kristian Kersting

    Abstract: Migration crisis, climate change or tax havens: Global challenges need global solutions. But agreeing on a joint approach is difficult without a common ground for discussion. Public spheres are highly segmented because news are mainly produced and received on a national level. Gaining a global view on international debates about important issues is hindered by the enormous quantity of news and b…

    Submitted 16 June, 2016; originally announced June 2016.

    Comments: Presented at 2016 ICML Workshop on #Data4Good: Machine Learning in Social Good Applications, New York, NY