
Showing 1–29 of 29 results for author: Lawrence, C

  1. arXiv:2404.06411  [pdf, other]

    cs.AI cs.CL

    AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents

    Authors: Luca Gioacchini, Giuseppe Siracusano, Davide Sanvito, Kiril Gashteovski, David Friede, Roberto Bifulco, Carolin Lawrence

    Abstract: The advances made by Large Language Models (LLMs) have led to the pursuit of LLM agents that can solve intricate, multi-step reasoning tasks. As with any research pursuit, benchmarking and evaluation are key cornerstones to efficient and reliable progress. However, existing benchmarks are often narrow and simply compute overall task success. To face these issues, we propose AgentQuest -- a framew…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: Accepted at the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT 2024)

  2. arXiv:2311.14966  [pdf, other]

    cs.CL

    Walking a Tightrope -- Evaluating Large Language Models in High-Risk Domains

    Authors: Chia-Chien Hung, Wiem Ben Rim, Lindsay Frost, Lars Bruckner, Carolin Lawrence

    Abstract: High-risk domains pose unique challenges that require language models to provide accurate and safe responses. Despite the great success of large language models (LLMs), such as ChatGPT and its variants, their performance in high-risk domains remains unclear. Our study delves into an in-depth analysis of the performance of instruction-tuned LLMs, focusing on factual accuracy and safety adherence. T…

    Submitted 25 November, 2023; originally announced November 2023.

    Comments: EMNLP 2023 Workshop on Benchmarking Generalisation in NLP (GenBench)

  3. arXiv:2310.14909  [pdf, other]

    cs.CL cs.AI cs.LG

    Linking Surface Facts to Large-Scale Knowledge Graphs

    Authors: Gorjan Radevski, Kiril Gashteovski, Chia-Chien Hung, Carolin Lawrence, Goran Glavaš

    Abstract: Open Information Extraction (OIE) methods extract facts from natural language text in the form of ("subject"; "relation"; "object") triples. These facts are, however, merely surface forms, the ambiguity of which impedes their downstream usage; e.g., the surface phrase "Michael Jordan" may refer to either the former basketball player or the university professor. Knowledge Graphs (KGs), on the other…

    Submitted 23 October, 2023; originally announced October 2023.

  4. arXiv:2310.07333  [pdf, ps, other]

    cs.GT math.NA

    Computing approximate roots of monotone functions

    Authors: Alexandros Hollender, Chester Lawrence, Erel Segal-Halevi

    Abstract: Given a function f: [a,b] -> R, if f(a) < 0 and f(b) > 0 and f is continuous, the Intermediate Value Theorem implies that f has a root in [a,b]. Moreover, given a value-oracle for f, an approximate root of f can be computed using the bisection method, and the number of required evaluations is polynomial in the number of accuracy digits. The goal of this note is to identify conditions under which th…

    Submitted 29 February, 2024; v1 submitted 11 October, 2023; originally announced October 2023.

    Comments: We solved all the open cases, except the case when f has 3 or more dimensions, and satisfies all monotonicity conditions except one. Any ideas?

  5. arXiv:2307.00524  [pdf, other]

    cs.CL

    Large Language Models Enable Few-Shot Clustering

    Authors: Vijay Viswanathan, Kiril Gashteovski, Carolin Lawrence, Tongshuang Wu, Graham Neubig

    Abstract: Unlike traditional unsupervised clustering, semi-supervised clustering allows users to provide meaningful structure to the data, which helps the clustering algorithm to match the user's intent. Existing approaches to semi-supervised clustering require a significant amount of feedback from an expert to improve the clusters. In this paper, we ask whether a large language model can amplify an expert'…

    Submitted 2 July, 2023; originally announced July 2023.

  6. Uncertainty Propagation in Node Classification

    Authors: Zhao Xu, Carolin Lawrence, Ammar Shaker, Raman Siarheyeu

    Abstract: Quantifying predictive uncertainty of neural networks has recently attracted increasing attention. In this work, we focus on measuring uncertainty of graph neural networks (GNNs) for the task of node classification. Most existing GNNs model message passing among nodes. The messages are often deterministic. Questions naturally arise: Does there exist uncertainty in the messages? How could we propag…

    Submitted 3 April, 2023; originally announced April 2023.

  7. arXiv:2212.05178  [pdf, ps, other]

    cs.LG

    State-Regularized Recurrent Neural Networks to Extract Automata and Explain Predictions

    Authors: Cheng Wang, Carolin Lawrence, Mathias Niepert

    Abstract: Recurrent neural networks are a widely used class of neural architectures. They have, however, two shortcomings. First, they are often treated as black-box models and as such it is difficult to understand what exactly they learn as well as how they arrive at a particular prediction. Second, they tend to work poorly on sequences requiring long-term memorization, despite having this capacity in prin…

    Submitted 9 December, 2022; originally announced December 2022.

    Comments: To appear at IEEE Transactions on Pattern Analysis and Machine Intelligence. The extended version of State-Regularized Recurrent Neural Networks [arXiv:1901.08817]

  8. arXiv:2212.00424  [pdf, other]

    cs.LG cs.AI stat.ML

    Multi-Source Survival Domain Adaptation

    Authors: Ammar Shaker, Carolin Lawrence

    Abstract: Survival analysis is the branch of statistics that studies the relation between the characteristics of living entities and their respective survival times, taking into account the partial information held by censored cases. A good analysis can, for example, determine whether one medical treatment for a group of patients is better than another. With the rise of machine learning, survival analysis c…

    Submitted 6 March, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

    Comments: 37th AAAI Conference on Artificial Intelligence, 2023. Includes Appendix

  9. arXiv:2208.11024  [pdf, other]

    cs.AI

    KGxBoard: Explainable and Interactive Leaderboard for Evaluation of Knowledge Graph Completion Models

    Authors: Haris Widjaja, Kiril Gashteovski, Wiem Ben Rim, Pengfei Liu, Christopher Malon, Daniel Ruffinelli, Carolin Lawrence, Graham Neubig

    Abstract: Knowledge Graphs (KGs) store information in the form of (head, predicate, tail)-triples. To augment KGs with new knowledge, researchers proposed models for KG Completion (KGC) tasks such as link prediction; i.e., answering (h; p; ?) or (?; p; t) queries. Such models are usually evaluated with averaged metrics on a held-out test set. While useful for tracking progress, averaged single-score metrics…

    Submitted 23 August, 2022; originally announced August 2022.

  10. arXiv:2207.04447  [pdf, other]

    cs.CL

    Human-Centric Research for NLP: Towards a Definition and Guiding Questions

    Authors: Bhushan Kotnis, Kiril Gashteovski, Julia Gastinger, Giuseppe Serra, Francesco Alesiani, Timo Sztyler, Ammar Shaker, Na Gong, Carolin Lawrence, Zhao Xu

    Abstract: With Human-Centric Research (HCR) we can steer research activities so that the research outcome is beneficial for human stakeholders, such as end users. But what exactly makes research human-centric? We address this question by providing a working definition and define how a research pipeline can be split into different stages in which human-centric components can be added. Additionally, we discus…

    Submitted 10 July, 2022; originally announced July 2022.

  11. arXiv:2205.12749  [pdf, other]

    cs.AI cs.HC

    A Human-Centric Assessment Framework for AI

    Authors: Sascha Saralajew, Ammar Shaker, Zhao Xu, Kiril Gashteovski, Bhushan Kotnis, Wiem Ben Rim, Jürgen Quittek, Carolin Lawrence

    Abstract: With the rise of AI systems in real-world applications comes the need for reliable and trustworthy AI. An essential aspect of this are explainable AI systems. However, there is no agreed standard on how explainable AI systems should be assessed. Inspired by the Turing test, we introduce a human-centric assessment framework where a leading domain expert accepts or rejects the solutions of an AI sys…

    Submitted 1 July, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted as submission to ICML 2022 Workshop on Human-Machine Collaboration and Teaming

  12. arXiv:2203.05121  [pdf, other]

    cs.LG cs.GT

    Collusion Detection in Team-Based Multiplayer Games

    Authors: Laura Greige, Fernando De Mesentier Silva, Meredith Trotter, Chris Lawrence, Peter Chin, Dilip Varadarajan

    Abstract: In the context of competitive multiplayer games, collusion happens when two or more teams decide to collaborate towards a common goal, with the intention of gaining an unfair advantage from this cooperation. The task of identifying colluders from the player population is however infeasible to game designers due to the sheer size of the player population. In this paper, we propose a system that det…

    Submitted 9 March, 2022; originally announced March 2022.

    Comments: 14 pages, 4 figures

  13. arXiv:2110.08144  [pdf, other]

    cs.CL cs.AI

    milIE: Modular & Iterative Multilingual Open Information Extraction

    Authors: Bhushan Kotnis, Kiril Gashteovski, Daniel Oñoro Rubio, Vanesa Rodriguez-Tembras, Ammar Shaker, Makoto Takamoto, Mathias Niepert, Carolin Lawrence

    Abstract: Open Information Extraction (OpenIE) is the task of extracting (subject, predicate, object) triples from natural language sentences. Current OpenIE systems extract all triple slots independently. In contrast, we explore the hypothesis that it may be beneficial to extract triple slots iteratively: first extract easy slots, followed by the difficult ones by conditioning on the easy slots, and theref…

    Submitted 25 April, 2022; v1 submitted 15 October, 2021; originally announced October 2021.

  14. arXiv:2109.07464  [pdf, other]

    cs.CL

    AnnIE: An Annotation Platform for Constructing Complete Open Information Extraction Benchmark

    Authors: Niklas Friedrich, Kiril Gashteovski, Mingying Yu, Bhushan Kotnis, Carolin Lawrence, Mathias Niepert, Goran Glavaš

    Abstract: Open Information Extraction (OIE) is the task of extracting facts from sentences in the form of relations and their corresponding arguments in a schema-free manner. Intrinsic performance of OIE systems is difficult to measure due to the incompleteness of existing OIE benchmarks: the ground truth extractions do not group all acceptable surface realizations of the same fact that can be extracted from…

    Submitted 13 April, 2022; v1 submitted 15 September, 2021; originally announced September 2021.

  15. arXiv:2109.06850  [pdf, other]

    cs.CL cs.AI

    BenchIE: A Framework for Multi-Faceted Fact-Based Open Information Extraction Evaluation

    Authors: Kiril Gashteovski, Mingying Yu, Bhushan Kotnis, Carolin Lawrence, Mathias Niepert, Goran Glavaš

    Abstract: Intrinsic evaluations of OIE systems are carried out either manually -- with human evaluators judging the correctness of extractions -- or automatically, on standardized benchmarks. The latter, while much more cost-effective, is less reliable, primarily because of the incompleteness of the existing OIE benchmarks: the ground truth extractions do not include all acceptable variants of the same fact…

    Submitted 13 April, 2022; v1 submitted 14 September, 2021; originally announced September 2021.

  16. arXiv:2106.13642  [pdf, other]

    cs.LG stat.ML

    VEGN: Variant Effect Prediction with Graph Neural Networks

    Authors: Jun Cheng, Carolin Lawrence, Mathias Niepert

    Abstract: Genetic mutations can cause disease by disrupting normal gene function. Identifying the disease-causing mutations from millions of genetic variants within an individual patient is a challenging problem. Computational methods which can prioritize disease-causing mutations have, therefore, enormous applications. It is well-known that genes function through a complex regulatory network. However, exis…

    Submitted 25 June, 2021; originally announced June 2021.

    Comments: Accepted at Workshop on Computational Biology, co-located with the 38th International Conference on Machine Learning

  17. arXiv:2102.10648  [pdf, ps, other]

    cs.ET

    Dimensions of Timescales in Neuromorphic Computing Systems

    Authors: Herbert Jaeger, Dirk Doorakkers, Celestine Lawrence, Giacomo Indiveri

    Abstract: This article is a public deliverable of the EU project "Memory technologies with multi-scale time constants for neuromorphic architectures" (MeMScales, https://memscales.eu, Call ICT-06-2019 Unconventional Nanoelectronics, project number 871371). This arXiv version is a verbatim copy of the deliverable report, with administrative information stripped. It collects a wide and varied assortment of ph…

    Submitted 21 February, 2021; originally announced February 2021.

  18. arXiv:2011.12010  [pdf, other]

    cs.LG

    Uncertainty Estimation and Calibration with Finite-State Probabilistic RNNs

    Authors: Cheng Wang, Carolin Lawrence, Mathias Niepert

    Abstract: Uncertainty quantification is crucial for building reliable and trustable machine learning systems. We propose to estimate uncertainty in recurrent neural networks (RNNs) via stochastic discrete state transitions over recurrent timesteps. The uncertainty of the model can be quantified by running a prediction several times, each time sampling from the recurrent state transition distribution, leadin…

    Submitted 24 November, 2020; originally announced November 2020.

  19. arXiv:2011.02511  [pdf, ps, other]

    cs.CL cs.LG

    Offline Reinforcement Learning from Human Feedback in Real-World Sequence-to-Sequence Tasks

    Authors: Julia Kreutzer, Stefan Riezler, Carolin Lawrence

    Abstract: Large volumes of interaction logs can be collected from NLP systems that are deployed in the real world. How can this wealth of information be leveraged? Using such interaction logs in an offline reinforcement learning (RL) setting is a promising approach. However, due to the nature of NLP tasks and the constraints of production systems, a series of challenges arise. We present a concise overview…

    Submitted 9 June, 2021; v1 submitted 4 November, 2020; originally announced November 2020.

    Comments: 5th Workshop on Structured Prediction for NLP at ACL 2021. Previously named "Learning from Human Feedback: Challenges for Real-World Reinforcement Learning in NLP" and presented at Challenges of Real-World RL Workshop at NeurIPS 2020

  20. arXiv:2010.05516  [pdf, other]

    cs.LG cs.AI stat.ML

    Explaining Neural Matrix Factorization with Gradient Rollback

    Authors: Carolin Lawrence, Timo Sztyler, Mathias Niepert

    Abstract: Explaining the predictions of neural black-box models is an important problem, especially when such models are used in applications where user trust is crucial. Estimating the influence of training examples on a learned neural model's behavior allows us to identify training examples most responsible for a given prediction and, therefore, to faithfully explain the output of a black-box model. The m…

    Submitted 15 December, 2020; v1 submitted 12 October, 2020; originally announced October 2020.

    Comments: 35th AAAI Conference on Artificial Intelligence, 2021. Includes Appendix

  21. arXiv:2004.02596  [pdf, other]

    cs.AI cs.LG

    Answering Complex Queries in Knowledge Graphs with Bidirectional Sequence Encoders

    Authors: Bhushan Kotnis, Carolin Lawrence, Mathias Niepert

    Abstract: Representation learning for knowledge graphs (KGs) has focused on the problem of answering simple link prediction queries. In this work we address the more ambitious challenge of predicting the answers of conjunctive queries with multiple missing entities. We propose Bi-Directional Query Embedding (BIQE), a method that embeds conjunctive queries with models based on bi-directional attention mechan…

    Submitted 4 February, 2021; v1 submitted 6 April, 2020; originally announced April 2020.

    Comments: 8 pages, 2 figures

  22. arXiv:1908.08737  [pdf, other]

    cs.CR

    Design choices for productive, secure, data-intensive research at scale in the cloud

    Authors: Diego Arenas, Jon Atkins, Claire Austin, David Beavan, Alvaro Cabrejas Egea, Steven Carlysle-Davies, Ian Carter, Rob Clarke, James Cunningham, Tom Doel, Oliver Forrest, Evelina Gabasova, James Geddes, James Hetherington, Radka Jersakova, Franz Kiraly, Catherine Lawrence, Jules Manser, Martin T. O'Reilly, James Robinson, Helen Sherwood-Taylor, Serena Tierney, Catalina A. Vallejos, Sebastian Vollmer, Kirstie Whitaker

    Abstract: We present a policy and process framework for secure environments for productive data science research projects at scale, by combining prevailing data security threat and risk profiles into five sensitivity tiers, and, at each tier, specifying recommended policies for data classification, data ingress, software ingress, data egress, user access, user device control, and analysis environments. By p…

    Submitted 15 September, 2019; v1 submitted 23 August, 2019; originally announced August 2019.

  23. arXiv:1908.05915  [pdf, other]

    stat.ML cs.CL cs.LG

    Attending to Future Tokens For Bidirectional Sequence Generation

    Authors: Carolin Lawrence, Bhushan Kotnis, Mathias Niepert

    Abstract: Neural sequence generation is typically performed token-by-token and left-to-right. Whenever a token is generated only previously produced tokens are taken into consideration. In contrast, for problems such as sequence classification, bidirectional attention, which takes both past and future tokens into consideration, has been shown to perform much better. We propose to make the sequence generatio…

    Submitted 17 September, 2019; v1 submitted 16 August, 2019; originally announced August 2019.

    Comments: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2019, Hong Kong, China

  24. arXiv:1907.08738  [pdf, other]

    stat.AP cs.LG stat.ML

    Bayesian Inference Gaussian Process Multiproxy Alignment of Continuous Signals (BIGMACS): Applications for Paleoceanography

    Authors: Taehee Lee, Lorraine E. Lisiecki, Devin Rand, Geoffrey Gebbie, Charles E. Lawrence

    Abstract: We first introduce a novel profile-based alignment algorithm, the multiple continuous Signal Alignment algorithm with Gaussian Process Regression profiles (SA-GPR). SA-GPR addresses the limitations of currently available signal alignment methods by adopting a hybrid of the particle smoothing and Markov-chain Monte Carlo (MCMC) algorithms to align signals, and by applying the Gaussian process regre…

    Submitted 13 June, 2021; v1 submitted 19 July, 2019; originally announced July 2019.

    Comments: This article has been submitted to "Bayesian Analysis"

  25. arXiv:1907.03748  [pdf, other]

    cs.CL cs.LG stat.ML

    Learning Neural Sequence-to-Sequence Models from Weak Feedback with Bipolar Ramp Loss

    Authors: Laura Jehl, Carolin Lawrence, Stefan Riezler

    Abstract: In many machine learning scenarios, supervision by gold labels is not available and consequently neural models cannot be trained directly by maximum likelihood estimation (MLE). In a weak supervision scenario, metric-augmented objectives can be employed to assign feedback to model outputs, which can be used to extract a supervision signal for training. We present several objectives for two separat…

    Submitted 6 July, 2019; originally announced July 2019.

    Comments: Transactions of the Association for Computational Linguistics 2019 Vol. 7, 233-248. Presented at ACL, Florence, Italy

  26. arXiv:1811.12239  [pdf, other]

    cs.CL cs.LG stat.ML

    Counterfactual Learning from Human Proofreading Feedback for Semantic Parsing

    Authors: Carolin Lawrence, Stefan Riezler

    Abstract: In semantic parsing for question-answering, it is often too expensive to collect gold parses or even gold answers as supervision signals. We propose to convert model outputs into a set of human-understandable statements which allow non-expert users to act as proofreaders, providing error markings as learning signals to the parser. Because model outputs were suggested by a historic system, we opera…

    Submitted 29 November, 2018; originally announced November 2018.

    Comments: "Learning by Instruction" Workshop at the 32nd Conference on Neural Information Processing Systems (NIPS 2018), Montréal, Canada. arXiv admin note: substantial text overlap with arXiv:1805.01252

  27. arXiv:1805.01252  [pdf, other]

    cs.CL cs.LG stat.ML

    Improving a Neural Semantic Parser by Counterfactual Learning from Human Bandit Feedback

    Authors: Carolin Lawrence, Stefan Riezler

    Abstract: Counterfactual learning from human bandit feedback describes a scenario where user feedback on the quality of outputs of a historic system is logged and used to improve a target system. We show how to apply this learning framework to neural semantic parsing. From a machine learning perspective, the key challenge lies in a proper reweighting of the estimator so as to avoid known degeneracies in cou…

    Submitted 30 November, 2018; v1 submitted 3 May, 2018; originally announced May 2018.

    Comments: Conference of the Association for Computational Linguistics (ACL), 2018, Melbourne, Australia

  28. arXiv:1711.08621  [pdf, ps, other]

    stat.ML cs.CL cs.LG

    Counterfactual Learning for Machine Translation: Degeneracies and Solutions

    Authors: Carolin Lawrence, Pratik Gajane, Stefan Riezler

    Abstract: Counterfactual learning is a natural scenario to improve web-based machine translation services by offline learning from feedback logged during user interactions. In order to avoid the risk of showing inferior translations to users, in such scenarios mostly exploration-free deterministic logging policies are in place. We analyze possible degeneracies of inverse and reweighted propensity scoring es…

    Submitted 14 December, 2017; v1 submitted 23 November, 2017; originally announced November 2017.

    Comments: Workshop "From 'What If?' To 'What Next?'" at the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA

  29. arXiv:1707.09118  [pdf, other]

    stat.ML cs.CL cs.LG

    Counterfactual Learning from Bandit Feedback under Deterministic Logging: A Case Study in Statistical Machine Translation

    Authors: Carolin Lawrence, Artem Sokolov, Stefan Riezler

    Abstract: The goal of counterfactual learning for statistical machine translation (SMT) is to optimize a target SMT system from logged data that consist of user feedback to translations that were predicted by another, historic SMT system. A challenge arises by the fact that risk-averse commercial SMT systems deterministically log the most probable translation. The lack of sufficient exploration of the SMT o…

    Submitted 14 December, 2017; v1 submitted 28 July, 2017; originally announced July 2017.

    Comments: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2017, Copenhagen, Denmark