Showing 1–6 of 6 results for author: Krumdick, M

  1. arXiv:2406.19415  [pdf, other]

    cs.CL

    An Analysis of Multilingual FActScore

    Authors: Kim Trong Vu, Michael Krumdick, Varshini Reddy, Franck Dernoncourt, Viet Dac Lai

    Abstract: FActScore has gained popularity as a metric to estimate the factuality of long-form texts generated by Large Language Models (LLMs) in English. However, there has not been any work in studying the behavior of FActScore in other languages. This paper studies the limitations of each component in the four-component pipeline of FActScore in the multilingual setting. We introduce a new dataset for FAct…

    Submitted 20 June, 2024; originally announced June 2024.

  2. arXiv:2406.14394  [pdf, other]

    cs.CL

    SEC-QA: A Systematic Evaluation Corpus for Financial QA

    Authors: Viet Dac Lai, Michael Krumdick, Charles Lovering, Varshini Reddy, Craig Schmidt, Chris Tanner

    Abstract: The financial domain frequently deals with large numbers of long documents that are essential for daily operations. Significant effort is put towards automating financial data analysis. However, a persistent challenge, not limited to the finance domain, is the scarcity of datasets that accurately reflect real-world tasks for model evaluation. Existing datasets are often constrained by size, contex…

    Submitted 20 June, 2024; originally announced June 2024.

  3. arXiv:2401.06915  [pdf, other]

    cs.CL cs.AI

    DocFinQA: A Long-Context Financial Reasoning Dataset

    Authors: Varshini Reddy, Rik Koncel-Kedziorski, Viet Dac Lai, Michael Krumdick, Charles Lovering, Chris Tanner

    Abstract: For large language models (LLMs) to be effective in the financial domain -- where each decision can have a significant impact -- it is necessary to investigate realistic tasks and data. Financial professionals often interact with documents that are hundreds of pages long, but most financial research datasets only deal with short excerpts from these documents. To address this, we introduce a long-d…

    Submitted 29 February, 2024; v1 submitted 12 January, 2024; originally announced January 2024.

    Comments: 13 pages

  4. arXiv:2311.06602  [pdf, other]

    cs.CL

    BizBench: A Quantitative Reasoning Benchmark for Business and Finance

    Authors: Rik Koncel-Kedziorski, Michael Krumdick, Viet Lai, Varshini Reddy, Charles Lovering, Chris Tanner

    Abstract: Answering questions within business and finance requires reasoning, precision, and a wide breadth of technical knowledge. Together, these requirements make this domain difficult for large language models (LLMs). We introduce BizBench, a benchmark for evaluating models' ability to reason about realistic financial problems. BizBench comprises eight quantitative reasoning tasks, focusing on question-…

    Submitted 12 March, 2024; v1 submitted 11 November, 2023; originally announced November 2023.

    Comments: Work in progress

  5. arXiv:2308.02051  [pdf, other]

    cs.LG

    A Graphical Approach to Document Layout Analysis

    Authors: Jilin Wang, Michael Krumdick, Baojia Tong, Hamima Halim, Maxim Sokolov, Vadym Barda, Delphine Vendryes, Chris Tanner

    Abstract: Document layout analysis (DLA) is the task of detecting the distinct, semantic content within a document and correctly classifying these items into an appropriate category (e.g., text, title, figure). DLA pipelines enable users to convert documents into structured machine-readable formats that can then be used for many useful downstream tasks. Most existing state-of-the-art (SOTA) DLA models repre…

    Submitted 3 August, 2023; originally announced August 2023.

    Comments: ICDAR 2023

  6. arXiv:1912.08166  [pdf, other]

    cs.CV

    APRICOT: A Dataset of Physical Adversarial Attacks on Object Detection

    Authors: Anneliese Braunegg, Amartya Chakraborty, Michael Krumdick, Nicole Lape, Sara Leary, Keith Manville, Elizabeth Merkhofer, Laura Strickhart, Matthew Walmer

    Abstract: Physical adversarial attacks threaten to fool object detection systems, but reproducible research on the real-world effectiveness of physical patches and how to defend against them requires a publicly available benchmark dataset. We present APRICOT, a collection of over 1,000 annotated photographs of printed adversarial patches in public locations. The patches target several object categories for…

    Submitted 20 August, 2020; v1 submitted 17 December, 2019; originally announced December 2019.

    Comments: 23 pages, 14 figures, 3 tables. Updated version as accepted to ECCV 2020