Showing 1–28 of 28 results for author: Trivedi, H

Searching in archive cs.
  1. arXiv:2407.18901  [pdf, other]

    cs.SE cs.AI cs.CL cs.LG

    AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents

    Authors: Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian

    Abstract: Autonomous agents that address day-to-day digital tasks (e.g., ordering groceries for a household), must not only operate multiple apps (e.g., notes, messaging, shopping app) via APIs, but also generate rich code with complex control flow in an iterative manner based on their interaction with the environment. However, existing benchmarks for tool use are inadequate, as they only cover tasks that r…

    Submitted 26 July, 2024; originally announced July 2024.

    Comments: ACL'24 Camera Ready

  2. arXiv:2312.12442  [pdf]

    cs.CV cs.AI

    Hierarchical Classification System for Breast Cancer Specimen Report (HCSBC) -- an end-to-end model for characterizing severity and diagnosis

    Authors: Thiago Santos, Harish Kamath, Christopher R. McAdams, Mary S. Newell, Marina Mosunjac, Gabriela Oprea-Ilies, Geoffrey Smith, Constance Lehman, Judy Gichoya, Imon Banerjee, Hari Trivedi

    Abstract: Automated classification of cancer pathology reports can extract information from unstructured reports and categorize each report into structured diagnosis and severity categories. Thus, such a system can reduce the burden of populating tumor registries and help with clinical trial registration, as well as with developing large datasets for deep learning model development using true pathologic ground truth. H…

    Submitted 2 November, 2023; originally announced December 2023.

  3. arXiv:2311.12560  [pdf]

    cs.CV

    Benchmarking bias: Expanding clinical AI model card to incorporate bias reporting of social and non-social factors

    Authors: Carolina A. M. Heming, Mohamed Abdalla, Shahram Mohanna, Monish Ahluwalia, Linglin Zhang, Hari Trivedi, MinJae Woo, Benjamin Fine, Judy Wawira Gichoya, Leo Anthony Celi, Laleh Seyyed-Kalantari

    Abstract: Clinical AI model reporting cards should be expanded to incorporate a broad bias reporting of both social and non-social factors. Non-social factors consider the role of other factors, such as disease dependent, anatomic, or instrument factors on AI model bias, which are essential to ensure safe deployment.

    Submitted 2 July, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  4. Synthetically Enhanced: Unveiling Synthetic Data's Potential in Medical Imaging Research

    Authors: Bardia Khosravi, Frank Li, Theo Dapamede, Pouria Rouzrokh, Cooper U. Gamble, Hari M. Trivedi, Cody C. Wyles, Andrew B. Sellergren, Saptarshi Purkayastha, Bradley J. Erickson, Judy W. Gichoya

    Abstract: Chest X-rays (CXR) are essential for diagnosing a variety of conditions, but when used on new populations, model generalizability issues limit their efficacy. Generative AI, particularly denoising diffusion probabilistic models (DDPMs), offers a promising approach to generating synthetic images, enhancing dataset diversity. This study investigates the impact of synthetic data supplementation on th…

    Submitted 7 July, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

  5. arXiv:2305.04422  [pdf]

    eess.IV cs.CV cs.CY cs.LG

    Multivariate Analysis on Performance Gaps of Artificial Intelligence Models in Screening Mammography

    Authors: Linglin Zhang, Beatrice Brown-Mulry, Vineela Nalla, InChan Hwang, Judy Wawira Gichoya, Aimilia Gastounioti, Imon Banerjee, Laleh Seyyed-Kalantari, MinJae Woo, Hari Trivedi

    Abstract: Although deep learning models for abnormality classification can perform well in screening mammography, the demographic, imaging, and clinical characteristics associated with increased risk of model failure remain unclear. This retrospective study uses the Emory BrEast Imaging Dataset (EMBED) containing mammograms from 115931 patients imaged at Emory Healthcare between 2013 and 2020, with BI-RADS asses…

    Submitted 19 October, 2023; v1 submitted 7 May, 2023; originally announced May 2023.

    Comments: 29 pages, 6 tables, 7 figures, 2 supplemental tables

  6. arXiv:2212.10509  [pdf, other]

    cs.CL

    Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

    Authors: Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

    Abstract: Prompting-based large language models (LLMs) are surprisingly powerful at generating natural language reasoning steps or Chains-of-Thoughts (CoT) for multi-step question answering (QA). They struggle, however, when the necessary knowledge is either unavailable to the LLM or not up-to-date within its parameters. While using the question to retrieve relevant text from an external knowledge source he…

    Submitted 22 June, 2023; v1 submitted 20 December, 2022; originally announced December 2022.

    Comments: ACL'23 Camera Ready

  7. arXiv:2211.06925  [pdf, other]

    cs.CV cs.LG

    Early Diagnosis of Chronic Obstructive Pulmonary Disease from Chest X-Rays using Transfer Learning and Fusion Strategies

    Authors: Ryan Wang, Li-Ching Chen, Lama Moukheiber, Mira Moukheiber, Dana Moukheiber, Zach Zaiman, Sulaiman Moukheiber, Tess Litchman, Kenneth Seastedt, Hari Trivedi, Rebecca Steinberg, Po-Chih Kuo, Judy Gichoya, Leo Anthony Celi

    Abstract: Chronic obstructive pulmonary disease (COPD) is one of the most common chronic illnesses in the world and the third leading cause of mortality worldwide. It is often underdiagnosed or not diagnosed until later in the disease course. Spirometry tests are the gold standard for diagnosing COPD but can be difficult to obtain, especially in resource-poor countries. Chest X-rays (CXRs), however, are rea…

    Submitted 13 November, 2022; originally announced November 2022.

    Comments: 15 pages, 12 figures

  8. arXiv:2210.10860  [pdf, other]

    cs.CL

    Two-Turn Debate Doesn't Help Humans Answer Hard Reading Comprehension Questions

    Authors: Alicia Parrish, Harsh Trivedi, Nikita Nangia, Vishakh Padmakumar, Jason Phang, Amanpreet Singh Saimbhi, Samuel R. Bowman

    Abstract: The use of language-model-based question-answering systems to aid humans in completing difficult tasks is limited, in part, by the unreliability of the text these systems generate. Using hard multiple-choice reading comprehension questions as a testbed, we assess whether presenting humans with arguments for two competing answer options, where one is correct and the other is incorrect, allows human…

    Submitted 19 October, 2022; originally announced October 2022.

    Comments: 12 pages, 6 figures, 7 tables

  9. arXiv:2210.02406  [pdf, other]

    cs.CL

    Decomposed Prompting: A Modular Approach for Solving Complex Tasks

    Authors: Tushar Khot, Harsh Trivedi, Matthew Finlayson, Yao Fu, Kyle Richardson, Peter Clark, Ashish Sabharwal

    Abstract: Few-shot prompting is a surprisingly powerful way to use Large Language Models (LLMs) to solve various tasks. However, this approach struggles as the task complexity increases or when the individual reasoning steps of the task themselves are hard to learn, especially when embedded in more complex tasks. To address this, we propose Decomposed Prompting, a new approach to solve complex tasks by deco…

    Submitted 11 April, 2023; v1 submitted 5 October, 2022; originally announced October 2022.

    Comments: ICLR'23 Camera Ready

  10. arXiv:2207.00066  [pdf]

    cs.LG cs.AI math.NA

    Advances in Prediction of Readmission Rates Using Long Term Short Term Memory Networks on Healthcare Insurance Data

    Authors: Shuja Khalid, Francisco Matos, Ayman Abunimer, Joel Bartlett, Richard Duszak, Michal Horny, Judy Gichoya, Imon Banerjee, Hari Trivedi

    Abstract: 30-day hospital readmission is a long standing medical problem that affects patients' morbidity and mortality and costs billions of dollars annually. Recently, machine learning models have been created to predict risk of inpatient readmission for patients with specific diseases, however no model exists to predict this risk across all patients. We developed a bi-directional Long Short Term Memory (…

    Submitted 30 June, 2022; originally announced July 2022.

    Comments: 7 pages, 3 figures, 3 tables

  11. arXiv:2205.12496  [pdf, other]

    cs.CL cs.AI

    Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts

    Authors: Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

    Abstract: Question-answering datasets require a broad set of reasoning skills. We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion. Specifically, we use widely available QDMR representations to programmatically create hard-to-cheat synthetic contexts for real questions in six multi-step reasoning datasets. These contexts are carefully designed…

    Submitted 3 November, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted at EMNLP'22

  12. arXiv:2205.06885  [pdf]

    cs.CL

    PathologyBERT -- Pre-trained Vs. A New Transformer Language Model for Pathology Domain

    Authors: Thiago Santos, Amara Tariq, Susmita Das, Kavyasree Vayalpati, Geoffrey H. Smith, Hari Trivedi, Imon Banerjee

    Abstract: Pathology text mining is a challenging task given the reporting variability and constant new findings in cancer sub-type definitions. However, successful text mining of a large pathology database can play a critical role to advance 'big data' cancer research like similarity-based treatment selection, case identification, prognostication, surveillance, clinical trial screening, risk stratification,…

    Submitted 13 May, 2022; originally announced May 2022.

    Comments: submitted to "American Medical Informatics Association (AMIA)" 2022 Annual Symposium

  13. arXiv:2204.05212  [pdf, other]

    cs.CL

    Single-Turn Debate Does Not Help Humans Answer Hard Reading-Comprehension Questions

    Authors: Alicia Parrish, Harsh Trivedi, Ethan Perez, Angelica Chen, Nikita Nangia, Jason Phang, Samuel R. Bowman

    Abstract: Current QA systems can generate reasonable-sounding yet false answers without explanation or evidence for the generated answer, which is especially problematic when humans cannot readily check the model's answers. This presents a challenge for building trust in machine learning systems. We take inspiration from real-world situations where difficult questions are answered by considering opposing si…

    Submitted 13 April, 2022; v1 submitted 11 April, 2022; originally announced April 2022.

    Comments: Accepted to the 2022 ACL Workshop on Learning with Natural Language Supervision. 12 pages total, 9 figures, 2 tables

  14. arXiv:2204.03074  [pdf, other]

    cs.CV

    OSCARS: An Outlier-Sensitive Content-Based Radiography Retrieval System

    Authors: Xiaoyuan Guo, Jiali Duan, Saptarshi Purkayastha, Hari Trivedi, Judy Wawira Gichoya, Imon Banerjee

    Abstract: Improving the retrieval relevance on noisy datasets is an emerging need for the curation of a large-scale clean dataset in the medical domain. While existing methods can be applied for class-wise retrieval (aka. inter-class), they cannot distinguish the granularity of likeness within the same class (aka. intra-class). The problem is exacerbated on medical external datasets, where noisy samples of…

    Submitted 6 April, 2022; originally announced April 2022.

    Comments: 12 pages, 6 figures, 2 tables

  15. arXiv:2202.04073  [pdf]

    eess.IV cs.CV cs.LG

    The EMory BrEast imaging Dataset (EMBED): A Racially Diverse, Granular Dataset of 3.5M Screening and Diagnostic Mammograms

    Authors: Jiwoong J. Jeong, Brianna L. Vey, Ananth Reddy, Thomas Kim, Thiago Santos, Ramon Correa, Raman Dutt, Marina Mosunjac, Gabriela Oprea-Ilies, Geoffrey Smith, Minjae Woo, Christopher R. McAdams, Mary S. Newell, Imon Banerjee, Judy Gichoya, Hari Trivedi

    Abstract: Developing and validating artificial intelligence models in medical imaging requires datasets that are large, granular, and diverse. To date, the majority of publicly available breast imaging datasets lack in one or more of these areas. Models trained on these data may therefore underperform on patient populations or pathologies that have not previously been encountered. The EMory BrEast imaging D…

    Submitted 8 February, 2022; originally announced February 2022.

  16. arXiv:2112.13885  [pdf, other]

    eess.IV cs.CV

    MedShift: identifying shift data for medical dataset curation

    Authors: Xiaoyuan Guo, Judy Wawira Gichoya, Hari Trivedi, Saptarshi Purkayastha, Imon Banerjee

    Abstract: To curate a high-quality dataset, identifying data variance between the internal and external sources is a fundamental and crucial step. However, methods to detect shift or variance in data have not been significantly researched. Challenges to this are the lack of effective approaches to learn dense representation of a dataset and difficulties of sharing private data across medical institutions. T…

    Submitted 27 December, 2021; originally announced December 2021.

    Comments: 35 pages, 28 figures, 2 tables

  17. arXiv:2111.08711  [pdf, other]

    eess.IV cs.CV cs.LG

    Two-step adversarial debiasing with partial learning -- medical image case-studies

    Authors: Ramon Correa, Jiwoong Jason Jeong, Bhavik Patel, Hari Trivedi, Judy W. Gichoya, Imon Banerjee

    Abstract: The use of artificial intelligence (AI) in healthcare has become a very active research area in the last few years. While significant progress has been made in image classification tasks, only a few AI methods are actually being deployed in hospitals. A major hurdle in actively using clinical AI models currently is the trustworthiness of these models. More often than not, these complex models are…

    Submitted 16 November, 2021; originally announced November 2021.

  18. arXiv:2109.06853  [pdf, other]

    cs.CL

    Summarize-then-Answer: Generating Concise Explanations for Multi-hop Reading Comprehension

    Authors: Naoya Inoue, Harsh Trivedi, Steven Sinha, Niranjan Balasubramanian, Kentaro Inui

    Abstract: How can we generate concise explanations for multi-hop Reading Comprehension (RC)? The current strategies of identifying supporting sentences can be seen as an extractive question-focused summarization of the input text. However, these extractive explanations are not necessarily concise i.e. not minimally sufficient for answering a question. Instead, we advocate for an abstractive approach, where…

    Submitted 14 September, 2021; originally announced September 2021.

    Comments: Accepted to EMNLP2021 Long Paper (Main Track)

  19. arXiv:2108.00573  [pdf, other]

    cs.CL cs.AI

    MuSiQue: Multihop Questions via Single-hop Question Composition

    Authors: Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

    Abstract: Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to be largely solvable via shortcuts. Can we create a question answering (QA) dataset that, by construction, requires proper multihop reasoning? To this end, we introduce a bottom-up approach that systematically selects composable pairs of single-hop questions that are connected, i.e., where one reasoning s…

    Submitted 5 May, 2022; v1 submitted 1 August, 2021; originally announced August 2021.

    Comments: Accepted for publication in Transactions of the Association for Computational Linguistics (TACL), 2022

  20. arXiv:2107.10356  [pdf]

    cs.CV cs.CY eess.IV

    Reading Race: AI Recognises Patient's Racial Identity In Medical Images

    Authors: Imon Banerjee, Ananth Reddy Bhimireddy, John L. Burns, Leo Anthony Celi, Li-Ching Chen, Ramon Correa, Natalie Dullerud, Marzyeh Ghassemi, Shih-Cheng Huang, Po-Chih Kuo, Matthew P Lungren, Lyle Palmer, Brandon J Price, Saptarshi Purkayastha, Ayis Pyrros, Luke Oakden-Rayner, Chima Okechukwu, Laleh Seyyed-Kalantari, Hari Trivedi, Ryan Wang, Zachary Zaiman, Haoran Zhang, Judy W Gichoya

    Abstract: Background: In medical imaging, prior studies have demonstrated disparate AI performance by race, yet there is no known correlation for race on medical imaging that would be obvious to the human expert interpreting the images. Methods: Using private and public datasets we evaluate: A) performance quantification of deep learning models to detect race from medical images, including the ability of…

    Submitted 21 July, 2021; originally announced July 2021.

    MSC Class: 68-XX; ACM Class: I.2

  21. arXiv:2106.01199  [pdf, other]

    cs.CL

    IrEne: Interpretable Energy Prediction for Transformers

    Authors: Qingqing Cao, Yash Kumar Lal, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian

    Abstract: Existing software-based energy measurements of NLP models are not accurate because they do not consider the complex interactions between energy consumption and model execution. We present IrEne, an interpretable and extensible energy prediction system that accurately predicts the inference energy consumption of a wide range of Transformer-based NLP models. IrEne constructs a model tree graph that…

    Submitted 2 June, 2021; originally announced June 2021.

    Comments: ACL 2021 camera ready

  22. arXiv:2106.00794  [pdf, other]

    cs.CL cs.AI cs.HC

    What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks?

    Authors: Nikita Nangia, Saku Sugawara, Harsh Trivedi, Alex Warstadt, Clara Vania, Samuel R. Bowman

    Abstract: Crowdsourcing is widely used to create data for common natural language understanding tasks. Despite the importance of these datasets for measuring and refining model understanding of language, there has been little focus on the crowdsourcing methods used for collecting the datasets. In this paper, we compare the efficacy of interventions that have been proposed in prior work as ways of improving…

    Submitted 1 June, 2021; originally announced June 2021.

    Comments: ACL 2021

  23. arXiv:2006.13262  [pdf]

    eess.IV cs.CV cs.LG

    Was there COVID-19 back in 2012? Challenge for AI in Diagnosis with Similar Indications

    Authors: Imon Banerjee, Priyanshu Sinha, Saptarshi Purkayastha, Nazanin Mashhaditafreshi, Amara Tariq, Jiwoong Jeong, Hari Trivedi, Judy W. Gichoya

    Abstract: Purpose: Since the recent COVID-19 outbreak, there has been an avalanche of research papers applying deep learning based image processing to chest radiographs for detection of the disease. To test the performance of the two top models for CXR COVID-19 diagnosis on external datasets to assess model generalizability. Methods: In this paper, we present our argument regarding the efficiency and applic…

    Submitted 23 June, 2020; originally announced June 2020.

  24. arXiv:2005.00789  [pdf, other]

    cs.CL cs.AI cs.LG

    Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning

    Authors: Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

    Abstract: Has there been real progress in multi-hop question-answering? Models often exploit dataset artifacts to produce correct answers, without connecting information across multiple supporting facts. This limits our ability to measure true progress and defeats the purpose of building multi-hop QA datasets. We make three contributions towards addressing this. First, we formalize such undesirable behavior…

    Submitted 16 November, 2020; v1 submitted 2 May, 2020; originally announced May 2020.

    Comments: Accepted at EMNLP'20

  25. arXiv:2005.00697  [pdf, other]

    cs.CL cs.AI cs.LG

    DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering

    Authors: Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian

    Abstract: Transformer-based QA models use input-wide self-attention -- i.e. across both the question and the input passage -- at all layers, causing them to be slow and memory-intensive. It turns out that we can get by without input-wide self-attention at all layers, especially in the lower layers. We introduce DeFormer, a decomposed transformer, which substitutes the full self-attention with question-wide…

    Submitted 2 May, 2020; originally announced May 2020.

    Comments: ACL 2020 camera ready

  26. arXiv:2004.07965  [pdf, other]

    eess.IV cs.CV cs.LG

    A DICOM Framework for Machine Learning Pipelines against Real-Time Radiology Images

    Authors: Pradeeban Kathiravelu, Puneet Sharma, Ashish Sharma, Imon Banerjee, Hari Trivedi, Saptarshi Purkayastha, Priyanshu Sinha, Alexandre Cadrin-Chenevert, Nabile Safdar, Judy Wawira Gichoya

    Abstract: Executing machine learning (ML) pipelines in real-time on radiology images is hard due to the limited computing resources in clinical environments and the lack of efficient data transfer capabilities to run them on research clusters. We propose Niffler, an integrated framework that enables the execution of ML pipelines at research clusters by efficiently querying and retrieving radiology images fr…

    Submitted 5 August, 2020; v1 submitted 16 April, 2020; originally announced April 2020.

    Comments: Preprint

    Journal ref: Journal of Digital Imaging (JDI), 2021

  27. arXiv:1910.08112  [pdf, other]

    cs.LG eess.IV q-bio.NC stat.AP stat.ML

    Anatomically-Informed Data Augmentation for functional MRI with Applications to Deep Learning

    Authors: Kevin P. Nguyen, Cherise Chin Fatt, Alex Treacher, Cooper Mellema, Madhukar H. Trivedi, Albert Montillo

    Abstract: The application of deep learning to build accurate predictive models from functional neuroimaging data is often hindered by limited dataset sizes. Though data augmentation can help mitigate such training obstacles, most data augmentation methods have been developed for natural images as in computer vision tasks such as CIFAR, not for medical images. This work helps to fill this gap by proposin…

    Submitted 17 October, 2019; originally announced October 2019.

    Comments: SPIE Medical Imaging 2020

  28. arXiv:1904.09380  [pdf, other]

    cs.CL cs.AI cs.LG

    Repurposing Entailment for Multi-Hop Question Answering Tasks

    Authors: Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian

    Abstract: Question Answering (QA) naturally reduces to an entailment problem, namely, verifying whether some text entails the answer to a question. However, for multi-hop QA tasks, which require reasoning with multiple sentences, it remains unclear how best to utilize entailment models pre-trained on large scale datasets such as SNLI, which are based on sentence pairs. We introduce Multee, a general archite…

    Submitted 19 April, 2019; originally announced April 2019.

    Comments: Accepted at NAACL'19