Showing 1–16 of 16 results for author: Munikoti, S

Searching in archive cs.
  1. arXiv:2408.14595

    cs.CL

    Surprisingly Fragile: Assessing and Addressing Prompt Instability in Multimodal Foundation Models

    Authors: Ian Stewart, Sameera Horawalavithana, Brendan Kennedy, Sai Munikoti, Karl Pazdernik

    Abstract: Multimodal foundation models (MFMs) such as OFASys show the potential to unlock analysis of complex data such as images, videos, and audio data via text prompts alone. However, their performance may suffer in the face of text input that differs even slightly from their training distribution, which is surprising considering the use of modality-specific data to "ground" the text input. This study de…

    Submitted 26 August, 2024; originally announced August 2024.

    Comments: in submission

    ACM Class: I.2.7

  2. arXiv:2408.11800

    cs.CL

    PermitQA: A Benchmark for Retrieval Augmented Generation in Wind Siting and Permitting domain

    Authors: Rounak Meyur, Hung Phan, Sridevi Wagle, Jan Strube, Mahantesh Halappanavar, Sameera Horawalavithana, Anurag Acharya, Sai Munikoti

    Abstract: In the rapidly evolving landscape of Natural Language Processing (NLP) and text generation, the emergence of Retrieval Augmented Generation (RAG) presents a promising avenue for improving the quality and reliability of generated text by leveraging information retrieved from user specified database. Benchmarking is essential to evaluate and compare the performance of the different RAG configuration…

    Submitted 21 August, 2024; originally announced August 2024.

  3. arXiv:2407.07321

    cs.CL

    RAG vs. Long Context: Examining Frontier Large Language Models for Environmental Review Document Comprehension

    Authors: Hung Phan, Anurag Acharya, Sarthak Chaturvedi, Shivam Sharma, Mike Parker, Dan Nally, Ali Jannesari, Karl Pazdernik, Mahantesh Halappanavar, Sai Munikoti, Sameera Horawalavithana

    Abstract: Large Language Models (LLMs) have been applied to many research problems across various domains. One of the applications of LLMs is providing question-answering systems that cater to users from different fields. The effectiveness of LLM-based question-answering systems has already been established at an acceptable level for users posing questions in popular and public domains such as trivia and li…

    Submitted 9 July, 2024; originally announced July 2024.

    Comments: 14 pages

  4. arXiv:2406.05496

    cs.CL

    Generalist Multimodal AI: A Review of Architectures, Challenges and Opportunities

    Authors: Sai Munikoti, Ian Stewart, Sameera Horawalavithana, Henry Kvinge, Tegan Emerson, Sandra E Thompson, Karl Pazdernik

    Abstract: Multimodal models are expected to be a critical component to future advances in artificial intelligence. This field is starting to grow rapidly with a surge of new design elements motivated by the success of foundation models in natural language processing (NLP) and vision. It is widely hoped that further extending the foundation models to multiple modalities (e.g., text, image, video, sensor, tim…

    Submitted 8 June, 2024; originally announced June 2024.

    Comments: 25 pages, 3 figures, 5 tables

  5. arXiv:2311.12289

    cs.CL cs.AI

    ATLANTIC: Structure-Aware Retrieval-Augmented Language Model for Interdisciplinary Science

    Authors: Sai Munikoti, Anurag Acharya, Sridevi Wagle, Sameera Horawalavithana

    Abstract: Large language models record impressive performance on many natural language processing tasks. However, their knowledge capacity is limited to the pretraining corpus. Retrieval augmentation offers an effective solution by retrieving context from external knowledge sources to complement the language model. However, existing retrieval augmentation techniques ignore the structural relationships betwe…

    Submitted 20 November, 2023; originally announced November 2023.

    ACM Class: I.2.7

  6. arXiv:2311.09358

    cs.CL cs.AI

    Empirical evaluation of Uncertainty Quantification in Retrieval-Augmented Language Models for Science

    Authors: Sridevi Wagle, Sai Munikoti, Anurag Acharya, Sara Smith, Sameera Horawalavithana

    Abstract: Large language models (LLMs) have shown remarkable achievements in natural language processing tasks, producing high-quality outputs. However, LLMs still exhibit limitations, including the generation of factually incorrect information. In safety-critical applications, it is important to assess the confidence of LLM-generated content to make informed decisions. Retrieval Augmented Language Models (…

    Submitted 15 November, 2023; originally announced November 2023.

    ACM Class: I.2.7

  7. arXiv:2311.04348

    cs.CL cs.AI

    Evaluating the Effectiveness of Retrieval-Augmented Large Language Models in Scientific Document Reasoning

    Authors: Sai Munikoti, Anurag Acharya, Sridevi Wagle, Sameera Horawalavithana

    Abstract: Despite the dramatic progress in Large Language Model (LLM) development, LLMs often provide seemingly plausible but not factual information, often referred to as hallucinations. Retrieval-augmented LLMs provide a non-parametric approach to solve these issues by retrieving relevant information from external data sources and augment the training process. These models help to trace evidence from an e…

    Submitted 7 November, 2023; originally announced November 2023.

    Comments: 5 pages

    ACM Class: I.2.7

  8. arXiv:2310.10920

    cs.CL cs.AI

    NuclearQA: A Human-Made Benchmark for Language Models for the Nuclear Domain

    Authors: Anurag Acharya, Sai Munikoti, Aaron Hellinger, Sara Smith, Sridevi Wagle, Sameera Horawalavithana

    Abstract: As LLMs have become increasingly popular, they have been used in almost every field. But as the application for LLMs expands from generic fields to narrow, focused science domains, there exists an ever-increasing gap in ways to evaluate their efficacy in those fields. For the benchmarks that do exist, a lot of them focus on questions that don't require proper understanding of the subject in questi…

    Submitted 16 October, 2023; originally announced October 2023.

    Comments: 9 pages

    ACM Class: I.2.7

  9. arXiv:2307.01139

    cs.CV cs.AI cs.CL cs.LG

    SCITUNE: Aligning Large Language Models with Scientific Multimodal Instructions

    Authors: Sameera Horawalavithana, Sai Munikoti, Ian Stewart, Henry Kvinge

    Abstract: Instruction finetuning is a popular paradigm to align large language models (LLM) with human intent. Despite its popularity, this idea is less explored in improving the LLMs to align existing foundation models with scientific disciplines, concepts and goals. In this work, we present SciTune as a tuning framework to improve the ability of LLMs to follow scientific multimodal instructions. To test o…

    Submitted 3 July, 2023; originally announced July 2023.

    Comments: Preprint. Work in progress

  10. arXiv:2306.01189

    cs.LG

    A General Framework for Uncertainty Quantification via Neural SDE-RNN

    Authors: Shweta Dahale, Sai Munikoti, Balasubramaniam Natarajan

    Abstract: Uncertainty quantification is a critical yet unsolved challenge for deep learning, especially for the time series imputation with irregularly sampled measurements. To tackle this problem, we propose a novel framework based on the principles of recurrent neural networks and neural stochastic differential equations for reconciling irregularly sampled measurements. We impute measurements at any arbit…

    Submitted 1 June, 2023; originally announced June 2023.

    Comments: 7 pages, 3 figures

  11. arXiv:2305.19871

    cs.LG

    There is more to graphs than meets the eye: Learning universal features with self-supervision

    Authors: Laya Das, Sai Munikoti, Nrushad Joshi, Mahantesh Halappanavar

    Abstract: We study the problem of learning features through self-supervision that are generalisable to multiple graphs. State-of-the-art graph self-supervision restricts training to only one graph, resulting in graph-specific models that are incompatible with different but related graphs. We hypothesize that training with more than one graph that belong to the same family can improve the quality of the lear…

    Submitted 29 July, 2024; v1 submitted 31 May, 2023; originally announced May 2023.

    Comments: arXiv admin note: text overlap with arXiv:2302.11939, arXiv:2301.13287, arXiv:2305.12686, arXiv:2305.02299

  12. arXiv:2206.07922

    cs.LG

    Challenges and Opportunities in Deep Reinforcement Learning with Graph Neural Networks: A Comprehensive review of Algorithms and Applications

    Authors: Sai Munikoti, Deepesh Agarwal, Laya Das, Mahantesh Halappanavar, Balasubramaniam Natarajan

    Abstract: Deep reinforcement learning (DRL) has empowered a variety of artificial intelligence fields, including pattern recognition, robotics, recommendation-systems, and gaming. Similarly, graph neural networks (GNN) have also demonstrated their superior performance in supervised learning for graph-structured data. In recent times, the fusion of GNN with DRL for graph-structured environments has attracted…

    Submitted 7 November, 2022; v1 submitted 16 June, 2022; originally announced June 2022.

    Comments: 20 pages, 3 figures, 2 tables

  13. arXiv:2205.14834

    cs.LG cs.SI

    GraMeR: Graph Meta Reinforcement Learning for Multi-Objective Influence Maximization

    Authors: Sai Munikoti, Balasubramaniam Natarajan, Mahantesh Halappanavar

    Abstract: Influence maximization (IM) is a combinatorial problem of identifying a subset of nodes called the seed nodes in a network (graph), which when activated, provide a maximal spread of influence in the network for a given diffusion model and a budget for seed set size. IM has numerous applications such as viral marketing, epidemic control, sensor placement and other network-related tasks. However, th…

    Submitted 29 May, 2022; originally announced May 2022.

    Comments: 11 pages, 6 figures

  14. arXiv:2205.09968

    cs.LG stat.ML

    A General Framework for quantifying Aleatoric and Epistemic uncertainty in Graph Neural Networks

    Authors: Sai Munikoti, Deepesh Agarwal, Laya Das, Balasubramaniam Natarajan

    Abstract: Graph Neural Networks (GNN) provide a powerful framework that elegantly integrates Graph theory with Machine learning for modeling and analysis of networked data. We consider the problem of quantifying the uncertainty in predictions of GNN stemming from modeling errors and measurement uncertainty. We consider aleatoric uncertainty in the form of probabilistic links and noise in feature vector of n…

    Submitted 20 May, 2022; originally announced May 2022.

    Comments: 10 pages, 1 figure, 6 Tables

  15. Bayesian Graph Neural Network for Fast identification of critical nodes in Uncertain Complex Networks

    Authors: Sai Munikoti, Laya Das, Balasubramaniam Natarajan

    Abstract: In the quest to improve efficiency, interdependence and complexity are becoming defining characteristics of modern complex networks representing engineered and natural systems. Graph theory is a widely used framework for modeling such complex networks and to evaluate their robustness to disruptions. Particularly, identification of critical nodes/links in a graph can facilitate the enhancement of g…

    Submitted 17 May, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

    Comments: 6 pages, 2 figures, 3 Tables

  16. Scalable Graph Neural Network-based framework for identifying critical nodes and links in Complex Networks

    Authors: Sai Munikoti, Laya Das, Balasubramaniam Natarajan

    Abstract: Identifying critical nodes and links in graphs is a crucial task. These nodes/links typically represent critical elements/communication links that play a key role in a system's performance. However, a majority of the methods available in the literature on the identification of critical nodes/links are based on an iterative approach that explores each node/link of a graph at a time, repeating for a…

    Submitted 10 May, 2021; v1 submitted 26 December, 2020; originally announced December 2020.

    Comments: 29 pages, single column, 3 figures