Showing 1–18 of 18 results for author: Barua, A

  1. arXiv:2404.13567  [pdf, other]

    cs.AI

    On the Value of Labeled Data and Symbolic Methods for Hidden Neuron Activation Analysis

    Authors: Abhilekha Dalal, Rushrukh Rayan, Adrita Barua, Eugene Y. Vasserman, Md Kamruzzaman Sarker, Pascal Hitzler

    Abstract: A major challenge in Explainable AI is in correctly interpreting activations of hidden neurons: accurate interpretations would help answer the question of what a deep learning system internally detects as relevant in the input, demystifying the otherwise black-box nature of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be interpretable in a…

    Submitted 21 April, 2024; originally announced April 2024.

  2. arXiv:2404.11875  [pdf, other]

    cs.AI

    Concept Induction using LLMs: a user experiment for assessment

    Authors: Adrita Barua, Cara Widmer, Pascal Hitzler

    Abstract: Explainable Artificial Intelligence (XAI) poses a significant challenge in providing transparent and understandable insights into complex AI models. Traditional post-hoc algorithms, while useful, often struggle to deliver interpretable explanations. Concept-based models offer a promising avenue by incorporating explicit representations of concepts to enhance interpretability. However, existing res…

    Submitted 17 April, 2024; originally announced April 2024.

  3. arXiv:2403.08295  [pdf, other]

    cs.CL cs.AI

    Gemma: Open Models Based on Gemini Research and Technology

    Authors: Gemma Team, Thomas Mesnard, Cassidy Hardin, Robert Dadashi, Surya Bhupatiraju, Shreya Pathak, Laurent Sifre, Morgane Rivière, Mihir Sanjay Kale, Juliette Love, Pouya Tafti, Léonard Hussenot, Pier Giuseppe Sessa, Aakanksha Chowdhery, Adam Roberts, Aditya Barua, Alex Botev, Alex Castro-Ros, Ambrose Slone, Amélie Héliou, Andrea Tacchetti, Anna Bulanova, Antonia Paterson, Beth Tsai, Bobak Shahriari, et al. (83 additional authors not shown)

    Abstract: This work introduces Gemma, a family of lightweight, state-of-the-art open models built from the research and technology used to create Gemini models. Gemma models demonstrate strong performance across academic benchmarks for language understanding, reasoning, and safety. We release two sizes of models (2 billion and 7 billion parameters), and provide both pretrained and fine-tuned checkpoints. Ge…

    Submitted 16 April, 2024; v1 submitted 13 March, 2024; originally announced March 2024.

  4. arXiv:2403.05530  [pdf, other]

    cs.CL cs.AI

    Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

    Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love, et al. (1092 additional authors not shown)

    Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…

    Submitted 14 June, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

  5. arXiv:2402.01777  [pdf]

    cs.CL cs.AI cs.HC

    On the Psychology of GPT-4: Moderately anxious, slightly masculine, honest, and humble

    Authors: Adrita Barua, Gary Brase, Ke Dong, Pascal Hitzler, Eugene Vasserman

    Abstract: We subject GPT-4 to a number of rigorous psychometric tests and analyze the results. We find that, compared to the average human, GPT-4 tends to show more honesty and humility, and less machiavellianism and narcissism. It sometimes exhibits ambivalent sexism, leans slightly toward masculinity, is moderately anxious but mostly not depressive (but not always). It shows human-average numerical litera…

    Submitted 1 February, 2024; originally announced February 2024.

    Comments: 16 pages, 8 tables, 1 code repository

  6. arXiv:2401.17181  [pdf, other]

    cs.CL

    Transfer Learning for Text Diffusion Models

    Authors: Kehang Han, Kathleen Kenealy, Aditya Barua, Noah Fiedel, Noah Constant

    Abstract: In this report, we explore the potential for text diffusion to replace autoregressive (AR) decoding for the training and deployment of large language models (LLMs). We are particularly interested to see whether pretrained AR models can be transformed into text diffusion models through a lightweight adaptation procedure we call "AR2Diff". We begin by establishing a strong baseline setup for train…

    Submitted 30 January, 2024; originally announced January 2024.

  7. arXiv:2312.11805  [pdf, other]

    cs.CL cs.AI cs.CV

    Gemini: A Family of Highly Capable Multimodal Models

    Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee, et al. (1325 additional authors not shown)

    Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…

    Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

  8. arXiv:2308.03999  [pdf, other]

    cs.LG cs.AI cs.CV

    Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning

    Authors: Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Eugene Vasserman, Pascal Hitzler

    Abstract: A major challenge in Explainable AI is in correctly interpreting activations of hidden neurons: accurate interpretations would provide insights into the question of what a deep learning system has internally detected as relevant on the input, demystifying the otherwise black-box character of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be i…

    Submitted 9 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

  9. arXiv:2301.09611  [pdf, other]

    cs.LG

    Explaining Deep Learning Hidden Neuron Activations using Concept Induction

    Authors: Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Pascal Hitzler

    Abstract: One of the current key challenges in Explainable AI is in correctly interpreting activations of hidden neurons. It seems evident that accurate interpretations thereof would provide insights into the question of what a deep learning system has internally detected as relevant on the input, thus lifting some of the black-box character of deep learning systems. The state of the art on this front…

    Submitted 23 January, 2023; originally announced January 2023.

    Comments: Submitted to IJCAI-23

  10. arXiv:2210.03719  [pdf, other]

    cs.CR

    BayesImposter: Bayesian Estimation Based .bss Imposter Attack on Industrial Control Systems

    Authors: Anomadarshi Barua, Lelin Pan, Mohammad Abdullah Al Faruque

    Abstract: Over the last six years, several papers used memory deduplication to trigger various security issues, such as leaking heap-address and causing bit-flip in the physical memory. The most essential requirement for successful memory deduplication is to provide identical copies of a physical page. Recent works use a brute-force approach to create identical copies of a physical page that is an inaccurat…

    Submitted 7 October, 2022; originally announced October 2022.

  11. arXiv:2210.03688  [pdf, other]

    cs.CR

    A Wolf in Sheep's Clothing: Spreading Deadly Pathogens Under the Disguise of Popular Music

    Authors: Anomadarshi Barua, Yonatan Gizachew Achamyeleh, Mohammad Abdullah Al Faruque

    Abstract: A Negative Pressure Room (NPR) is an essential requirement by the Bio-Safety Levels (BSLs) in biolabs or infectious-control hospitals to prevent deadly pathogens from being leaked from the facility. An NPR maintains a negative pressure inside with respect to the outside reference space so that microbes are contained inside of an NPR. Nowadays, differential pressure sensors (DPSs) are utilized by t…

    Submitted 7 October, 2022; originally announced October 2022.

  12. arXiv:2208.09741  [pdf, other]

    cs.CR

    Sensor Security: Current Progress, Research Challenges, and Future Roadmap

    Authors: Anomadarshi Barua, Mohammad Abdullah Al Faruque

    Abstract: Sensors are one of the most pervasive and integral components of today's safety-critical systems. Sensors serve as a bridge between physical quantities and connected systems. The connected systems with sensors blindly believe the sensor as there is no way to authenticate the signal coming from a sensor. This could be an entry point for an attacker. An attacker can inject a fake input signal along…

    Submitted 20 August, 2022; originally announced August 2022.

  13. arXiv:2205.12647  [pdf, other]

    cs.CL

    Overcoming Catastrophic Forgetting in Zero-Shot Cross-Lingual Generation

    Authors: Tu Vu, Aditya Barua, Brian Lester, Daniel Cer, Mohit Iyyer, Noah Constant

    Abstract: In this paper, we explore the challenging problem of performing a generative task in a target language when labeled data is only available in English, using summarization as a case study. We assume a strict setting with no access to parallel data or machine translation and find that common transfer learning approaches struggle in this setting, as a generative multilingual model fine-tuned purely o…

    Submitted 23 October, 2022; v1 submitted 25 May, 2022; originally announced May 2022.

    Comments: Accepted as a main conference paper at EMNLP 2022, 22 pages, 8 figures, 11 tables

  14. arXiv:2105.13626  [pdf, other]

    cs.CL

    ByT5: Towards a token-free future with pre-trained byte-to-byte models

    Authors: Linting Xue, Aditya Barua, Noah Constant, Rami Al-Rfou, Sharan Narang, Mihir Kale, Adam Roberts, Colin Raffel

    Abstract: Most widely-used pre-trained language models operate on sequences of tokens corresponding to word or subword units. By comparison, token-free models that operate directly on raw text (bytes or characters) have many benefits: they can process text in any language out of the box, they are more robust to noise, and they minimize technical debt by removing complex and error-prone text preprocessing pi…

    Submitted 7 March, 2022; v1 submitted 28 May, 2021; originally announced May 2021.

    Comments: To be published in TACL 2022

  15. arXiv:2010.11934  [pdf, other]

    cs.CL

    mT5: A massively multilingual pre-trained text-to-text transformer

    Authors: Linting Xue, Noah Constant, Adam Roberts, Mihir Kale, Rami Al-Rfou, Aditya Siddhant, Aditya Barua, Colin Raffel

    Abstract: The recent "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on a wide variety of English-language NLP tasks. In this paper, we introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages. We detail the design and modified training of mT5 and demonstrate its s…

    Submitted 11 March, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

  16. arXiv:2004.05484  [pdf, other]

    cs.CL cs.LG

    LAReQA: Language-agnostic answer retrieval from a multilingual pool

    Authors: Uma Roy, Noah Constant, Rami Al-Rfou, Aditya Barua, Aaron Phillips, Yinfei Yang

    Abstract: We present LAReQA, a challenging new benchmark for language-agnostic answer retrieval from a multilingual candidate pool. Unlike previous cross-lingual tasks, LAReQA tests for "strong" cross-lingual alignment, requiring semantically related cross-language pairs to be closer in representation space than unrelated same-language pairs. Building on multilingual BERT (mBERT), we study different strateg…

    Submitted 11 April, 2020; originally announced April 2020.

  17. arXiv:1405.1397  [pdf]

    cs.AI

    Analysis Tool for UNL-Based Knowledge Representation

    Authors: Shamim Ripon, Aoyan Barua, Mohammad Salah Uddin

    Abstract: The fundamental issue in knowledge representation is to provide a precise definition of the knowledge that they possess in a manner that is independent of procedural considerations, context free and easy to manipulate, exchange and reason about. Knowledge must be accessible to everyone regardless of their native languages. Universal Networking Language (UNL) is a declarative formal language and a…

    Submitted 4 May, 2014; originally announced May 2014.

    Comments: 8 pages, 5 figures. arXiv admin note: text overlap with arXiv:cs/0404030 by other authors

    Journal ref: Journal of Advanced Computer Science and Technology Research (JACSTR) Vol. 2, No. 4, pp. 176-183, 2012

  18. Web Service Composition - BPEL vs cCSP Process Algebra

    Authors: Shamim Ripon, Mohammad Salah Uddin, Aoyan Barua

    Abstract: Web services technology provides a platform on which we can develop distributed services. The interoperability among these services is achieved by various standard protocols. In recent years, several studies have suggested that process algebras provide satisfactory assistance to the whole process of web services development. Business transactions, on the other hand, involve the coordination and in…

    Submitted 23 February, 2014; originally announced February 2014.

    Comments: 6 pages, 4 figures, Advanced Computer Science Applications and Technologies (ACSAT), 2012 International Conference on