Search | arXiv e-print repository

Error-margin Analysis for Hidden Neuron Activation Labels

Authors: Abhilekha Dalal, Rushrukh Rayan, Pascal Hitzler

Abstract: Understanding how high-level concepts are represented within artificial neural networks is a fundamental challenge in the field of artificial intelligence. While existing literature in explainable AI emphasizes the importance of labeling neurons with concepts to understand their functioning, it mostly focuses on identifying what stimulus activates a neuron in most cases; this corresponds to the notion of recall in information retrieval. We argue that this is only the first part of a two-part job: it is imperative to also investigate neuron responses to other stimuli, i.e., their precision. We call this the neuron labels error margin.

Submitted 14 May, 2024; originally announced May 2024.

arXiv:2405.03417 [pdf, other]

doi 10.1109/ACCESS.2024.3408318

Gaussian Splatting: 3D Reconstruction and Novel View Synthesis, a Review

Authors: Anurag Dalal, Daniel Hagen, Kjell G. Robbersmyr, Kristian Muri Knausgård

Abstract: Image-based 3D reconstruction is a challenging task that involves inferring the 3D shape of an object or scene from a set of input images. Learning-based methods have gained attention for their ability to directly estimate 3D shapes. This review paper focuses on state-of-the-art techniques for 3D reconstruction, including the generation of novel, unseen views. An overview of recent developments in the Gaussian Splatting method is provided, covering input types, model structures, output representations, and training strategies. Unresolved challenges and future directions are also discussed. Given the rapid progress in this domain and the numerous opportunities for enhancing 3D reconstruction methods, a comprehensive examination of algorithms appears essential. Consequently, this study offers a thorough overview of the latest advancements in Gaussian Splatting.

Submitted 6 May, 2024; originally announced May 2024.

Comments: 24 pages

ACM Class: I.2.10; I.3.6; I.3.7; I.3.8; I.4.5; I.4.8; I.4.10

arXiv:2404.13567 [pdf, other]

On the Value of Labeled Data and Symbolic Methods for Hidden Neuron Activation Analysis

Authors: Abhilekha Dalal, Rushrukh Rayan, Adrita Barua, Eugene Y. Vasserman, Md Kamruzzaman Sarker, Pascal Hitzler

Abstract: A major challenge in Explainable AI is in correctly interpreting activations of hidden neurons: accurate interpretations would help answer the question of what a deep learning system internally detects as relevant in the input, demystifying the otherwise black-box nature of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be interpretable in a way that makes sense to humans, but systematic automated methods that would be able to hypothesize and verify interpretations of hidden neuron activations are underexplored. This is particularly the case for approaches that can both draw explanations from substantial background knowledge, and that are based on inherently explainable (symbolic) methods. In this paper, we introduce a novel model-agnostic post-hoc Explainable AI method demonstrating that it provides meaningful interpretations. Our approach is based on using a Wikipedia-derived concept hierarchy with approximately 2 million classes as background knowledge, and utilizes OWL-reasoning-based Concept Induction for explanation generation. Additionally, we explore and compare the capabilities of off-the-shelf pre-trained multimodal-based explainable methods. Our results indicate that our approach can automatically attach meaningful class expressions as explanations to individual neurons in the dense layer of a Convolutional Neural Network. Evaluation through statistical analysis and degree of concept activation in the hidden layer show that our method provides a competitive edge in both quantitative and qualitative aspects compared to prior work.

Submitted 21 April, 2024; originally announced April 2024.

arXiv:2401.16596 [pdf, other]

PrIsing: Privacy-Preserving Peer Effect Estimation via Ising Model

Authors: Abhinav Chakraborty, Anirban Chatterjee, Abhinandan Dalal

Abstract: The Ising model, originally developed as a spin-glass model for ferromagnetic elements, has gained popularity as a network-based model for capturing dependencies in agents' outputs. Its increasing adoption in healthcare and the social sciences has raised privacy concerns regarding the confidentiality of agents' responses. In this paper, we present a novel $(\varepsilon,\delta)$-differentially private algorithm specifically designed to protect the privacy of individual agents' outcomes. Our algorithm allows for precise estimation of the natural parameter using a single network through an objective perturbation technique. Furthermore, we establish regret bounds for this algorithm and assess its performance on synthetic datasets and two real-world networks: one involving HIV status in a social network and the other concerning the political leaning of online blogs.

Submitted 29 January, 2024; originally announced January 2024.

Comments: To Appear in AISTATS 2024

arXiv:2308.03999 [pdf, other]

Understanding CNN Hidden Neuron Activations Using Structured Background Knowledge and Deductive Reasoning

Authors: Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Eugene Vasserman, Pascal Hitzler

Abstract: A major challenge in Explainable AI is in correctly interpreting activations of hidden neurons: accurate interpretations would provide insights into the question of what a deep learning system has internally detected as relevant on the input, demystifying the otherwise black-box character of deep learning systems. The state of the art indicates that hidden node activations can, in some cases, be interpretable in a way that makes sense to humans, but systematic automated methods that would be able to hypothesize and verify interpretations of hidden neuron activations are underexplored. In this paper, we provide such a method and demonstrate that it provides meaningful interpretations. Our approach is based on using large-scale background knowledge (approximately 2 million classes curated from the Wikipedia concept hierarchy) together with a symbolic reasoning approach called Concept Induction, based on description logics and originally developed for applications in the Semantic Web field. Our results show that we can automatically attach meaningful labels from the background knowledge to individual neurons in the dense layer of a Convolutional Neural Network through a hypothesis and verification process.

Submitted 9 August, 2023; v1 submitted 7 August, 2023; originally announced August 2023.

arXiv:2303.01229 [pdf, other]

Almanac: Retrieval-Augmented Language Models for Clinical Medicine

Authors: Cyril Zakka, Akash Chaurasia, Rohan Shad, Alex R. Dalal, Jennifer L. Kim, Michael Moor, Kevin Alexander, Euan Ashley, Jack Boyd, Kathleen Boyd, Karen Hirsch, Curt Langlotz, Joanna Nelson, William Hiesinger

Abstract: Large language models have recently demonstrated impressive zero-shot capabilities in a variety of natural language tasks such as summarization, dialogue generation, and question-answering. Despite many promising applications in clinical medicine, adoption of these models in real-world settings has been largely limited by their tendency to generate incorrect and sometimes even toxic statements. In this study, we develop Almanac, a large language model framework augmented with retrieval capabilities for medical guideline and treatment recommendations. Performance on a novel dataset of clinical scenarios (n = 130) evaluated by a panel of 5 board-certified and resident physicians demonstrates significant increases in factuality (mean of 18% at p-value < 0.05) across all specialties, with improvements in completeness and safety. Our results demonstrate the potential for large language models to be effective tools in the clinical decision-making process, while also emphasizing the importance of careful testing and deployment to mitigate their shortcomings.

Submitted 31 May, 2023; v1 submitted 28 February, 2023; originally announced March 2023.

arXiv:2301.09611 [pdf, other]

Explaining Deep Learning Hidden Neuron Activations using Concept Induction

Authors: Abhilekha Dalal, Md Kamruzzaman Sarker, Adrita Barua, Pascal Hitzler

Abstract: One of the current key challenges in Explainable AI is in correctly interpreting activations of hidden neurons. It seems evident that accurate interpretations thereof would provide insights into the question of what a deep learning system has internally detected as relevant on the input, thus lifting some of the black box character of deep learning systems. The state of the art on this front indicates that hidden node activations appear to be interpretable in a way that makes sense to humans, at least in some cases. Yet, systematic automated methods that would be able to first hypothesize an interpretation of hidden neuron activations, and then verify it, are mostly missing. In this paper, we provide such a method and demonstrate that it provides meaningful interpretations. It is based on using large-scale background knowledge (a class hierarchy of approx. 2 million classes curated from the Wikipedia Concept Hierarchy) together with a symbolic reasoning approach called concept induction, based on description logics, that was originally developed for applications in the Semantic Web field. Our results show that we can automatically attach meaningful labels from the background knowledge to individual neurons in the dense layer of a Convolutional Neural Network through a hypothesis and verification process.

Submitted 23 January, 2023; originally announced January 2023.

Comments: Submitted to IJCAI-23

arXiv:2111.09304 [pdf, other]

Quantum-Assisted Support Vector Regression for Detecting Facial Landmarks

Authors: Archismita Dalal, Mohsen Bagherimehrab, Barry C. Sanders

Abstract: The classical machine-learning model for support vector regression (SVR) is widely used for regression tasks, including weather prediction, stock-market and real-estate pricing. However, a practically realisable quantum version for SVR remains to be formulated. We devise annealing-based algorithms, namely simulated and quantum-classical hybrid, for training two SVR models, and compare their empirical performances against the SVR implementation of Python's scikit-learn package and the SVR-based state-of-the-art algorithm for the facial landmark detection (FLD) problem. Our method is to derive a quadratic-unconstrained-binary formulation for the optimisation problem used for training an SVR model and solve this problem using annealing. Using D-Wave's Hybrid Solver, we construct a quantum-assisted SVR model, thereby demonstrating a slight advantage over classical models regarding landmark-detection accuracy. Furthermore, we observe that annealing-based SVR models predict landmarks with lower variances compared to the SVR models trained by greedy optimisation procedures. Our work is a proof-of-concept example for applying quantum-assisted SVR to a supervised learning task with a small training dataset.

Submitted 17 November, 2021; originally announced November 2021.

Comments: 20 pages, 6 figures

arXiv:2003.14310 [pdf, other]

Accelerography: Feasibility of Gesture Typing using Accelerometer

Authors: Arindam Roy Chowdhury, Abhinandan Dalal, Shubhajit Sen

Abstract: In this paper, we aim to look into the feasibility of constructing alphabets using gestures. The main idea is to construct gestures that are easy to remember, not cumbersome to reproduce, and easily identifiable. We construct gestures for the entire English alphabet and provide an algorithm to identify the gestures, even when they are constructed continuously. We tackle the problem statistically, taking into account the problem of randomness in the hand movement gestures of users, and achieve an average accuracy of 97.33% with the entire English alphabet.

Submitted 29 March, 2020; originally announced March 2020.

arXiv:1507.02356 [pdf, other]

Intrinsic Non-stationary Covariance Function for Climate Modeling

Authors: Chintan A. Dalal, Vladimir Pavlovic, Robert E. Kopp

Abstract: Designing a covariance function that represents the underlying correlation is a crucial step in modeling complex natural systems, such as climate models. Geospatial datasets at a global scale usually suffer from non-stationarity and non-uniformly smooth spatial boundaries. A Gaussian process regression using a non-stationary covariance function has shown promise for this task, as this covariance function adapts to the variable correlation structure of the underlying distribution. In this paper, we generalize the non-stationary covariance function to address the aforementioned global scale geospatial issues. We define this generalized covariance function as an intrinsic non-stationary covariance function, because it uses intrinsic statistics of the symmetric positive definite matrices to represent the characteristic length scale and, thereby, models the local stochastic process. Experiments on a synthetic and real dataset of relative sea level changes across the world demonstrate improvements in the error metrics for the regression estimates using our newly proposed approach.

Submitted 8 July, 2015; originally announced July 2015.

Comments: 9 pages, 3 figures

Showing 1–10 of 10 results for author: Dalal, A