-
Targeted Background Removal Creates Interpretable Feature Visualizations
Authors:
Ian E. Nielsen,
Erik Grundeland,
Joseph Snedeker,
Ghulam Rasool,
Ravi P. Ramachandran
Abstract:
Feature visualization is used to visualize learned features for black box machine learning models. Our approach explores an altered training process to improve interpretability of the visualizations. We argue that by using background removal techniques as a form of robust training, a network is forced to learn more human recognizable features, namely, by focusing on the main object of interest without any distractions from the background. Four different training methods were used to verify this hypothesis. The first used unmodified pictures. The second used a black background. The third utilized Gaussian noise as the background. The fourth approach employed a mix of background removed images and unmodified images. The feature visualization results show that the background removed images reveal a significant improvement over the baseline model. These new results displayed easily recognizable features from their respective classes, unlike the model trained on unmodified data.
Submitted 22 June, 2023;
originally announced June 2023.
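The four training variants named in the abstract above (unmodified, black background, Gaussian-noise background, and a mix) can be sketched roughly as follows. This is only an illustration: the function names, the grayscale list-of-lists image format, the boolean object mask, the noise parameters, and the mixing fraction are all assumptions made here, not details taken from the paper.

```python
import random

VALID_MODES = ("unmodified", "black", "noise")


def replace_background(image, mask, mode, rng=None, sigma=1.0):
    """Return a copy of `image` with background pixels replaced.

    image: 2-D list of floats (grayscale, for brevity; hypothetical format)
    mask:  2-D list of bools, True where a pixel belongs to the main object
    mode:  "unmodified", "black", or "noise"
    """
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "unmodified":
        return [row[:] for row in image]
    rng = rng or random.Random(0)
    result = []
    for image_row, mask_row in zip(image, mask):
        new_row = []
        for pixel, is_object in zip(image_row, mask_row):
            if is_object:
                new_row.append(pixel)  # keep the object untouched
            elif mode == "black":
                new_row.append(0.0)  # black background
            else:
                new_row.append(rng.gauss(0.0, sigma))  # Gaussian-noise background
        result.append(new_row)
    return result


def mixed_training_set(images, masks, removed_fraction=0.5, rng=None):
    """Mix background-removed and unmodified images.

    The fraction and the use of a black background for the removed
    images are assumptions for this sketch.
    """
    rng = rng or random.Random(0)
    mixed = []
    for image, mask in zip(images, masks):
        mode = "black" if rng.random() < removed_fraction else "unmodified"
        mixed.append(replace_background(image, mask, mode, rng=rng))
    return mixed
```

In practice the mask would come from a segmentation or background-removal tool; here it is simply passed in as an argument.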
-
EvalAttAI: A Holistic Approach to Evaluating Attribution Maps in Robust and Non-Robust Models
Authors:
Ian E. Nielsen,
Ravi P. Ramachandran,
Nidhal Bouaynaya,
Hassan M. Fathallah-Shaykh,
Ghulam Rasool
Abstract:
The expansion of explainable artificial intelligence as a field of research has generated numerous methods of visualizing and understanding the black box of a machine learning model. Attribution maps are generally used to highlight the parts of the input image that influence the model to make a specific decision. On the other hand, the robustness of machine learning models to natural noise and adversarial attacks is also being actively explored. This paper focuses on evaluating methods of attribution mapping to find whether robust neural networks are more explainable. We explore this problem within the application of classification for medical imaging. Explainability research is at an impasse. There are many methods of attribution mapping, but no current consensus on how to evaluate them and determine the ones that are the best. Our experiments on multiple datasets (natural and medical imaging) and various attribution methods reveal that two popular evaluation metrics, Deletion and Insertion, have inherent limitations and yield contradictory results. We propose a new explainability faithfulness metric (called EvalAttAI) that addresses the limitations of prior metrics. Using our novel evaluation, we found that Bayesian deep neural networks using the Variational Density Propagation technique were consistently more explainable when used with the best performing attribution method, the Vanilla Gradient. However, in general, various types of robust neural networks may not be more explainable, despite these models producing more visually plausible attribution maps.
Submitted 15 March, 2023;
originally announced March 2023.
-
Spelling convention sensitivity in neural language models
Authors:
Elizabeth Nielsen,
Christo Kirov,
Brian Roark
Abstract:
We examine whether large neural language models, trained on very large collections of varied English text, learn the potentially long-distance dependency of British versus American spelling conventions, i.e., whether spelling is consistently one or the other within model-generated strings. In contrast to long-distance dependencies in non-surface underlying structure (e.g., syntax), spelling consistency is easier to measure both in LMs and the text corpora used to train them, which can provide additional insight into certain observed model behaviors. Using a set of probe words unique to either British or American English, we first establish that training corpora exhibit substantial (though not total) consistency. A large T5 language model does appear to internalize this consistency, though only with respect to observed lexical items (not nonce words with British/American spelling patterns). We further experiment with correcting for biases in the training data by fine-tuning T5 on synthetic data that has been debiased, and find that finetuned T5 remains only somewhat sensitive to spelling consistency. Further experiments show GPT2 to be similarly limited.
Submitted 6 March, 2023;
originally announced March 2023.
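The probe-word idea in the abstract above (checking whether a string sticks to one spelling convention) might be measured along these lines. The tiny probe lists and the function names are hypothetical stand-ins chosen for illustration; the paper's actual probe set and scoring are not given in this listing.

```python
import re

# Hypothetical probe words; each spelling is assumed unique to one variety.
BRITISH_PROBES = {"colour", "favour", "centre"}
AMERICAN_PROBES = {"color", "favor", "center"}


def spelling_counts(text):
    """Count British and American probe words in `text` (case-insensitive)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    british = sum(token in BRITISH_PROBES for token in tokens)
    american = sum(token in AMERICAN_PROBES for token in tokens)
    return british, american


def is_consistent(text):
    """Return True if only one convention appears, False if both appear,
    and None if the text contains no probe words at all."""
    british, american = spelling_counts(text)
    if british + american == 0:
        return None
    return british == 0 or american == 0
```

Applied to each document in a corpus, or to each model-generated string, the share of consistent texts gives one simple consistency estimate.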
-
Prosodic features improve sentence segmentation and parsing
Authors:
Elizabeth Nielsen,
Sharon Goldwater,
Mark Steedman
Abstract:
Parsing spoken dialogue presents challenges that parsing text does not, including a lack of clear sentence boundaries. We know from previous work that prosody helps in parsing single sentences (Tran et al. 2018), but we want to show the effect of prosody on parsing speech that isn't segmented into sentences. In experiments on the English Switchboard corpus, we find prosody helps our model both with parsing and with accurately identifying sentence boundaries. However, we find that the best-performing parser is not necessarily the parser that produces the best sentence segmentation performance. We suggest that the best parses instead come from modelling sentence boundaries jointly with other constituent boundaries.
Submitted 23 February, 2023;
originally announced February 2023.
-
Finding smart contract vulnerabilities with ConCert's property-based testing framework
Authors:
Mikkel Milo,
Eske Hoy Nielsen,
Danil Annenkov,
Bas Spitters
Abstract:
We provide three detailed case studies of vulnerabilities in smart contracts, and show how property-based testing would have found them:
1. the Dexter1 token exchange;
2. the iToken;
3. the ICO of Brave's BAT token.
The last example is, in fact, new, and was missed in the auditing process. We have implemented this testing in ConCert, a general executable model/specification of smart contract execution in the Coq proof assistant. ConCert contracts can be used to generate verified smart contracts in Tezos' LIGO and Concordium's Rust language. We thus show the effectiveness of combining formal verification and property-based testing of smart contracts.
Submitted 1 August, 2022;
originally announced August 2022.
-
Zero-shot Cross-Linguistic Learning of Event Semantics
Authors:
Malihe Alikhani,
Thomas Kober,
Bashar Alhafni,
Yue Chen,
Mert Inan,
Elizabeth Nielsen,
Shahab Raji,
Mark Steedman,
Matthew Stone
Abstract:
Typologically diverse languages offer systems of lexical and grammatical aspect that allow speakers to focus on facets of event structure in ways that comport with the specific communicative setting and discourse constraints they face. In this paper, we look specifically at captions of images across Arabic, Chinese, Farsi, German, Russian, and Turkish and describe a computational model for predicting lexical aspects. Despite the heterogeneity of these languages, and the salient invocation of distinctive linguistic resources across their caption corpora, speakers of these languages show surprising similarities in the ways they frame image content. We leverage this observation for zero-shot cross-lingual learning and show that lexical aspects can be predicted for a given language despite not having observed any annotated data for this language at all.
Submitted 5 July, 2022;
originally announced July 2022.
-
Transformers in Time-series Analysis: A Tutorial
Authors:
Sabeen Ahmed,
Ian E. Nielsen,
Aakash Tripathi,
Shamoon Siddiqui,
Ghulam Rasool,
Ravi P. Ramachandran
Abstract:
The Transformer architecture has widespread applications, particularly in Natural Language Processing and computer vision. Recently, Transformers have been employed in various aspects of time-series analysis. This tutorial provides an overview of the Transformer architecture, its applications, and a collection of examples from recent research papers in time-series analysis. We delve into an explanation of the core components of the Transformer, including the self-attention mechanism, positional encoding, multi-head attention, and the encoder/decoder. Several enhancements to the initial Transformer architecture are highlighted to tackle time-series tasks. The tutorial also provides best practices and techniques to overcome the challenge of effectively training Transformers for time-series analysis.
Submitted 1 July, 2023; v1 submitted 28 April, 2022;
originally announced May 2022.
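Two of the core components listed in the tutorial abstract above, positional encoding and self-attention, can be sketched in plain Python. This follows the standard sinusoidal encoding and scaled dot-product attention as commonly described for the original Transformer; it is a single-head toy version with no learned projections, and the function names are chosen here for illustration rather than taken from the tutorial.

```python
import math


def positional_encoding(seq_len, d_model):
    """Sinusoidal positional encoding: sine on even dimensions,
    cosine on odd dimensions, with geometrically spaced wavelengths."""
    encoding = []
    for pos in range(seq_len):
        row = []
        for i in range(d_model):
            angle = pos / (10000 ** ((2 * (i // 2)) / d_model))
            row.append(math.sin(angle) if i % 2 == 0 else math.cos(angle))
        encoding.append(row)
    return encoding


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Single-head attention: softmax(Q K^T / sqrt(d_k)) V on lists of vectors."""
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs
```

For a time series, each time step's feature vector would typically be combined with its positional encoding before attention is applied, so that the order of the steps is visible to the model.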
-
Formalising Decentralised Exchanges in Coq
Authors:
Eske Hoy Nielsen,
Danil Annenkov,
Bas Spitters
Abstract:
The number of attacks and accidents leading to significant losses of crypto-assets is growing. According to Chainalysis, approximately $14 billion was lost in 2021 due to various incidents, and this number is dominated by Decentralized Finance (DeFi) applications. In order to address these issues, one can use a collection of tools ranging from auditing to formal methods. We use formal verification and provide the first formalisation of a DeFi contract in a foundational proof assistant capturing contract interactions. We focus on Dexter2, a decentralized, non-custodial exchange for the Tezos network similar to Uniswap on Ethereum. The Dexter implementation consists of several smart contracts. This poses unique challenges for formalisation due to the complex contract interactions. Our formalisation includes proofs of functional correctness with respect to an informal specification for the contracts involved in Dexter's implementation. Moreover, our formalisation is the first to feature proofs of safety properties of the interacting smart contracts of a decentralized exchange. We have extracted our contract from Coq into CameLIGO code, so it can be deployed on the Tezos blockchain. Uniswap and Dexter are paradigmatic for a collection of similar contracts. Our methodology thus allows us to implement and verify DeFi applications featuring similar interaction patterns.
Submitted 11 March, 2022;
originally announced March 2022.
-
Autonomous Rollator: A Case Study in the Agebots Project
Authors:
Jonas Frei,
Anina Havelka,
Markus Wüst,
Einar Nielsen,
Andreas Ziltener,
Katrin S. Lohan
Abstract:
In this paper, we present an iterative development process for a functional model of an autonomous, location-orienting rollator. An interdisciplinary team was involved in the development, working closely with the end-users. This example shows that the design thinking method is suitable for the development of frontier technology devices in the care sector.
Submitted 16 September, 2021; v1 submitted 31 August, 2021;
originally announced August 2021.
-
Robust Explainability: A Tutorial on Gradient-Based Attribution Methods for Deep Neural Networks
Authors:
Ian E. Nielsen,
Dimah Dera,
Ghulam Rasool,
Nidhal Bouaynaya,
Ravi P. Ramachandran
Abstract:
With the rise of deep neural networks, the challenge of explaining the predictions of these networks has become increasingly recognized. While many methods for explaining the decisions of deep neural networks exist, there is currently no consensus on how to evaluate them. On the other hand, robustness is a popular topic for deep learning research; however, it was hardly discussed in explainability research until very recently. In this tutorial paper, we start by presenting gradient-based interpretability methods. These techniques use gradient signals to assign the burden of the decision to the input features. Later, we discuss how gradient-based methods can be evaluated for their robustness and the role that adversarial robustness plays in having meaningful explanations. We also discuss the limitations of gradient-based methods. Finally, we present the best practices and attributes that should be examined before choosing an explainability method. We conclude with the future directions for research in the area at the convergence of robustness and explainability.
Submitted 13 January, 2022; v1 submitted 23 July, 2021;
originally announced July 2021.
-
Prosodic segmentation for parsing spoken dialogue
Authors:
Elizabeth Nielsen,
Mark Steedman,
Sharon Goldwater
Abstract:
Parsing spoken dialogue poses unique difficulties, including disfluencies and unmarked boundaries between sentence-like units. Previous work has shown that prosody can help with parsing disfluent speech (Tran et al. 2018), but has assumed that the input to the parser is already segmented into sentence-like units (SUs), which isn't true in existing speech applications. We investigate how prosody affects a parser that receives an entire dialogue turn as input (a turn-based model), instead of gold standard pre-segmented SUs (an SU-based model). In experiments on the English Switchboard corpus, we find that when using transcripts alone, the turn-based model has trouble segmenting SUs, leading to worse parse performance than the SU-based model. However, prosody can effectively replace gold standard SU boundaries: with prosody, the turn-based model performs as well as the SU-based model (90.79 vs. 90.65 F1 score, respectively), despite performing two tasks (SU segmentation and parsing) rather than one (parsing alone). Analysis shows that pitch and intensity features are the most important for this corpus, since they allow the model to correctly distinguish an SU boundary from a speech disfluency -- a distinction that the model otherwise struggles to make.
Submitted 12 October, 2021; v1 submitted 26 May, 2021;
originally announced May 2021.
-
The role of context in neural pitch accent detection in English
Authors:
Elizabeth Nielsen,
Mark Steedman,
Sharon Goldwater
Abstract:
Prosody is a rich information source in natural language, serving as a marker for phenomena such as contrast. In order to make this information available to downstream tasks, we need a way to detect prosodic events in speech. We propose a new model for pitch accent detection, inspired by the work of Stehwien et al. (2018), who presented a CNN-based model for this task. Our model makes greater use of context by using full utterances as input and adding an LSTM layer. We find that these innovations lead to an improvement from 87.5% to 88.7% accuracy on pitch accent detection on American English speech in the Boston University Radio News Corpus, a state-of-the-art result. We also find that a simple baseline that just predicts a pitch accent on every content word yields 82.2% accuracy, and we suggest that this is the appropriate baseline for this task. Finally, we conduct ablation tests that show pitch is the most important acoustic feature for this task and this corpus.
Submitted 12 October, 2020; v1 submitted 30 April, 2020;
originally announced April 2020.
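The simple baseline mentioned in the pitch accent abstract above (predict an accent on every content word) is easy to sketch. The tag set used to define "content word", the tuple input format, and the function names are assumptions for illustration only; the paper's exact definition is not given in this listing.

```python
# Hypothetical definition of a content word via coarse part-of-speech tags.
CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}


def content_word_baseline(tagged_words):
    """Predict a pitch accent (True) on every content word.

    tagged_words: list of (word, pos_tag) pairs
    """
    return [tag in CONTENT_TAGS for _word, tag in tagged_words]


def accuracy(predicted, gold):
    """Fraction of words whose predicted accent label matches the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)
```

Because this baseline uses no acoustic information at all, comparing a model against it shows how much of the accuracy comes from the audio rather than from word class alone.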
-
TensorFlow.js: Machine Learning for the Web and Beyond
Authors:
Daniel Smilkov,
Nikhil Thorat,
Yannick Assogba,
Ann Yuan,
Nick Kreeger,
Ping Yu,
Kangyi Zhang,
Shanqing Cai,
Eric Nielsen,
David Soergel,
Stan Bileschi,
Michael Terry,
Charles Nicholson,
Sandeep N. Gupta,
Sarah Sirajuddin,
D. Sculley,
Rajat Monga,
Greg Corrado,
Fernanda B. Viégas,
Martin Wattenberg
Abstract:
TensorFlow.js is a library for building and executing machine learning algorithms in JavaScript. TensorFlow.js models run in a web browser and in the Node.js environment. The library is part of the TensorFlow ecosystem, providing a set of APIs that are compatible with those in Python, allowing models to be ported between the Python and JavaScript ecosystems. TensorFlow.js has empowered a new set of developers from the extensive JavaScript community to build and deploy machine learning models and enabled new classes of on-device computation. This paper describes the design, API, and implementation of TensorFlow.js, and highlights some of the impactful use cases.
Submitted 27 February, 2019; v1 submitted 16 January, 2019;
originally announced January 2019.