Search | arXiv e-print repository

Apple Intelligence Foundation Language Models

Authors: Tom Gunter, Zirui Wang, Chong Wang, Ruoming Pang, Andy Narayanan, Aonan Zhang, Bowen Zhang, Chen Chen, Chung-Cheng Chiu, David Qiu, Deepak Gopinath, Dian Ang Yap, Dong Yin, Feng Nan, Floris Weers, Guoli Yin, Haoshuo Huang, Jianyu Wang, Jiarui Lu, John Peebles, Ke Ye, Mark Lee, Nan Du, Qibin Chen, Quentin Keunebroek , et al. (130 additional authors not shown)

Abstract: We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used… ▽ More We present foundation language models developed to power Apple Intelligence features, including a ~3 billion parameter model designed to run efficiently on devices and a large server-based language model designed for Private Cloud Compute. These models are designed to perform a wide range of tasks efficiently, accurately, and responsibly. This report describes the model architecture, the data used to train the model, the training process, how the models are optimized for inference, and the evaluation results. We highlight our focus on Responsible AI and how the principles are applied throughout the model development. △ Less

Submitted 29 July, 2024; originally announced July 2024.

arXiv:2407.08730 [pdf, other]

Evaluating Deep Neural Networks in Deployment (A Comparative and Replicability Study)

Authors: Eduard Pinconschi, Divya Gopinath, Rui Abreu, Corina S. Pasareanu

Abstract: As deep neural networks (DNNs) are increasingly used in safety-critical applications, there is a growing concern for their reliability. Even highly trained, high-performant networks are not 100% accurate. However, it is very difficult to predict their behavior during deployment without ground truth. In this paper, we provide a comparative and replicability study on recent approaches that have been… ▽ More As deep neural networks (DNNs) are increasingly used in safety-critical applications, there is a growing concern for their reliability. Even highly trained, high-performant networks are not 100% accurate. However, it is very difficult to predict their behavior during deployment without ground truth. In this paper, we provide a comparative and replicability study on recent approaches that have been proposed to evaluate the reliability of DNNs in deployment. We find that it is hard to run and reproduce the results for these approaches on their replication packages and even more difficult to run them on artifacts other than their own. Further, it is difficult to compare the effectiveness of the approaches, due to the lack of clearly defined evaluation metrics. Our results indicate that more effort is needed in our research community to obtain sound techniques for evaluating the reliability of neural networks in safety-critical domains. To this end, we contribute an evaluation framework that incorporates the considered approaches and enables evaluation on common benchmarks, using common metrics. △ Less

Submitted 27 July, 2024; v1 submitted 11 July, 2024; originally announced July 2024.

arXiv:2406.09810 [pdf, other]

Think Deep and Fast: Learning Neural Nonlinear Opinion Dynamics from Inverse Dynamic Games for Split-Second Interactions

Authors: Haimin Hu, Jonathan DeCastro, Deepak Gopinath, Guy Rosman, Naomi Ehrich Leonard, Jaime Fernández Fisac

Abstract: Non-cooperative interactions commonly occur in multi-agent scenarios such as car racing, where an ego vehicle can choose to overtake the rival, or stay behind it until a safe overtaking "corridor" opens. While an expert human can do well at making such time-sensitive decisions, the development of safe and efficient game-theoretic trajectory planners capable of rapidly reasoning discrete options is… ▽ More Non-cooperative interactions commonly occur in multi-agent scenarios such as car racing, where an ego vehicle can choose to overtake the rival, or stay behind it until a safe overtaking "corridor" opens. While an expert human can do well at making such time-sensitive decisions, the development of safe and efficient game-theoretic trajectory planners capable of rapidly reasoning discrete options is yet to be fully addressed. The recently developed nonlinear opinion dynamics (NOD) show promise in enabling fast opinion formation and avoiding safety-critical deadlocks. However, it remains an open challenge to determine the model parameters of NOD automatically and adaptively, accounting for the ever-changing environment of interaction. In this work, we propose for the first time a learning-based, game-theoretic approach to synthesize a Neural NOD model from expert demonstrations, given as a dataset containing (possibly incomplete) state and action trajectories of interacting agents. The learned NOD can be used by existing dynamic game solvers to plan decisively while accounting for the predicted change of other agents' intents, thus enabling situational awareness in planning. We demonstrate Neural NOD's ability to make fast and robust decisions in a simulated autonomous racing example, leading to tangible improvements in safety and overtaking performance over state-of-the-art data-driven game-theoretic planning methods. △ Less

Submitted 14 June, 2024; originally announced June 2024.

arXiv:2403.19837 [pdf, other]

Concept-based Analysis of Neural Networks via Vision-Language Models

Authors: Ravi Mangal, Nina Narodytska, Divya Gopinath, Boyue Caroline Hu, Anirban Roy, Susmit Jha, Corina Pasareanu

Abstract: The analysis of vision-based deep neural networks (DNNs) is highly desirable but it is very challenging due to the difficulty of expressing formal specifications for vision tasks and the lack of efficient verification procedures. In this paper, we propose to leverage emerging multimodal, vision-language, foundation models (VLMs) as a lens through which we can reason about vision models. VLMs have… ▽ More The analysis of vision-based deep neural networks (DNNs) is highly desirable but it is very challenging due to the difficulty of expressing formal specifications for vision tasks and the lack of efficient verification procedures. In this paper, we propose to leverage emerging multimodal, vision-language, foundation models (VLMs) as a lens through which we can reason about vision models. VLMs have been trained on a large body of images accompanied by their textual description, and are thus implicitly aware of high-level, human-understandable concepts describing the images. We describe a logical specification language $\texttt{Con}_{\texttt{spec}}$ designed to facilitate writing specifications in terms of these concepts. To define and formally check $\texttt{Con}_{\texttt{spec}}$ specifications, we build a map between the internal representations of a given vision model and a VLM, leading to an efficient verification procedure of natural-language properties for vision models. We demonstrate our techniques on a ResNet-based classifier trained on the RIVAL-10 dataset using CLIP as the multimodal model. △ Less

Submitted 10 April, 2024; v1 submitted 28 March, 2024; originally announced March 2024.

arXiv:2402.14174 [pdf, other]

Blending Data-Driven Priors in Dynamic Games

Authors: Justin Lidard, Haimin Hu, Asher Hancock, Zixu Zhang, Albert Gimó Contreras, Vikash Modi, Jonathan DeCastro, Deepak Gopinath, Guy Rosman, Naomi Ehrich Leonard, María Santos, Jaime Fernández Fisac

Abstract: As intelligent robots like autonomous vehicles become increasingly deployed in the presence of people, the extent to which these systems should leverage model-based game-theoretic planners versus data-driven policies for safe, interaction-aware motion planning remains an open question. Existing dynamic game formulations assume all agents are task-driven and behave optimally. However, in reality, h… ▽ More As intelligent robots like autonomous vehicles become increasingly deployed in the presence of people, the extent to which these systems should leverage model-based game-theoretic planners versus data-driven policies for safe, interaction-aware motion planning remains an open question. Existing dynamic game formulations assume all agents are task-driven and behave optimally. However, in reality, humans tend to deviate from the decisions prescribed by these models, and their behavior is better approximated under a noisy-rational paradigm. In this work, we investigate a principled methodology to blend a data-driven reference policy with an optimization-based game-theoretic policy. We formulate KLGame, an algorithm for solving non-cooperative dynamic game with Kullback-Leibler (KL) regularization with respect to a general, stochastic, and possibly multi-modal reference policy. Our method incorporates, for each decision maker, a tunable parameter that permits modulation between task-driven and data-driven behaviors. We propose an efficient algorithm for computing multi-modal approximate feedback Nash equilibrium strategies of KLGame in real time. Through a series of simulated and real-world autonomous driving scenarios, we demonstrate that KLGame policies can more effectively incorporate guidance from the reference policy and account for noisily-rational human behaviors versus non-regularized baselines. Website with additional information, videos, and code: https://kl-games.github.io/. △ Less

Submitted 6 July, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: 20 pages, 12 figures

arXiv:2402.05893 [pdf, other]

Personalizing Driver Safety Interfaces via Driver Cognitive Factors Inference

Authors: Emily S Sumner, Jonathan DeCastro, Jean Costa, Deepak E Gopinath, Everlyne Kimani, Shabnam Hakimi, Allison Morgan, Andrew Best, Hieu Nguyen, Daniel J Brooks, Bassam ul Haq, Andrew Patrikalakis, Hiroshi Yasuda, Kate Sieck, Avinash Balachandran, Tiffany Chen, Guy Rosman

Abstract: Recent advances in AI and intelligent vehicle technology hold promise to revolutionize mobility and transportation, in the form of advanced driving assistance (ADAS) interfaces. Although it is widely recognized that certain cognitive factors, such as impulsivity and inhibitory control, are related to risky driving behavior, play a significant role in on-road risk-taking, existing systems fail to l… ▽ More Recent advances in AI and intelligent vehicle technology hold promise to revolutionize mobility and transportation, in the form of advanced driving assistance (ADAS) interfaces. Although it is widely recognized that certain cognitive factors, such as impulsivity and inhibitory control, are related to risky driving behavior, play a significant role in on-road risk-taking, existing systems fail to leverage such factors. Varying levels of these cognitive factors could influence the effectiveness and acceptance of driver safety interfaces. We demonstrate an approach for personalizing driver interaction via driver safety interfaces that are triggered based on a learned recurrent neural network. The network is trained from a population of human drivers to infer impulsivity and inhibitory control from recent driving behavior. Using a high-fidelity vehicle motion simulator, we demonstrate the ability to deduce these factors from driver behavior. We then use these inferred factors to make instantaneous determinations on whether or not to engage a driver safety interface. This interface aims to decrease a driver's speed during yellow lights and reduce their inclination to run through them. △ Less

Submitted 8 February, 2024; originally announced February 2024.

Comments: 12 pages, 7 figures

arXiv:2305.20041 [pdf, other]

Simulation and Retargeting of Complex Multi-Character Interactions

Authors: Yunbo Zhang, Deepak Gopinath, Yuting Ye, Jessica Hodgins, Greg Turk, Jungdam Won

Abstract: We present a method for reproducing complex multi-character interactions for physically simulated humanoid characters using deep reinforcement learning. Our method learns control policies for characters that imitate not only individual motions, but also the interactions between characters, while maintaining balance and matching the complexity of reference data. Our approach uses a novel reward for… ▽ More We present a method for reproducing complex multi-character interactions for physically simulated humanoid characters using deep reinforcement learning. Our method learns control policies for characters that imitate not only individual motions, but also the interactions between characters, while maintaining balance and matching the complexity of reference data. Our approach uses a novel reward formulation based on an interaction graph that measures distances between pairs of interaction landmarks. This reward encourages control policies to efficiently imitate the character's motion while preserving the spatial relationships of the interactions in the reference motion. We evaluate our method on a variety of activities, from simple interactions such as a high-five greeting to more complex interactions such as gymnastic exercises, Salsa dancing, and box carrying and throwing. This approach can be used to ``clean-up'' existing motion capture data to produce physically plausible interactions or to retarget motion to new characters with different sizes, kinematics or morphologies while maintaining the interactions in the original data. △ Less

Submitted 31 May, 2023; originally announced May 2023.

Comments: 11 pages. Accepted to SIGGRAPH 2023

arXiv:2305.18372 [pdf, other]

Assumption Generation for the Verification of Learning-Enabled Autonomous Systems

Authors: Corina Pasareanu, Ravi Mangal, Divya Gopinath, Huafeng Yu

Abstract: Providing safety guarantees for autonomous systems is difficult as these systems operate in complex environments that require the use of learning-enabled components, such as deep neural networks (DNNs) for visual perception. DNNs are hard to analyze due to their size (they can have thousands or millions of parameters), lack of formal specifications (DNNs are typically learnt from labeled data, in… ▽ More Providing safety guarantees for autonomous systems is difficult as these systems operate in complex environments that require the use of learning-enabled components, such as deep neural networks (DNNs) for visual perception. DNNs are hard to analyze due to their size (they can have thousands or millions of parameters), lack of formal specifications (DNNs are typically learnt from labeled data, in the absence of any formal requirements), and sensitivity to small changes in the environment. We present an assume-guarantee style compositional approach for the formal verification of system-level safety properties of such autonomous systems. Our insight is that we can analyze the system in the absence of the DNN perception components by automatically synthesizing assumptions on the DNN behaviour that guarantee the satisfaction of the required safety properties. The synthesized assumptions are the weakest in the sense that they characterize the output sequences of all the possible DNNs that, plugged into the autonomous system, guarantee the required safety properties. The assumptions can be leveraged as run-time monitors over a deployed DNN to guarantee the safety of the overall system; they can also be mined to extract local specifications for use during training and testing of DNNs. We illustrate our approach on a case study taken from the autonomous airplanes domain that uses a complex DNN for perception. △ Less

Submitted 27 May, 2023; originally announced May 2023.

arXiv:2303.17912 [pdf, other]

CIRCLE: Capture In Rich Contextual Environments

Authors: Joao Pedro Araujo, Jiaman Li, Karthik Vetrivel, Rishi Agarwal, Deepak Gopinath, Jiajun Wu, Alexander Clegg, C. Karen Liu

Abstract: Synthesizing 3D human motion in a contextual, ecological environment is important for simulating realistic activities people perform in the real world. However, conventional optics-based motion capture systems are not suited for simultaneously capturing human movements and complex scenes. The lack of rich contextual 3D human motion datasets presents a roadblock to creating high-quality generative… ▽ More Synthesizing 3D human motion in a contextual, ecological environment is important for simulating realistic activities people perform in the real world. However, conventional optics-based motion capture systems are not suited for simultaneously capturing human movements and complex scenes. The lack of rich contextual 3D human motion datasets presents a roadblock to creating high-quality generative human motion models. We propose a novel motion acquisition system in which the actor perceives and operates in a highly contextual virtual world while being motion captured in the real world. Our system enables rapid collection of high-quality human motion in highly diverse scenes, without the concern of occlusion or the need for physical scene construction in the real world. We present CIRCLE, a dataset containing 10 hours of full-body reaching motion from 5 subjects across nine scenes, paired with ego-centric information of the environment represented in various forms, such as RGBD videos. We use this dataset to train a model that generates human motion conditioned on scene information. Leveraging our dataset, the model learns to use ego-centric scene information to achieve nontrivial reaching tasks in the context of complex 3D scenes. To download the data please visit https://stanford-tml.github.io/circle_dataset/. △ Less

Submitted 31 March, 2023; originally announced March 2023.

arXiv:2302.04634 [pdf, other]

Closed-loop Analysis of Vision-based Autonomous Systems: A Case Study

Authors: Corina S. Pasareanu, Ravi Mangal, Divya Gopinath, Sinem Getir Yaman, Calum Imrie, Radu Calinescu, Huafeng Yu

Abstract: Deep neural networks (DNNs) are increasingly used in safety-critical autonomous systems as perception components processing high-dimensional image data. Formal analysis of these systems is particularly challenging due to the complexity of the perception DNNs, the sensors (cameras), and the environment conditions. We present a case study applying formal probabilistic analysis techniques to an exper… ▽ More Deep neural networks (DNNs) are increasingly used in safety-critical autonomous systems as perception components processing high-dimensional image data. Formal analysis of these systems is particularly challenging due to the complexity of the perception DNNs, the sensors (cameras), and the environment conditions. We present a case study applying formal probabilistic analysis techniques to an experimental autonomous system that guides airplanes on taxiways using a perception DNN. We address the above challenges by replacing the camera and the network with a compact probabilistic abstraction built from the confusion matrices computed for the DNN on a representative image data set. We also show how to leverage local, DNN-specific analyses as run-time guards to increase the safety of the overall system. Our findings are applicable to other autonomous systems that use complex DNNs for perception. △ Less

Submitted 6 February, 2023; originally announced February 2023.

arXiv:2211.12796 [pdf, other]

IMaSC -- ICFOSS Malayalam Speech Corpus

Authors: Deepa P Gopinath, Thennal D K, Vrinda V Nair, Swaraj K S, Sachin G

Abstract: Modern text-to-speech (TTS) systems use deep learning to synthesize speech increasingly approaching human quality, but they require a database of high quality audio-text sentence pairs for training. Malayalam, the official language of the Indian state of Kerala and spoken by 35+ million people, is a low resource language in terms of available corpora for TTS systems. In this paper, we present IMaS… ▽ More Modern text-to-speech (TTS) systems use deep learning to synthesize speech increasingly approaching human quality, but they require a database of high quality audio-text sentence pairs for training. Malayalam, the official language of the Indian state of Kerala and spoken by 35+ million people, is a low resource language in terms of available corpora for TTS systems. In this paper, we present IMaSC, a Malayalam text and speech corpora containing approximately 50 hours of recorded speech. With 8 speakers and a total of 34,473 text-audio pairs, IMaSC is larger than every other publicly available alternative. We evaluated the database by using it to train TTS models for each speaker based on a modern deep learning architecture. Via subjective evaluation, we show that our models perform significantly better in terms of naturalness compared to previous studies and publicly available models, with an average mean opinion score of 4.50, indicating that the synthesized speech is close to human quality. △ Less

Submitted 23 November, 2022; originally announced November 2022.

Comments: 18 pages, 8 figures

arXiv:2210.14685 [pdf, other]

Leveraging Demonstrations with Latent Space Priors

Authors: Jonas Gehring, Deepak Gopinath, Jungdam Won, Andreas Krause, Gabriel Synnaeve, Nicolas Usunier

Abstract: Demonstrations provide insight into relevant state or action space regions, bearing great potential to boost the efficiency and practicality of reinforcement learning agents. In this work, we propose to leverage demonstration datasets by combining skill learning and sequence modeling. Starting with a learned joint latent space, we separately train a generative model of demonstration sequences and… ▽ More Demonstrations provide insight into relevant state or action space regions, bearing great potential to boost the efficiency and practicality of reinforcement learning agents. In this work, we propose to leverage demonstration datasets by combining skill learning and sequence modeling. Starting with a learned joint latent space, we separately train a generative model of demonstration sequences and an accompanying low-level policy. The sequence model forms a latent space prior over plausible demonstration behaviors to accelerate learning of high-level policies. We show how to acquire such priors from state-only motion capture demonstrations and explore several methods for integrating them into policy learning on transfer tasks. Our experimental results confirm that latent space priors provide significant gains in learning speed and final performance. We benchmark our approach on a set of challenging sparse-reward environments with a complex, simulated humanoid, and on offline RL benchmarks for navigation and object manipulation. Videos, source code and pre-trained models are available at the corresponding project website at https://facebookresearch.github.io/latent-space-priors . △ Less

Submitted 13 March, 2023; v1 submitted 26 October, 2022; originally announced October 2022.

Comments: Published in Transactions on Machine Learning Research (03/2023)

arXiv:2208.03407 [pdf, other]

An Overview of Structural Coverage Metrics for Testing Neural Networks

Authors: Muhammad Usman, Youcheng Sun, Divya Gopinath, Rishi Dange, Luca Manolache, Corina S. Pasareanu

Abstract: Deep neural network (DNN) models, including those used in safety-critical domains, need to be thoroughly tested to ensure that they can reliably perform well in different scenarios. In this article, we provide an overview of structural coverage metrics for testing DNN models, including neuron coverage (NC), k-multisection neuron coverage (kMNC), top-k neuron coverage (TKNC), neuron boundary covera… ▽ More Deep neural network (DNN) models, including those used in safety-critical domains, need to be thoroughly tested to ensure that they can reliably perform well in different scenarios. In this article, we provide an overview of structural coverage metrics for testing DNN models, including neuron coverage (NC), k-multisection neuron coverage (kMNC), top-k neuron coverage (TKNC), neuron boundary coverage (NBC), strong neuron activation coverage (SNAC) and modified condition/decision coverage (MC/DC). We evaluate the metrics on realistic DNN models used for perception tasks (including LeNet-1, LeNet-4, LeNet-5, and ResNet20) as well as on networks used in autonomy (TaxiNet). We also provide a tool, DNNCov, which can measure the testing coverage for all these metrics. DNNCov outputs an informative coverage report to enable researchers and practitioners to assess the adequacy of DNN testing, compare different coverage measures, and to more conveniently inspect the model's internals during testing. △ Less

Submitted 5 August, 2022; originally announced August 2022.

arXiv:2207.09619 [pdf, other]

Learning Latent Traits for Simulated Cooperative Driving Tasks

Authors: Jonathan A. DeCastro, Deepak Gopinath, Guy Rosman, Emily Sumner, Shabnam Hakimi, Simon Stent

Abstract: To construct effective teaming strategies between humans and AI systems in complex, risky situations requires an understanding of individual preferences and behaviors of humans. Previously this problem has been treated in case-specific or data-agnostic ways. In this paper, we build a framework capable of capturing a compact latent representation of the human in terms of their behavior and preferen… ▽ More To construct effective teaming strategies between humans and AI systems in complex, risky situations requires an understanding of individual preferences and behaviors of humans. Previously this problem has been treated in case-specific or data-agnostic ways. In this paper, we build a framework capable of capturing a compact latent representation of the human in terms of their behavior and preferences based on data from a simulated population of drivers. Our framework leverages, to the extent available, knowledge of individual preferences and types from samples within the population to deploy interaction policies appropriate for specific drivers. We then build a lightweight simulation environment, HMIway-env, for modelling one form of distracted driving behavior, and use it to generate data for different driver types and train intervention policies. We finally use this environment to quantify both the ability to discriminate drivers and the effectiveness of intervention policies. △ Less

Submitted 19 July, 2022; originally announced July 2022.

arXiv:2205.03894 [pdf, ps, other]

VPN: Verification of Poisoning in Neural Networks

Authors: Youcheng Sun, Muhammad Usman, Divya Gopinath, Corina S. Păsăreanu

Abstract: Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely data poisoning. I… ▽ More Neural networks are successfully used in a variety of applications, many of them having safety and security concerns. As a result researchers have proposed formal verification techniques for verifying neural network properties. While previous efforts have mainly focused on checking local robustness in neural networks, we instead study another neural network security issue, namely data poisoning. In this case an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger in an input causes the trained model to misclassify to some target class. We show how to formulate the check for data poisoning as a property that can be checked with off-the-shelf verification tools, such as Marabou and nneum, where counterexamples of failed checks constitute the triggers. We further show that the discovered triggers are `transferable' from a small model to a larger, better-trained model, allowing us to analyze state-of-the art performant models trained for image classification tasks. △ Less

Submitted 8 May, 2022; originally announced May 2022.

arXiv:2203.15720 [pdf, other]

doi 10.1145/3550469.3555428

Transformer Inertial Poser: Real-time Human Motion Reconstruction from Sparse IMUs with Simultaneous Terrain Generation

Authors: Yifeng Jiang, Yuting Ye, Deepak Gopinath, Jungdam Won, Alexander W. Winkler, C. Karen Liu

Abstract: Real-time human motion reconstruction from a sparse set of (e.g. six) wearable IMUs provides a non-intrusive and economic approach to motion capture. Without the ability to acquire position information directly from IMUs, recent works took data-driven approaches that utilize large human motion datasets to tackle this under-determined problem. Still, challenges remain such as temporal consistency,… ▽ More Real-time human motion reconstruction from a sparse set of (e.g. six) wearable IMUs provides a non-intrusive and economic approach to motion capture. Without the ability to acquire position information directly from IMUs, recent works took data-driven approaches that utilize large human motion datasets to tackle this under-determined problem. Still, challenges remain such as temporal consistency, drifting of global and joint motions, and diverse coverage of motion types on various terrains. We propose a novel method to simultaneously estimate full-body motion and generate plausible visited terrain from only six IMU sensors in real-time. Our method incorporates 1. a conditional Transformer decoder model giving consistent predictions by explicitly reasoning prediction history, 2. a simple yet general learning target named "stationary body points" (SBPs) which can be stably predicted by the Transformer model and utilized by analytical routines to correct joint and global drifting, and 3. an algorithm to generate regularized terrain height maps from noisy SBP predictions which can in turn correct noisy global motion estimation. We evaluate our framework extensively on synthesized and real IMU data, and with real-time live demos, and show superior performance over strong baseline methods. △ Less

Submitted 8 December, 2022; v1 submitted 29 March, 2022; originally announced March 2022.

Comments: SIGGRAPH Asia 2022. Video: https://youtu.be/rXb6SaXsnc0. Code: https://github.com/jyf588/transformer-inertial-poser

arXiv:2202.01179 [pdf, other]

AntidoteRT: Run-time Detection and Correction of Poison Attacks on Neural Networks

Authors: Muhammad Usman, Youcheng Sun, Divya Gopinath, Corina S. Pasareanu

Abstract: We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. %There are several techniques proposed in the literature that aim to detect the attack but only a few also propose to defend against it, and they typical… ▽ More We study backdoor poisoning attacks against image classification networks, whereby an attacker inserts a trigger into a subset of the training data, in such a way that at test time, this trigger causes the classifier to predict some target class. %There are several techniques proposed in the literature that aim to detect the attack but only a few also propose to defend against it, and they typically involve retraining the network which is not always possible in practice. We propose lightweight automated detection and correction techniques against poisoning attacks, which are based on neuron patterns mined from the network using a small set of clean and poisoned test samples with known labels. The patterns built based on the mis-classified samples are used for run-time detection of new poisoned inputs. For correction, we propose an input correction technique that uses a differential analysis to identify the trigger in the detected poisoned images, which is then reset to a neutral color. Our detection and correction are performed at run-time and input level, which is in contrast to most existing work that is focused on offline model-level defenses. We demonstrate that our technique outperforms existing defenses such as NeuralCleanse and STRIP on popular benchmarks such as MNIST, CIFAR-10, and GTSRB against the popular BadNets attack and the more complex DFST attack. △ Less

Submitted 31 January, 2022; originally announced February 2022.

arXiv:2110.12588 [pdf, other]

doi 10.4204/EPTCS.348.6

QuantifyML: How Good is my Machine Learning Model?

Authors: Muhammad Usman, Divya Gopinath, Corina S. Păsăreanu

Abstract: The efficacy of machine learning models is typically determined by computing their accuracy on test data sets. However, this may often be misleading, since the test data may not be representative of the problem that is being studied. With QuantifyML we aim to precisely quantify the extent to which machine learning models have learned and generalized from the given data. Given a trained model, Quan… ▽ More The efficacy of machine learning models is typically determined by computing their accuracy on test data sets. However, this may often be misleading, since the test data may not be representative of the problem that is being studied. With QuantifyML we aim to precisely quantify the extent to which machine learning models have learned and generalized from the given data. Given a trained model, QuantifyML translates it into a C program and feeds it to the CBMC model checker to produce a formula in Conjunctive Normal Form (CNF). The formula is analyzed with off-the-shelf model counters to obtain precise counts with respect to different model behavior. QuantifyML enables i) evaluating learnability by comparing the counts for the outputs to ground truth, expressed as logical predicates, ii) comparing the performance of models built with different machine learning algorithms (decision-trees vs. neural networks), and iii) quantifying the safety and robustness of models. △ Less

Submitted 24 October, 2021; originally announced October 2021.

Comments: In Proceedings FMAS 2021, arXiv:2110.11527

Journal ref: EPTCS 348, 2021, pp. 92-100

arXiv:2110.08610 [pdf, other]

MAAD: A Model and Dataset for "Attended Awareness" in Driving

Authors: Deepak Gopinath, Guy Rosman, Simon Stent, Katsuya Terahata, Luke Fletcher, Brenna Argall, John Leonard

Abstract: We propose a computational model to estimate a person's attended awareness of their environment. We define attended awareness to be those parts of a potentially dynamic scene which a person has attended to in recent history and which they are still likely to be physically aware of. Our model takes as input scene information in the form of a video and noisy gaze estimates, and outputs visual salien… ▽ More We propose a computational model to estimate a person's attended awareness of their environment. We define attended awareness to be those parts of a potentially dynamic scene which a person has attended to in recent history and which they are still likely to be physically aware of. Our model takes as input scene information in the form of a video and noisy gaze estimates, and outputs visual saliency, a refined gaze estimate, and an estimate of the person's attended awareness. In order to test our model, we capture a new dataset with a high-precision gaze tracker including 24.5 hours of gaze sequences from 23 subjects attending to videos of driving scenes. The dataset also contains third-party annotations of the subjects' attended awareness based on observations of their scan path. Our results show that our model is able to reasonably estimate attended awareness in a controlled setting, and in the future could potentially be extended to real egocentric driving data to help enable more effective ahead-of-time warnings in safety systems and thereby augment driver performance. We also demonstrate our model's effectiveness on the tasks of saliency, gaze calibration, and denoising, using both our dataset and an existing saliency dataset. We make our model and dataset available at https://github.com/ToyotaResearchInstitute/att-aware/. △ Less

Submitted 16 October, 2021; originally announced October 2021.

Comments: 25 pages, 13 figures, 14 tables, Accepted at EPIC@ICCV 2021 Workshop. Main paper + Supplementary Material

arXiv:2110.04663 [pdf, other]

Learning to Control Complex Robots Using High-Dimensional Interfaces: Preliminary Insights

Authors: Jongmin M. Lee, Temesgen Gebrekristos, Dalia De Santis, Mahdieh Nejati-Javaremi, Deepak Gopinath, Biraj Parikh, Ferdinando A. Mussa-Ivaldi, Brenna D. Argall

Abstract: Human body motions can be captured as a high-dimensional continuous signal using motion sensor technologies. The resulting data can be surprisingly rich in information, even when captured from persons with limited mobility. In this work, we explore the use of limited upper-body motions, captured via motion sensors, as inputs to control a 7 degree-of-freedom assistive robotic arm. It is possible th… ▽ More Human body motions can be captured as a high-dimensional continuous signal using motion sensor technologies. The resulting data can be surprisingly rich in information, even when captured from persons with limited mobility. In this work, we explore the use of limited upper-body motions, captured via motion sensors, as inputs to control a 7 degree-of-freedom assistive robotic arm. It is possible that even dense sensor signals lack the salient information and independence necessary for reliable high-dimensional robot control. As the human learns over time in the context of this limitation, intelligence on the robot can be leveraged to better identify key learning challenges, provide useful feedback, and support individuals until the challenges are managed. In this short paper, we examine two uninjured participants' data from an ongoing study, to extract preliminary results and share insights. We observe opportunities for robot intelligence to step in, including the identification of inconsistencies in time spent across all control dimensions, asymmetries in individual control dimensions, and user progress in learning. Machine reasoning about these situations may facilitate novel interface learning in the future. △ Less

Submitted 9 October, 2021; originally announced October 2021.

Comments: Presented at AI-HRI symposium as part of AAAI-FSS 2021 (arXiv:2109.10836)

Report number: AIHRI/2021/54

arXiv:2109.11451 [pdf, other]

doi 10.1145/3472749.3474814

MedKnowts: Unified Documentation and Information Retrieval for Electronic Health Records

Authors: Luke Murray, Divya Gopinath, Monica Agrawal, Steven Horng, David Sontag, David R. Karger

Abstract: Clinical documentation can be transformed by Electronic Health Records, yet the documentation process is still a tedious, time-consuming, and error-prone process. Clinicians are faced with multi-faceted requirements and fragmented interfaces for information exploration and documentation. These challenges are only exacerbated in the Emergency Department -- clinicians often see 35 patients in one sh… ▽ More Clinical documentation can be transformed by Electronic Health Records, yet the documentation process is still a tedious, time-consuming, and error-prone process. Clinicians are faced with multi-faceted requirements and fragmented interfaces for information exploration and documentation. These challenges are only exacerbated in the Emergency Department -- clinicians often see 35 patients in one shift, during which they have to synthesize an often previously unknown patient's medical records in order to reach a tailored diagnosis and treatment plan. To better support this information synthesis, clinical documentation tools must enable rapid contextual access to the patient's medical record. MedKnowts is an integrated note-taking editor and information retrieval system which unifies the documentation and search process and provides concise synthesized concept-oriented slices of the patient's medical record. MedKnowts automatically captures structured data while still allowing users the flexibility of natural language. MedKnowts leverages this structure to enable easier parsing of long notes, auto-populated text, and proactive information retrieval, easing the documentation burden. △ Less

Submitted 23 September, 2021; originally announced September 2021.

Comments: 15 Pages, 8 figures, UIST 21, October 10-13

arXiv:2103.12535 [pdf, other]

NNrepair: Constraint-based Repair of Neural Network Classifiers

Authors: Muhammad Usman, Divya Gopinath, Youcheng Sun, Yannic Noller, Corina Pasareanu

Abstract: We present NNrepair, a constraint-based technique for repairing neural network classifiers. The technique aims to fix the logic of the network at an intermediate layer or at the last layer. NNrepair first uses fault localization to find potentially faulty network parameters (such as the weights) and then performs repair using constraint solving to apply small modifications to the parameters to rem… ▽ More We present NNrepair, a constraint-based technique for repairing neural network classifiers. The technique aims to fix the logic of the network at an intermediate layer or at the last layer. NNrepair first uses fault localization to find potentially faulty network parameters (such as the weights) and then performs repair using constraint solving to apply small modifications to the parameters to remedy the defects. We present novel strategies to enable precise yet efficient repair such as inferring correctness specifications to act as oracles for intermediate layer repair, and generation of experts for each class. We demonstrate the technique in the context of three different scenarios: (1) Improving the overall accuracy of a model, (2) Fixing security vulnerabilities caused by poisoning of training data and (3) Improving the robustness of the network against adversarial attacks. Our evaluation on MNIST and CIFAR-10 models shows that NNrepair can improve the accuracy by 45.56 percentage points on poisoned data and 10.40 percentage points on adversarial data. NNrepair also provides small improvement in the overall accuracy of models, without requiring new data or re-training. △ Less

Submitted 14 June, 2021; v1 submitted 23 March, 2021; originally announced March 2021.

arXiv:2103.00124 [pdf, other]

NEUROSPF: A tool for the Symbolic Analysis of Neural Networks

Authors: Muhammad Usman, Yannic Noller, Corina Pasareanu, Youcheng Sun, Divya Gopinath

Abstract: This paper presents NEUROSPF, a tool for the symbolic analysis of neural networks. Given a trained neural network model, the tool extracts the architecture and model parameters and translates them into a Java representation that is amenable for analysis using the Symbolic PathFinder symbolic execution tool. Notably, NEUROSPF encodes specialized peer classes for parsing the model's parameters, ther… ▽ More This paper presents NEUROSPF, a tool for the symbolic analysis of neural networks. Given a trained neural network model, the tool extracts the architecture and model parameters and translates them into a Java representation that is amenable for analysis using the Symbolic PathFinder symbolic execution tool. Notably, NEUROSPF encodes specialized peer classes for parsing the model's parameters, thereby enabling efficient analysis. With NEUROSPF the user has the flexibility to specify either the inputs or the network internal parameters as symbolic, promoting the application of program analysis and testing approaches from software engineering to the field of machine learning. For instance, NEUROSPF can be used for coverage-based testing and test generation, finding adversarial examples and also constraint-based repair of neural networks, thus improving the reliability of neural networks and of the applications that use them. Video URL: https://youtu.be/seal8fG78LI △ Less

Submitted 26 February, 2021; originally announced March 2021.

arXiv:2102.12898 [pdf, other]

ShuffleUNet: Super resolution of diffusion-weighted MRIs using deep learning

Authors: Soumick Chatterjee, Alessandro Sciarra, Max Dünnwald, Raghava Vinaykanth Mushunuri, Ranadheer Podishetti, Rajatha Nagaraja Rao, Geetha Doddapaneni Gopinath, Steffen Oeltze-Jafra, Oliver Speck, Andreas Nürnberger

Abstract: Diffusion-weighted magnetic resonance imaging (DW-MRI) can be used to characterise the microstructure of the nervous tissue, e.g. to delineate brain white matter connections in a non-invasive manner via fibre tracking. Magnetic Resonance Imaging (MRI) in high spatial resolution would play an important role in visualising such fibre tracts in a superior manner. However, obtaining an image of such r… ▽ More Diffusion-weighted magnetic resonance imaging (DW-MRI) can be used to characterise the microstructure of the nervous tissue, e.g. to delineate brain white matter connections in a non-invasive manner via fibre tracking. Magnetic Resonance Imaging (MRI) in high spatial resolution would play an important role in visualising such fibre tracts in a superior manner. However, obtaining an image of such resolution comes at the expense of longer scan time. Longer scan time can be associated with the increase of motion artefacts, due to the patient's psychological and physical conditions. Single Image Super-Resolution (SISR), a technique aimed to obtain high-resolution (HR) details from one single low-resolution (LR) input image, achieved with Deep Learning, is the focus of this study. Compared to interpolation techniques or sparse-coding algorithms, deep learning extracts prior knowledge from big datasets and produces superior MRI images from the low-resolution counterparts. In this research, a deep learning based super-resolution technique is proposed and has been applied for DW-MRI. Images from the IXI dataset have been used as the ground-truth and were artificially downsampled to simulate the low-resolution images. The proposed method has shown statistically significant improvement over the baselines and achieved an SSIM of $0.913\pm0.045$. △ Less

Submitted 25 February, 2021; originally announced February 2021.

arXiv:2007.15153 [pdf, other]

Fast, Structured Clinical Documentation via Contextual Autocomplete

Authors: Divya Gopinath, Monica Agrawal, Luke Murray, Steven Horng, David Karger, David Sontag

Abstract: We present a system that uses a learned autocompletion mechanism to facilitate rapid creation of semi-structured clinical documentation. We dynamically suggest relevant clinical concepts as a doctor drafts a note by leveraging features from both unstructured and structured medical data. By constraining our architecture to shallow neural networks, we are able to make these suggestions in real time.… ▽ More We present a system that uses a learned autocompletion mechanism to facilitate rapid creation of semi-structured clinical documentation. We dynamically suggest relevant clinical concepts as a doctor drafts a note by leveraging features from both unstructured and structured medical data. By constraining our architecture to shallow neural networks, we are able to make these suggestions in real time. Furthermore, as our algorithm is used to write a note, we can automatically annotate the documentation with clean labels of clinical concepts drawn from medical vocabularies, making notes more structured and readable for physicians, patients, and future algorithms. To our knowledge, this system is the only machine learning-based documentation utility for clinical notes deployed in a live hospital setting, and it reduces keystroke burden of clinical concepts by 67% in real environments. △ Less

Submitted 29 July, 2020; originally announced July 2020.

Comments: Published in Machine Learning for Healthcare 2020 conference

arXiv:2007.02092 [pdf, other]

Customized Handling of Unintended Interface Operation in Assistive Robots

Authors: Deepak Gopinath, Mahdieh Nejati Javaremi, Brenna D. Argall

Abstract: We present an assistance system that reasons about a human's intended actions during robot teleoperation in order to provide appropriate corrections for unintended behavior. We model the human's physical interaction with a control interface during robot teleoperation and distinguish between intended and measured physical actions explicitly. By reasoning over the unobserved intentions using model-b… ▽ More We present an assistance system that reasons about a human's intended actions during robot teleoperation in order to provide appropriate corrections for unintended behavior. We model the human's physical interaction with a control interface during robot teleoperation and distinguish between intended and measured physical actions explicitly. By reasoning over the unobserved intentions using model-based inference techniques, our assistive system provides customized corrections on a user's issued commands. We validate our algorithm with a 10-person human subject study in which we evaluate the performance of the proposed assistance paradigms. Our results show that the assistance paradigms helped to significantly reduce task completion time, number of mode switches, cognitive workload, and user frustration and improve overall user satisfaction. △ Less

Submitted 5 November, 2020; v1 submitted 4 July, 2020; originally announced July 2020.

Comments: 10 pages, 7 figures, preprint

arXiv:2005.03652 [pdf, ps, other]

doi 10.1109/TNSRE.2020.2987878

Active Intent Disambiguation for Shared Control Robots

Authors: Deepak E. Gopinath, Brenna D. Argall

Abstract: Assistive shared-control robots have the potential to transform the lives of millions of people afflicted with severe motor impairments. The usefulness of shared-control robots typically relies on the underlying autonomy's ability to infer the user's needs and intentions, and the ability to do so unambiguously is often a limiting factor for providing appropriate assistance confidently and accurate… ▽ More Assistive shared-control robots have the potential to transform the lives of millions of people afflicted with severe motor impairments. The usefulness of shared-control robots typically relies on the underlying autonomy's ability to infer the user's needs and intentions, and the ability to do so unambiguously is often a limiting factor for providing appropriate assistance confidently and accurately. The contributions of this paper are four-fold. First, we introduce the idea of intent disambiguation via control mode selection, and present a mathematical formalism for the same. Second, we develop a control mode selection algorithm which selects the control mode in which the user-initiated motion helps the autonomy to maximally disambiguate user intent. Third, we present a pilot study with eight subjects to evaluate the efficacy of the disambiguation algorithm. Our results suggest that the disambiguation system (a) helps to significantly reduce task effort, as measured by number of button presses, and (b) is of greater utility for more limited control interfaces and more complex tasks. We also observe that (c) subjects demonstrated a wide range of disambiguation request behaviors, with the common thread of concentrating requests early in the execution. As our last contribution, we introduce a novel field-theoretic approach to intent inference inspired by dynamic field theory that works in tandem with the disambiguation scheme. △ Less

Submitted 7 May, 2020; originally announced May 2020.

arXiv:2004.08440 [pdf, other]

Parallelization Techniques for Verifying Neural Networks

Authors: Haoze Wu, Alex Ozdemir, Aleksandar Zeljić, Ahmed Irfan, Kyle Julian, Divya Gopinath, Sadjad Fouladi, Guy Katz, Corina Pasareanu, Clark Barrett

Abstract: Inspired by recent successes with parallel optimization techniques for solving Boolean satisfiability, we investigate a set of strategies and heuristics that aim to leverage parallel computing to improve the scalability of neural network verification. We introduce an algorithm based on partitioning the verification problem in an iterative manner and explore two partitioning strategies, that work b… ▽ More Inspired by recent successes with parallel optimization techniques for solving Boolean satisfiability, we investigate a set of strategies and heuristics that aim to leverage parallel computing to improve the scalability of neural network verification. We introduce an algorithm based on partitioning the verification problem in an iterative manner and explore two partitioning strategies, that work by partitioning the input space or by case splitting on the phases of the neuron activations, respectively. We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems. An extensive experimental evaluation shows the benefit of these techniques on both existing benchmarks and new benchmarks from the aviation domain. A preliminary experiment with ultra-scaling our algorithm using a large distributed cloud-based platform also shows promising results. △ Less

Submitted 21 August, 2020; v1 submitted 17 April, 2020; originally announced April 2020.

arXiv:1912.00289 [pdf, other]

A Programmatic and Semantic Approach to Explaining and DebuggingNeural Network Based Object Detectors

Authors: Edward Kim, Divya Gopinath, Corina Pasareanu, Sanjit Seshia

Abstract: Even as deep neural networks have become very effective for tasks in vision and perception, it remains difficult to explain and debug their behavior. In this paper, we present a programmatic and semantic approach to explaining, understanding, and debugging the correct and incorrect behaviors of a neural network-based perception system. Our approach is semantic in that it employs a high-level repre… ▽ More Even as deep neural networks have become very effective for tasks in vision and perception, it remains difficult to explain and debug their behavior. In this paper, we present a programmatic and semantic approach to explaining, understanding, and debugging the correct and incorrect behaviors of a neural network-based perception system. Our approach is semantic in that it employs a high-level representation of the distribution of environment scenarios that the detector is intended to work on. It is programmatic in that scenario representation is a program in a domain-specific probabilistic programming language which can be used to generate synthetic data to test a given perception module. Our framework assesses the performance of a perception module to identify correct and incorrect detections, extracts rules from those results that semantically characterizes the correct and incorrect scenarios, and then specializes the probabilistic program with those rules in order to more precisely characterize the scenarios in which the perception module operates correctly or not. We demonstrate our results using the SCENIC probabilistic programming language and a neural network-based object detector. Our experiments show that it is possible to automatically generate compact rules that significantly increase the correct detection rate (or conversely the incorrect detection rate) of the network and can thus help with understanding and debugging its behavior. △ Less

Submitted 16 June, 2020; v1 submitted 30 November, 2019; originally announced December 2019.

Journal ref: CVPR (2020)

arXiv:1909.13755 [pdf, other]

Hamiltonicity in Semi-Regular Tessellation Dual Graphs

Authors: Divya Gopinath, Rohan Kodialam, Kevin Lu, Jayson Lynch, Santiago Ospina

Abstract: This paper shows NP-completeness for finding Hamiltonian cycles in induced subgraphs of the dual graphs of semi-regular tessilations. It also shows NP-hardness for a new, wide class of graphs called augmented square grids. This work follows up on prior studies of the complexity of finding Hamiltonian cycles in regular and semi-regular grid graphs. This paper shows NP-completeness for finding Hamiltonian cycles in induced subgraphs of the dual graphs of semi-regular tessilations. It also shows NP-hardness for a new, wide class of graphs called augmented square grids. This work follows up on prior studies of the complexity of finding Hamiltonian cycles in regular and semi-regular grid graphs. △ Less

Submitted 30 September, 2019; originally announced September 2019.

arXiv:1909.06515 [pdf, other]

Harnessing Indirect Training Data for End-to-End Automatic Speech Translation: Tricks of the Trade

Authors: Juan Pino, Liezl Puzon, Jiatao Gu, Xutai Ma, Arya D. McCarthy, Deepak Gopinath

Abstract: For automatic speech translation (AST), end-to-end approaches are outperformed by cascaded models that transcribe with automatic speech recognition (ASR), then translate with machine translation (MT). A major cause of the performance gap is that, while existing AST corpora are small, massive datasets exist for both the ASR and MT subsystems. In this work, we evaluate several data augmentation and… ▽ More For automatic speech translation (AST), end-to-end approaches are outperformed by cascaded models that transcribe with automatic speech recognition (ASR), then translate with machine translation (MT). A major cause of the performance gap is that, while existing AST corpora are small, massive datasets exist for both the ASR and MT subsystems. In this work, we evaluate several data augmentation and pretraining approaches for AST, by comparing all on the same datasets. Simple data augmentation by translating ASR transcripts proves most effective on the English--French augmented LibriSpeech dataset, closing the performance gap from 8.2 to 1.4 BLEU, compared to a very strong cascade that could directly utilize copious ASR and MT data. The same end-to-end approach plus fine-tuning closes the gap on the English--Romanian MuST-C dataset from 6.7 to 3.7 BLEU. In addition to these results, we present practical recommendations for augmentation and pretraining approaches. Finally, we decrease the performance gap to 0.01 BLEU using a Transformer-based architecture. △ Less

Submitted 22 October, 2019; v1 submitted 13 September, 2019; originally announced September 2019.

Comments: IWSLT 2019

arXiv:1904.13215 [pdf, other]

Property Inference for Deep Neural Networks

Authors: Divya Gopinath, Hayes Converse, Corina S. Pasareanu, Ankur Taly

Abstract: We present techniques for automatically inferring formal properties of feed-forward neural networks. We observe that a significant part (if not all) of the logic of feed forward networks is captured in the activation status ('on' or 'off') of its neurons. We propose to extract patterns based on neuron decisions as preconditions that imply certain desirable output property e.g., the prediction bein… ▽ More We present techniques for automatically inferring formal properties of feed-forward neural networks. We observe that a significant part (if not all) of the logic of feed forward networks is captured in the activation status ('on' or 'off') of its neurons. We propose to extract patterns based on neuron decisions as preconditions that imply certain desirable output property e.g., the prediction being a certain class. We present techniques to extract input properties, encoding convex predicates on the input space that imply given output properties and layer properties, representing network properties captured in the hidden layers that imply the desired output behavior. We apply our techniques on networks for the MNIST and ACASXU applications. Our experiments highlight the use of the inferred properties in a variety of tasks, such as explaining predictions, providing robustness guarantees, simplifying proofs, and network distillation. △ Less

Submitted 10 September, 2020; v1 submitted 29 April, 2019; originally announced April 2019.

Comments: Errata: This version updates the ASE'19 conference version by correcting the definition of the three properties that were checked for ACASXU

arXiv:1810.08303 [pdf, other]

Compositional Verification for Autonomous Systems with Deep Learning Components

Authors: Corina S. Pasareanu, Divya Gopinath, Huafeng Yu

Abstract: As autonomy becomes prevalent in many applications, ranging from recommendation systems to fully autonomous vehicles, there is an increased need to provide safety guarantees for such systems. The problem is difficult, as these are large, complex systems which operate in uncertain environments, requiring data-driven machine-learning components. However, learning techniques such as Deep Neural Netwo… ▽ More As autonomy becomes prevalent in many applications, ranging from recommendation systems to fully autonomous vehicles, there is an increased need to provide safety guarantees for such systems. The problem is difficult, as these are large, complex systems which operate in uncertain environments, requiring data-driven machine-learning components. However, learning techniques such as Deep Neural Networks, widely used today, are inherently unpredictable and lack the theoretical foundations to provide strong assurance guarantees. We present a compositional approach for the scalable, formal verification of autonomous systems that contain Deep Neural Network components. The approach uses assume-guarantee reasoning whereby {\em contracts}, encoding the input-output behavior of individual components, allow the designer to model and incorporate the behavior of the learning-enabled components working side-by-side with the other components. We illustrate the approach on an example taken from the autonomous vehicles domain. △ Less

Submitted 18 October, 2018; originally announced October 2018.

arXiv:1807.10439 [pdf, other]

Symbolic Execution for Deep Neural Networks

Authors: Divya Gopinath, Kaiyuan Wang, Mengshi Zhang, Corina S. Pasareanu, Sarfraz Khurshid

Abstract: Deep Neural Networks (DNN) are increasingly used in a variety of applications, many of them with substantial safety and security concerns. This paper introduces DeepCheck, a new approach for validating DNNs based on core ideas from program analysis, specifically from symbolic execution. The idea is to translate a DNN into an imperative program, thereby enabling program analysis to assist with DNN… ▽ More Deep Neural Networks (DNN) are increasingly used in a variety of applications, many of them with substantial safety and security concerns. This paper introduces DeepCheck, a new approach for validating DNNs based on core ideas from program analysis, specifically from symbolic execution. The idea is to translate a DNN into an imperative program, thereby enabling program analysis to assist with DNN validation. A basic translation however creates programs that are very complex to analyze. DeepCheck introduces novel techniques for lightweight symbolic analysis of DNNs and applies them in the context of image classification to address two challenging problems in DNN analysis: 1) identification of important pixels (for attribution and adversarial generation); and 2) creation of 1-pixel and 2-pixel attacks. Experimental results using the MNIST data-set show that DeepCheck's lightweight symbolic analysis provides a valuable tool for DNN validation. △ Less

Submitted 27 July, 2018; originally announced July 2018.

arXiv:1710.00486 [pdf, other]

DeepSafe: A Data-driven Approach for Checking Adversarial Robustness in Neural Networks

Authors: Divya Gopinath, Guy Katz, Corina S. Pasareanu, Clark Barrett

Abstract: Deep neural networks have become widely used, obtaining remarkable results in domains such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bio-informatics, where they have produced results comparable to human experts. However, these networks can be easily fooled by adversarial perturbations: minimal changes… ▽ More Deep neural networks have become widely used, obtaining remarkable results in domains such as computer vision, speech recognition, natural language processing, audio recognition, social network filtering, machine translation, and bio-informatics, where they have produced results comparable to human experts. However, these networks can be easily fooled by adversarial perturbations: minimal changes to correctly-classified inputs, that cause the network to mis-classify them. This phenomenon represents a concern for both safety and security, but it is currently unclear how to measure a network's robustness against such perturbations. Existing techniques are limited to checking robustness around a few individual input points, providing only very limited guarantees. We propose a novel approach for automatically identifying safe regions of the input space, within which the network is robust against adversarial perturbations. The approach is data-guided, relying on clustering to identify well-defined geometric regions as candidate safe regions. We then utilize verification techniques to confirm that these regions are safe or to provide counter-examples showing that they are not safe. We also introduce the notion of targeted robustness which, for a given target label and region, ensures that a NN does not map any input in the region to the target label. We evaluated our technique on the MNIST dataset and on a neural network implementation of a controller for the next-generation Airborne Collision Avoidance System for unmanned aircraft (ACAS Xu). For these networks, our approach identified multiple regions which were completely safe as well as some which were only safe for specific labels. It also discovered several adversarial perturbations of interest. △ Less

Submitted 30 January, 2020; v1 submitted 2 October, 2017; originally announced October 2017.

arXiv:1612.04391 [pdf, other]

A Robotic Prosthesis for an Amputee Drummer

Authors: Mason Bretan, Deepak Gopinath, Philip Mullins, Gil Weinberg

Abstract: The design and evaluation of a robotic prosthesis for a drummer with a transradial amputation is presented. The principal objective of the prosthesis is to simulate the role fingers play in drumming. This primarily includes controlling the manner in which the drum stick rebounds after initial impact. This is achieved using a DC motor driven by a variable impedance control framework in a shared con… ▽ More The design and evaluation of a robotic prosthesis for a drummer with a transradial amputation is presented. The principal objective of the prosthesis is to simulate the role fingers play in drumming. This primarily includes controlling the manner in which the drum stick rebounds after initial impact. This is achieved using a DC motor driven by a variable impedance control framework in a shared control system. The user's ability to perform with and control the prosthesis is evaluated using a musical synchronization study. A secondary objective of the prosthesis is to explore the implications of musical expression and human-robotic interaction when a second, completely autonomous, stick is added to the prosthesis. This wearable robotic musician interacts with the user by listening to the music and responding with different rhythms and behaviors. We recount some empirical findings based on the user's experience of performing under such a paradigm. △ Less

Submitted 13 December, 2016; originally announced December 2016.

arXiv:1409.2153 [pdf]

doi 10.1109/ITMC.2014.6918605

GUI system for Elders/Patients in Intensive Care

Authors: J. L. Raheja, Dhiraj, D. Gopinath, Ankit Chaudhary

Abstract: In the old age, few people need special care if they are suffering from specific diseases as they can get stroke while they are in normal life routine. Also patients of any age, who are not able to walk, need to be taken care of personally but for this, either they have to be in hospital or someone like nurse should be with them for better care. This is costly in terms of money and man power. A pe… ▽ More In the old age, few people need special care if they are suffering from specific diseases as they can get stroke while they are in normal life routine. Also patients of any age, who are not able to walk, need to be taken care of personally but for this, either they have to be in hospital or someone like nurse should be with them for better care. This is costly in terms of money and man power. A person is needed for 24x7 care of these people. To help in this aspect we purposes a vision based system which will take input from the patient and will provide information to the specified person, who is currently may not in the patient room. This will reduce the need of man power, also a continuous monitoring would not be needed. The system is using MS Kinect for gesture detection for better accuracy and this system can be installed at home or hospital easily. The system provides GUI for simple usage and gives visual and audio feedback to user. This system work on natural hand interaction and need no training before using and also no need to wear any glove or color strip. △ Less

Submitted 7 September, 2014; originally announced September 2014.

Comments: In proceedings of the 4th IEEE International Conference on International Technology Management Conference, Chicago, IL USA, 12-15 June, 2014

Showing 1–37 of 37 results for author: Gopinath, D