PoPE: Legendre Orthogonal Polynomials Based Position Encoding for Large Language Models

Abstract: There are several improvements proposed over the baseline Absolute Positional Encoding (APE) method used in the original transformer. In this study, we aim to investigate the implications of inadequately representing positional encoding in higher dimensions on crucial aspects of the attention mechanism, the model's capacity to learn relative positional information, and the convergence of models, all stemming from the choice of sinusoidal basis functions. Through a combination of theoretical insights and empirical analyses, we elucidate how these challenges extend beyond APEs and may adversely affect the performance of Relative Positional Encoding (RPE) methods, such as Rotatory Positional Encoding (RoPE). Subsequently, we introduce an innovative solution termed Orthogonal Polynomial Based Positional Encoding (PoPE) to address some of the limitations associated with existing methods. The PoPE method encodes positional information by leveraging Orthogonal Legendre polynomials. Legendre polynomials as basis functions offer several desirable properties for positional encoding, including improved correlation structure, non-periodicity, orthogonality, and distinct functional forms among polynomials of varying orders. Our experimental findings demonstrate that transformer models incorporating PoPE outperform baseline transformer models on the $Multi30k$ English-to-German translation task, thus establishing a new performance benchmark. Furthermore, PoPE-based transformers exhibit significantly accelerated convergence rates. Additionally, we will present novel theoretical perspectives on position encoding based on the superior performance of PoPE.

Submitted 29 April, 2024; originally announced May 2024.

arXiv:2312.07627 [pdf]

Multimodal Sentiment Analysis: Perceived vs Induced Sentiments

Authors: Aditi Aggarwal, Deepika Varshney, Saurabh Patel

Abstract: Social media has created a global network where people can easily access and exchange vast information. This information gives rise to a variety of opinions, reflecting both positive and negative viewpoints. GIFs stand out as a multimedia format offering a visually engaging way for users to communicate. In this research, we propose a multimodal framework that integrates visual and textual features to predict the GIF sentiment. It also incorporates attributes including face emotion detection and OCR generated captions to capture the semantic aspects of the GIF. The developed classifier achieves an accuracy of 82.7% on Twitter GIFs, which is an improvement over state-of-the-art models. Moreover, we have based our research on the ReactionGIF dataset, analysing the variance in sentiment perceived by the author and sentiment induced in the reader.

Submitted 12 December, 2023; originally announced December 2023.

arXiv:2312.06129 [pdf, other]

Household navigation and manipulation for everyday object rearrangement tasks

Authors: Shrutheesh R. Iyer, Anwesan Pal, Jiaming Hu, Akanimoh Adeleye, Aditya Aggarwal, Henrik I. Christensen

Abstract: We consider the problem of building an assistive robotic system that can help humans in daily household cleanup tasks. Creating such an autonomous system in real-world environments is inherently quite challenging, as a general solution may not suit the preferences of a particular customer. Moreover, such a system consists of multi-objective tasks comprising -- (i) Detection of misplaced objects and prediction of their potentially correct placements, (ii) Fine-grained manipulation for stable object grasping, and (iii) Room-to-room navigation for transferring objects in unseen environments. This work systematically tackles each component and integrates them into a complete object rearrangement pipeline. To validate our proposed system, we conduct multiple experiments on a real robotic platform involving multi-room object transfer, user preference-based placement, and complex pick-and-place tasks. Project page: https://sites.google.com/eng.ucsd.edu/home-robot

Submitted 11 December, 2023; originally announced December 2023.

Comments: Paper accepted at IEEE IRC-2023

arXiv:2311.18426 [pdf, other]

Convergence Analysis of Fractional Gradient Descent

Authors: Ashwani Aggarwal

Abstract: Fractional derivatives are a well-studied generalization of integer order derivatives. Naturally, for optimization, it is of interest to understand the convergence properties of gradient descent using fractional derivatives. Convergence analysis of fractional gradient descent is currently limited both in the methods analyzed and the settings analyzed. This paper aims to fill in these gaps by analyzing variations of fractional gradient descent in smooth and convex, smooth and strongly convex, and smooth and non-convex settings. First, novel bounds will be established bridging fractional and integer derivatives. Then, these bounds will be applied to the aforementioned settings to prove linear convergence for smooth and strongly convex functions and $O(1/T)$ convergence for smooth and convex functions. Additionally, we prove $O(1/T)$ convergence for smooth and non-convex functions using an extended notion of smoothness - Hölder smoothness - that is more natural for fractional derivatives. Finally, empirical results will be presented on the potential speed up of fractional gradient descent over standard gradient descent as well as some preliminary theoretical results explaining this speed up.

Submitted 4 June, 2024; v1 submitted 30 November, 2023; originally announced November 2023.

Comments: 27 pages, 4 figures. Published in Transactions on Machine Learning Research

ACM Class: G.1.6

arXiv:2308.02748 [pdf]

Discrimination of Radiologists Utilizing Eye-Tracking Technology and Machine Learning: A Case Study

Authors: Stanford Martinez, Carolina Ramirez-Tamayo, Syed Hasib Akhter Faruqui, Kal L. Clark, Adel Alaeddini, Nicholas Czarnek, Aarushi Aggarwal, Sahra Emamzadeh, Jeffrey R. Mock, Edward J. Golob

Abstract: Perception-related errors comprise most diagnostic mistakes in radiology. To mitigate this problem, radiologists employ personalized and high-dimensional visual search strategies, otherwise known as search patterns. Qualitative descriptions of these search patterns, which involve the physician verbalizing or annotating the order he/she analyzes the image, can be unreliable due to discrepancies in what is reported versus the actual visual patterns. This discrepancy can interfere with quality improvement interventions and negatively impact patient care. This study presents a novel discretized feature encoding based on spatiotemporal binning of fixation data for efficient geometric alignment and temporal ordering of eye movement when reading chest X-rays. The encoded features of the eye-fixation data are employed by machine learning classifiers to discriminate between faculty and trainee radiologists. We include a clinical trial case study utilizing the Area Under the Curve (AUC), Accuracy, F1, Sensitivity, and Specificity metrics for class separability to evaluate the discriminability between the two subjects in regard to their level of experience. We then compare the classification performance to state-of-the-art methodologies. A repeatability experiment using a separate dataset, experimental protocol, and eye tracker was also performed using eight subjects to evaluate the robustness of the proposed approach. The numerical results from both experiments demonstrate that classifiers employing the proposed feature encoding methods outperform the current state-of-the-art in differentiating between radiologists in terms of experience level. This signifies the potential impact of the proposed method for identifying radiologists' level of expertise and those who would benefit from additional training.

Submitted 4 August, 2023; originally announced August 2023.

Comments: Submitting for Review in "IEEE Journal of Biomedical and Health Informatics"

arXiv:2306.15174 [pdf, other]

Examining Lower Latency Routing with Overlay Networks

Authors: Aakriti Kedia, Akhilan Ganesh, Aman Aggarwal

Abstract: In today's rapidly expanding digital landscape, where access to timely online content is paramount to users, the underlying network infrastructure and latency performance significantly influence the user experience. We present an empirical study of the current Internet's connectivity and the achievable latencies to propose better routing paths if available. Understanding the severity of the non-optimal internet topology with RIPE Atlas stats, we conduct practical experiments to demonstrate that local traffic from the San Diego area to the University of California, San Diego reaches up to Los Angeles before serving responses. We examine the traceroutes and build an experimental overlay network to constrain the San Diego traffic within the city to get better round-trip time latencies.

Submitted 26 June, 2023; originally announced June 2023.

arXiv:2305.06394 [pdf, other]

Local Region-to-Region Mapping-based Approach to Classify Articulated Objects

Authors: Ayush Aggarwal, Rustam Stolkin, Naresh Marturi

Abstract: Autonomous robots operating in real-world environments encounter a variety of objects that can be both rigid and articulated in nature. Having knowledge of these specific object properties not only helps in designing appropriate manipulation strategies but also aids in developing reliable tracking and pose estimation techniques for many robotic and vision applications. In this context, this paper presents a registration-based local region-to-region mapping approach to classify an object as either articulated or rigid. Using the point clouds of the intended object, the proposed method performs classification by estimating unique local transformations between point clouds over the observed sequence of movements of the object. The significant advantage of the proposed method is that it is a constraint-free approach that can classify any articulated object and is not limited to a specific type of articulation. Additionally, it is a model-free approach with no learning components, which means it can classify whether an object is articulated without requiring any object models or labelled data. We analyze the performance of the proposed method on two publicly available benchmark datasets with a combination of articulated and rigid objects. It is observed that the proposed method can classify articulated and rigid objects with good accuracy.

Submitted 10 May, 2023; originally announced May 2023.

Comments: 7 pages, 4 figures, Conference on Robots and Vision, Articulated Object Classification

arXiv:2305.03546 [pdf, other]

Breast Cancer Immunohistochemical Image Generation: a Benchmark Dataset and Challenge Review

Authors: Chuang Zhu, Shengjie Liu, Zekuan Yu, Feng Xu, Arpit Aggarwal, Germán Corredor, Anant Madabhushi, Qixun Qu, Hongwei Fan, Fangda Li, Yueheng Li, Xianchao Guan, Yongbing Zhang, Vivek Kumar Singh, Farhan Akram, Md. Mostafa Kamal Sarker, Zhongyue Shi, Mulan Jin

Abstract: For invasive breast cancer, immunohistochemical (IHC) techniques are often used to detect the expression level of human epidermal growth factor receptor-2 (HER2) in breast tissue to formulate a precise treatment plan. From the perspective of saving manpower, material and time costs, directly generating IHC-stained images from Hematoxylin and Eosin (H&E) stained images is a valuable research direction. Therefore, we held the breast cancer immunohistochemical image generation challenge, aiming to explore novel ideas of deep learning technology in pathological image generation and promote research in this field. The challenge provided registered H&E and IHC-stained image pairs, and participants were required to use these images to train a model that can directly generate IHC-stained images from corresponding H&E-stained images. We selected and reviewed the five highest-ranking methods based on their PSNR and SSIM metrics, while also providing overviews of the corresponding pipelines and implementations. In this paper, we further analyze the current limitations in the field of breast cancer immunohistochemical image generation and forecast the future development of this field. We hope that the released dataset and the challenge will inspire more scholars to jointly study higher-quality IHC-stained image generation.

Submitted 22 September, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

Comments: 12 pages, 12 figures, 2 tables

arXiv:2302.03800 [pdf, other]

MACOptions: Multi-Agent Learning with Centralized Controller and Options Framework

Authors: Alakh Aggarwal, Rishita Bansal, Parth Padalkar, Sriraam Natarajan

Abstract: These days automation is being applied everywhere. In every environment, planning for the actions to be taken by the agents is an important aspect. In this paper, we plan to implement planning for multi-agents with a centralized controller. We compare three approaches: random policy, Q-learning, and Q-learning with Options Framework. We also show the effectiveness of planners by showing performance comparison between Q-Learning with Planner and without Planner.

Submitted 7 February, 2023; originally announced February 2023.

arXiv:2212.01700 [pdf, other]

Towards Robust NLG Bias Evaluation with Syntactically-diverse Prompts

Authors: Arshiya Aggarwal, Jiao Sun, Nanyun Peng

Abstract: We present a robust methodology for evaluating biases in natural language generation (NLG) systems. Previous works use fixed hand-crafted prefix templates with mentions of various demographic groups to prompt models to generate continuations for bias analysis. These fixed prefix templates could themselves be specific in terms of styles or linguistic structures, which may lead to unreliable fairness conclusions that are not representative of the general trends from tone varying prompts. To study this problem, we paraphrase the prompts with different syntactic structures and use these to evaluate demographic bias in NLG systems. Our results suggest similar overall bias trends but some syntactic structures lead to contradictory conclusions compared to past works. We show that our methodology is more robust and that some syntactic structures prompt more toxic content while others could prompt less biased generation. This suggests the importance of not relying on a fixed syntactic structure and using tone-invariant prompts. Introducing syntactically-diverse prompts can achieve more robust NLG (bias) evaluation.

Submitted 3 December, 2022; originally announced December 2022.

Comments: EMNLP Findings 2022

arXiv:2211.11931 [pdf, other]

Layered-Garment Net: Generating Multiple Implicit Garment Layers from a Single Image

Authors: Alakh Aggarwal, Jikai Wang, Steven Hogue, Saifeng Ni, Madhukar Budagavi, Xiaohu Guo

Abstract: Recent research works have focused on generating human models and garments from their 2D images. However, state-of-the-art researches focus either on only a single layer of the garment on a human model or on generating multiple garment layers without any guarantee of the intersection-free geometric relationship between them. In reality, people wear multiple layers of garments in their daily life, where an inner layer of garment could be partially covered by an outer one. In this paper, we try to address this multi-layer modeling problem and propose the Layered-Garment Net (LGN) that is capable of generating intersection-free multiple layers of garments defined by implicit function fields over the body surface, given the person's near front-view image. With a special design of garment indication fields (GIF), we can enforce an implicit covering relationship between the signed distance fields (SDF) of different layers to avoid self-intersections among different garment surfaces and the human body. Experiments demonstrate the strength of our proposed LGN framework in generating multi-layer garments as compared to state-of-the-art methods. To the best of our knowledge, LGN is the first research work to generate intersection-free multiple layers of garments on the human body from a single image.

Submitted 21 November, 2022; originally announced November 2022.

Comments: 16th Asian Conference on Computer Vision (ACCV2022)

arXiv:2210.00868 [pdf, other]

doi 10.1016/j.cma.2022.115812

Strain energy density as a Gaussian process and its utilization in stochastic finite element analysis: application to planar soft tissues

Authors: Ankush Aggarwal, Bjørn Sand Jensen, Sanjay Pant, Chung-Hao Lee

Abstract: Data-based approaches are promising alternatives to the traditional analytical constitutive models for solid mechanics. Herein, we propose a Gaussian process (GP) based constitutive modeling framework, specifically focusing on planar, hyperelastic and incompressible soft tissues. The strain energy density of soft tissues is modeled as a GP, which can be regressed to experimental stress-strain data obtained from biaxial experiments. Moreover, the GP model can be weakly constrained to be convex. A key advantage of a GP-based model is that, in addition to the mean value, it provides a probability density (i.e. associated uncertainty) for the strain energy density. To simulate the effect of this uncertainty, a non-intrusive stochastic finite element analysis (SFEA) framework is proposed. The proposed framework is verified against an artificial dataset based on the Gasser--Ogden--Holzapfel model and applied to a real experimental dataset of a porcine aortic valve leaflet tissue. Results show that the proposed framework can be trained with limited experimental data and fits the data better than several existing models. The SFEA framework provides a straightforward way of using the experimental data and quantifying the resulting uncertainty in simulation-based predictions.

Submitted 22 November, 2022; v1 submitted 28 September, 2022; originally announced October 2022.

arXiv:2209.02941 [pdf, other]

Can GAN-induced Attribute Manipulations Impact Face Recognition?

Authors: Sudipta Banerjee, Aditi Aggarwal, Arun Ross

Abstract: Impact due to demographic factors such as age, sex, race, etc., has been studied extensively in automated face recognition systems. However, the impact of digitally modified demographic and facial attributes on face recognition is relatively under-explored. In this work, we study the effect of attribute manipulations induced via generative adversarial networks (GANs) on face recognition performance. We conduct experiments on the CelebA dataset by intentionally modifying thirteen attributes using AttGAN and STGAN and evaluating their impact on two deep learning-based face verification methods, ArcFace and VGGFace. Our findings indicate that some attribute manipulations involving eyeglasses and digital alteration of sex cues can significantly impair face recognition by up to 73% and need further analysis.

Submitted 7 September, 2022; originally announced September 2022.

arXiv:2208.05552 [pdf, other]

Towards Automating Retinoscopy for Refractive Error Diagnosis

Authors: Aditya Aggarwal, Siddhartha Gairola, Uddeshya Upadhyay, Akshay P Vasishta, Diwakar Rao, Aditya Goyal, Kaushik Murali, Nipun Kwatra, Mohit Jain

Abstract: Refractive error is the most common eye disorder and is the key cause behind correctable visual impairment, responsible for nearly 80% of the visual impairment in the US. Refractive error can be diagnosed using multiple methods, including subjective refraction, retinoscopy, and autorefractors. Although subjective refraction is the gold standard, it requires cooperation from the patient and hence is not suitable for infants, young children, and developmentally delayed adults. Retinoscopy is an objective refraction method that does not require any input from the patient. However, retinoscopy requires a lens kit and a trained examiner, which limits its use for mass screening. In this work, we automate retinoscopy by attaching a smartphone to a retinoscope and recording retinoscopic videos with the patient wearing a custom pair of paper frames. We develop a video processing pipeline that takes retinoscopic videos as input and estimates the net refractive error based on our proposed extension of the retinoscopy mathematical model. Our system alleviates the need for a lens kit and can be performed by an untrained examiner. In a clinical trial with 185 eyes, we achieved a sensitivity of 91.0% and specificity of 74.0% on refractive error diagnosis. Moreover, the mean absolute error of our approach was 0.75$\pm$0.67D on net refractive error estimation compared to subjective refraction measurements. Our results indicate that our approach has the potential to be used as a retinoscopy-based refractive error screening tool in real-world medical settings.

Submitted 10 August, 2022; originally announced August 2022.

Comments: This paper is accepted for publication in IMWUT 2022

arXiv:2205.06249 [pdf, ps, other]

Optimal-Degree Polynomial Approximations for Exponentials and Gaussian Kernel Density Estimation

Authors: Amol Aggarwal, Josh Alman

Abstract: For any real numbers $B \ge 1$ and $\delta \in (0, 1)$ and function $f: [0, B] \rightarrow \mathbb{R}$, let $d_{B; \delta} (f) \in \mathbb{Z}_{> 0}$ denote the minimum degree of a polynomial $p(x)$ satisfying $\sup_{x \in [0, B]} \big| p(x) - f(x) \big| < \delta$. In this paper, we provide precise asymptotics for $d_{B; \delta} (e^{-x})$ and $d_{B; \delta} (e^{x})$ in terms of both $B$ and $\delta$, improving both the previously known upper bounds and lower bounds. In particular, we show $$d_{B; \delta} (e^{-x}) = \Theta\left( \max \left\{ \sqrt{B \log(\delta^{-1})}, \frac{\log(\delta^{-1}) }{ \log(B^{-1} \log(\delta^{-1}))} \right\}\right), \text{ and}$$ $$d_{B; \delta} (e^{x}) = \Theta\left( \max \left\{ B, \frac{\log(\delta^{-1}) }{ \log(B^{-1} \log(\delta^{-1}))} \right\}\right).$$ Polynomial approximations for $e^{-x}$ and $e^x$ have applications to the design of algorithms for many problems, and our degree bounds show both the power and limitations of these algorithms. We focus in particular on the Batch Gaussian Kernel Density Estimation problem for $n$ sample points in $\Theta(\log n)$ dimensions with error $\delta = n^{-\Theta(1)}$. We show that the running time one can achieve depends on the square of the diameter of the point set, $B$, with a transition at $B = \Theta(\log n)$ mirroring the corresponding transition in $d_{B; \delta} (e^{-x})$:
- When $B=o(\log n)$, we give the first algorithm running in time $n^{1 + o(1)}$.
- When $B = \kappa \log n$ for a small constant $\kappa>0$, we give an algorithm running in time $n^{1 + O(\log \log \kappa^{-1} /\log \kappa^{-1})}$. The $\log \log \kappa^{-1} /\log \kappa^{-1}$ term in the exponent comes from analyzing the behavior of the leading constant in our computation of $d_{B; \delta} (e^{-x})$.
- When $B = \omega(\log n)$, we show that time $n^{2 - o(1)}$ is necessary assuming SETH.

Submitted 12 May, 2022; originally announced May 2022.

Comments: 27 pages, to appear in the 37th Computational Complexity Conference (CCC 2022)

arXiv:2205.04731 [pdf, other]

Explainable Data Imputation using Constraints

Authors: Sandeep Hans, Diptikalyan Saha, Aniya Aggarwal

Abstract: Data values in a dataset can be missing or anomalous due to mishandling or human error. Analysing data with missing values can create bias and affect the inferences. Several analysis methods, such as principal components analysis or singular value decomposition, require complete data. Many approaches impute numeric data and some do not consider dependency of attributes on other attributes, while some require human intervention and domain knowledge. We present a new algorithm for data imputation based on different data type values and their association constraints in data, which are not handled currently by any system. We show experimental results using different metrics comparing our algorithm with state of the art imputation techniques. Our algorithm not only imputes the missing values but also generates human readable explanations describing the significance of attributes used for every imputation.

Submitted 10 May, 2022; originally announced May 2022.

arXiv:2111.02161 [pdf, other]

Data Synthesis for Testing Black-Box Machine Learning Models

Authors: Diptikalyan Saha, Aniya Aggarwal, Sandeep Hans

Abstract: The increasing usage of machine learning models raises the question of the reliability of these models. The current practice of testing with limited data is often insufficient. In this paper, we provide a framework for automated test data synthesis to test black-box ML/DL models. We address an important challenge of generating realistic user-controllable data with model agnostic coverage criteria to test a varied set of properties, essentially to increase trust in machine learning models. We experimentally demonstrate the effectiveness of our technique.

Submitted 3 November, 2021; originally announced November 2021.

Comments: Accepted as a 4-pages short paper in Research track at CODS-COMAD 2022

arXiv:2108.05935 [pdf, other]

Data Quality Toolkit: Automatic assessment of data quality and remediation for machine learning datasets

Authors: Nitin Gupta, Hima Patel, Shazia Afzal, Naveen Panwar, Ruhi Sharma Mittal, Shanmukha Guttula, Abhinav Jain, Lokesh Nagalapatti, Sameep Mehta, Sandeep Hans, Pranay Lohia, Aniya Aggarwal, Diptikalyan Saha

Abstract: The quality of training data has a huge impact on the efficiency, accuracy and complexity of machine learning tasks. Various tools and techniques are available that assess data quality with respect to general cleaning and profiling checks. However, these techniques are not applicable to detecting data issues in the context of machine learning tasks, such as noisy labels or overlapping classes. We attempt to re-look at the data quality issues in the context of building a machine learning pipeline and build a tool that can detect, explain and remediate issues in the data, and systematically and automatically capture all the changes applied to the data. We introduce the Data Quality Toolkit for machine learning as a library of some key quality metrics and relevant remediation techniques to analyze and enhance the readiness of structured training datasets for machine learning projects. The toolkit can reduce the turn-around times of data preparation pipelines and streamline the data quality assessment process. Our toolkit is publicly available via the IBM API Hub [1] platform; any developer can assess data quality using IBM's Data Quality for AI APIs [2]. Detailed tutorials are also available on IBM Learning Path [3].

Submitted 5 September, 2021; v1 submitted 12 August, 2021; originally announced August 2021.

arXiv:2107.03022 [pdf, other]

Reconstructing Test Labels from Noisy Loss Functions

Authors: Abhinav Aggarwal, Shiva Prasad Kasiviswanathan, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier

Abstract: Machine learning classifiers rely on loss functions for performance evaluation, often on a private (hidden) dataset. In a recent line of research, label inference was introduced as the problem of reconstructing the ground truth labels of this private dataset from just the (possibly perturbed) cross-entropy loss function values evaluated at chosen prediction vectors (without any other access to the hidden dataset). In this paper, we formally study the necessary and sufficient conditions under which label inference is possible from \emph{any} (noisy) loss function value. Using tools from analytical number theory, we show that for a broad class of commonly used loss functions, including general Bregman divergence-based losses and multiclass cross-entropy with common activation functions like sigmoid and softmax, it is possible to design label inference attacks that succeed even for arbitrary noise levels and using only a single query from the adversary. We formally study the computational complexity of label inference and show that while in general, designing adversarial prediction vectors for these attacks is co-NP-hard, once we have these vectors, the attacks can also be carried out through a lightweight augmentation to any neural network model, making them look benign and hard to detect. The observations in this paper provide a deeper understanding of the vulnerabilities inherent in modern machine learning and could be used for designing future trustworthy ML.

Submitted 30 October, 2021; v1 submitted 7 July, 2021; originally announced July 2021.

Comments: Accepted at NeurIPS 2021 Workshop on Privacy in Machine Learning (PriML)

arXiv:2105.08266 [pdf, other]

Label Inference Attacks from Log-loss Scores

Authors: Abhinav Aggarwal, Shiva Prasad Kasiviswanathan, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier

Abstract: The log-loss (also known as cross-entropy loss) metric is ubiquitously used across machine learning applications to assess the performance of classification algorithms. In this paper, we investigate the problem of inferring the labels of a dataset from single (or multiple) log-loss score(s), without any other access to the dataset. Surprisingly, we show that for any finite number of label classes, it is possible to accurately infer the labels of the dataset from the reported log-loss score of a single carefully constructed prediction vector if we allow arbitrary precision arithmetic. Additionally, we present label inference algorithms (attacks) that succeed even under addition of noise to the log-loss scores and under limited precision arithmetic. All our algorithms rely on ideas from number theory and combinatorics and require no model training. We run experimental simulations on some real datasets to demonstrate the ease of running these attacks in practice.

Submitted 11 June, 2021; v1 submitted 18 May, 2021; originally announced May 2021.

Comments: Accepted at ICML 2021

arXiv:2105.02743 [pdf, other]

doi 10.1103/PhysRevA.104.022438

Random density matrices: Analytical results for mean root fidelity and mean square Bures distance

Authors: Aritra Laha, Agrim Aggarwal, Santosh Kumar

Abstract: Bures distance holds a special place among various distance measures due to its several distinguished features and finds applications in diverse problems in quantum information theory. It is related to fidelity and, among other things, it serves as a bona fide measure for quantifying the separability of quantum states. In this work, we calculate exact analytical results for the mean root fidelity and mean square Bures distance between a fixed density matrix and a random density matrix, and also between two random density matrices. In the course of the derivation, we also obtain the spectral density for the product of the above pairs of density matrices. We corroborate our analytical results using Monte Carlo simulations. Moreover, we compare these results with the mean square Bures distance between reduced density matrices generated using coupled kicked tops and find very good agreement.

Submitted 18 November, 2022; v1 submitted 6 May, 2021; originally announced May 2021.

Comments: 13 pages, 10 figures ; Published Version

MSC Class: 94A15; 62B10; 81P47; 60B20; 15B52

Journal ref: Phys. Rev. A 104, 022438 (2021)

arXiv:2104.11838 [pdf, other]

On a Utilitarian Approach to Privacy Preserving Text Generation

Authors: Zekun Xu, Abhinav Aggarwal, Oluwaseyi Feyisetan, Nathanael Teissier

Abstract: Differentially-private mechanisms for text generation typically add carefully calibrated noise to input words and use the nearest neighbor to the noised input as the output word. When the noise is small in magnitude, these mechanisms are susceptible to reconstruction of the original sensitive text. This is because the nearest neighbor to the noised input is likely to be the original input. To mitigate this empirical privacy risk, we propose a novel class of differentially private mechanisms that parameterizes the nearest neighbor selection criterion in traditional mechanisms. Motivated by the Vickrey auction, where only the second highest price is revealed and the highest price is kept private, we balance the choice between the first and the second nearest neighbors in the proposed class of mechanisms using a tuning parameter. This parameter is selected by empirically solving a constrained optimization problem for maximizing utility, while maintaining the desired privacy guarantees. We argue that this empirical measurement framework can be used to align different mechanisms along a common benchmark for their privacy-utility tradeoff, particularly when different distance metrics are used to calibrate the amount of noise added. Our experiments on real text classification datasets show up to 50% improvement in utility compared to the existing state-of-the-art with the same empirical privacy guarantee.

Submitted 23 April, 2021; originally announced April 2021.

Comments: 10 pages, 3 figures

arXiv:2102.09943 [pdf, other]

Towards Emotion Recognition in Hindi-English Code-Mixed Data: A Transformer Based Approach

Authors: Anshul Wadhawan, Akshita Aggarwal

Abstract: In the last few years, emotion detection in social-media text has become a popular problem due to its wide ranging application in better understanding the consumers, in psychology, in aiding human interaction with computers, designing smart systems etc. Because of the availability of huge amounts of data from social-media, which is regularly used for expressing sentiments and opinions, this problem has garnered great attention. In this paper, we present a Hinglish dataset labelled for emotion detection. We highlight a deep learning based approach for detecting emotions in Hindi-English code mixed tweets, using bilingual word embeddings derived from FastText and Word2Vec approaches, as well as transformer based models. We experiment with various deep learning models, including CNNs, LSTMs, Bi-directional LSTMs (with and without attention), along with transformers like BERT, RoBERTa, and ALBERT. The transformer based BERT model outperforms all other models giving the best performance with an accuracy of 71.43%.

Submitted 28 February, 2021; v1 submitted 19 February, 2021; originally announced February 2021.

arXiv:2102.06166 [pdf, other]

Testing Framework for Black-box AI Models

Authors: Aniya Aggarwal, Samiulla Shaikh, Sandeep Hans, Swastik Haldar, Rema Ananthanarayanan, Diptikalyan Saha

Abstract: With widespread adoption of AI models for important decision making, ensuring reliability of such models remains an important challenge. In this paper, we present an end-to-end generic framework for testing AI models which performs automated test generation for different modalities such as text, tabular, and time-series data and across various properties such as accuracy, fairness, and robustness. Our tool has been used for testing industrial AI models and was very effective at uncovering issues present in those models. Demo video link: https://youtu.be/984UCU17YZI

Submitted 11 February, 2021; originally announced February 2021.

Comments: 4 pages Demonstrations track paper accepted at ICSE 2021

arXiv:2012.05403 [pdf, other]

Research Challenges in Designing Differentially Private Text Generation Mechanisms

Authors: Oluwaseyi Feyisetan, Abhinav Aggarwal, Zekun Xu, Nathanael Teissier

Abstract: Accurately learning from user data while ensuring quantifiable privacy guarantees provides an opportunity to build better Machine Learning (ML) models while maintaining user trust. Recent literature has demonstrated the applicability of a generalized form of Differential Privacy to provide guarantees over text queries. Such mechanisms add privacy preserving noise to vectorial representations of text in high dimension and return a text based projection of the noisy vectors. However, these mechanisms are sub-optimal in their trade-off between privacy and utility. This is due to factors such as a fixed global sensitivity which leads to too much noise added in dense spaces while simultaneously guaranteeing protection for sensitive outliers. In this proposal paper, we describe some challenges in balancing the tradeoff between privacy and utility for these differentially private text mechanisms. At a high level, we provide two proposals: (1) a framework called LAC which defers some of the noise to a privacy amplification step and (2) an additional suite of three different techniques for calibrating the noise based on the local region around a word. Our objective in this paper is not to evaluate a single solution but to further the conversation on these challenges and chart pathways for building better mechanisms.

Submitted 9 December, 2020; originally announced December 2020.

Comments: 14 pages, 1 figure

arXiv:2010.11947 [pdf, other]

A Differentially Private Text Perturbation Method Using a Regularized Mahalanobis Metric

Authors: Zekun Xu, Abhinav Aggarwal, Oluwaseyi Feyisetan, Nathanael Teissier

Abstract: Balancing the privacy-utility tradeoff is a crucial requirement of many practical machine learning systems that deal with sensitive customer data. A popular approach for privacy-preserving text analysis is noise injection, in which text data is first mapped into a continuous embedding space, perturbed by sampling a spherical noise from an appropriate distribution, and then projected back to the discrete vocabulary space. While this allows the perturbation to admit the required metric differential privacy, often the utility of downstream tasks modeled on this perturbed data is low because the spherical noise does not account for the variability in the density around different words in the embedding space. In particular, words in a sparse region are likely unchanged even when the noise scale is large. Using the global sensitivity of the mechanism can potentially add too much noise to the words in the dense regions of the embedding space, causing a high utility loss, whereas using local sensitivity can leak information through the scale of the noise added. In this paper, we propose a text perturbation mechanism based on a carefully designed regularized variant of the Mahalanobis metric to overcome this problem. For any given noise scale, this metric adds an elliptical noise to account for the covariance structure in the embedding space. This heterogeneity in the noise scale along different directions helps ensure that the words in the sparse region have sufficient likelihood of replacement without sacrificing the overall utility. We provide a text-perturbation algorithm based on this metric and formally prove its privacy guarantees. Additionally, we empirically show that our mechanism improves the privacy statistics to achieve the same level of utility as compared to the state-of-the-art Laplace mechanism.

Submitted 22 October, 2020; originally announced October 2020.

Comments: 11 pages, 7 figures

arXiv:2010.00310 [pdf, other]

doi 10.18653/v1/2020.wnut-1.2

"Did you really mean what you said?" : Sarcasm Detection in Hindi-English Code-Mixed Data using Bilingual Word Embeddings

Authors: Akshita Aggarwal, Anshul Wadhawan, Anshima Chaudhary, Kavita Maurya

Abstract: With the increased use of social media platforms by people across the world, many new interesting NLP problems have come into existence. One such problem is the detection of sarcasm in social media texts. We present a corpus of tweets for training custom word embeddings and a Hinglish dataset labelled for sarcasm detection. We propose a deep learning based approach to address the issue of sarcasm detection in Hindi-English code mixed tweets using bilingual word embeddings derived from FastText and Word2Vec approaches. We experimented with various deep learning models, including CNNs, LSTMs, Bi-directional LSTMs (with and without attention). We were able to outperform all state-of-the-art performances with our deep learning models, with attention based Bi-directional LSTMs giving the best performance exhibiting an accuracy of 78.49%.

Submitted 15 October, 2020; v1 submitted 1 October, 2020; originally announced October 2020.

arXiv:2009.12718 [pdf, other]

Differentially Private Adversarial Robustness Through Randomized Perturbations

Authors: Nan Xu, Oluwaseyi Feyisetan, Abhinav Aggarwal, Zekun Xu, Nathanael Teissier

Abstract: Deep Neural Networks, despite their great success in diverse domains, are provably sensitive to small perturbations on correctly classified examples and lead to erroneous predictions. Recently, it was proposed that this behavior can be combatted by optimizing the worst case loss function over all possible substitutions of training examples. However, this can be prone to weighing unlikely substitutions higher, limiting the accuracy gain. In this paper, we study adversarial robustness through randomized perturbations, which has two immediate advantages: (1) by ensuring that substitution likelihood is weighted by the proximity to the original word, we circumvent optimizing the worst case guarantees and achieve performance gains; and (2) the calibrated randomness imparts differentially-private model training, which additionally improves robustness against adversarial attacks on the model outputs. Our approach uses a novel density-based mechanism based on truncated Gumbel noise, which ensures training on substitutions of both rare and dense words in the vocabulary while maintaining semantic similarity for model robustness.

Submitted 26 September, 2020; originally announced September 2020.

arXiv:2009.08559 [pdf, ps, other]

On Primes, Log-Loss Scores and (No) Privacy

Authors: Abhinav Aggarwal, Zekun Xu, Oluwaseyi Feyisetan, Nathanael Teissier

Abstract: Membership Inference Attacks exploit the vulnerabilities of exposing models trained on customer data to queries by an adversary. In a recently proposed implementation of an auditing tool for measuring privacy leakage from sensitive datasets, more refined aggregates like the Log-Loss scores are exposed for simulating inference attacks as well as to assess the total privacy leakage based on the adversary's predictions. In this paper, we prove that this additional information enables the adversary to infer the membership of any number of datapoints with full accuracy in a single query, causing complete membership privacy breach. Our approach obviates any attack model training or access to side knowledge with the adversary. Moreover, our algorithms are agnostic to the model under attack and hence, enable perfect membership inference even for models that do not memorize or overfit. In particular, our observations provide insight into the extent of information leakage from statistical aggregates and how they can be exploited.

Submitted 17 September, 2020; originally announced September 2020.

arXiv:2009.00156 [pdf, other]

LoCUS: A multi-robot loss-tolerant algorithm for surveying volcanic plumes

Authors: John Erickson, Abhinav Aggarwal, G. Matthew Fricke, Melanie E. Moses

Abstract: Measurement of volcanic CO2 flux by a drone swarm poses special challenges. Drones must be able to follow gas concentration gradients while tolerating frequent drone loss. We present the LoCUS algorithm as a solution to this problem and prove its robustness. LoCUS relies on swarm coordination and self-healing to solve the task. As a point of contrast we also implement the MoBS algorithm, derived from previously published work, which allows drones to solve the task independently. We compare the effectiveness of these algorithms using drone simulations, and find that LoCUS provides a reliable and efficient solution to the volcano survey problem. Further, the novel data-structures and algorithms underpinning LoCUS have application in other areas of fault-tolerant algorithm research.

Submitted 31 August, 2020; originally announced September 2020.

Comments: Accepted to IRC 2020 (8 pages, 7 figures)

arXiv:2007.12934 [pdf, other]

SOTERIA: In Search of Efficient Neural Networks for Private Inference

Authors: Anshul Aggarwal, Trevor E. Carlson, Reza Shokri, Shruti Tople

Abstract: ML-as-a-service is gaining popularity where a cloud server hosts a trained model and offers prediction (inference) service to users. In this setting, our objective is to protect the confidentiality of both the users' input queries as well as the model parameters at the server, with modest computation and communication overhead. Prior solutions primarily propose fine-tuning cryptographic methods to make them efficient for known fixed model architectures. The drawback with this line of approach is that the model itself is never designed to operate with existing efficient cryptographic computations. We observe that the network architecture, internal functions, and parameters of a model, which are all chosen during training, significantly influence the computation and communication overhead of a cryptographic method, during inference. Based on this observation, we propose SOTERIA -- a training method to construct model architectures that are by-design efficient for private inference. We use neural architecture search algorithms with the dual objective of optimizing the accuracy of the model and the overhead of using cryptographic primitives for secure inference. Given the flexibility of modifying a model during training, we find accurate models that are also efficient for private computation. We select garbled circuits as our underlying cryptographic primitive, due to their expressiveness and efficiency, but this approach can be extended to hybrid multi-party computation settings. We empirically evaluate SOTERIA on MNIST and CIFAR10 datasets, to compare with the prior work. Our results confirm that SOTERIA is indeed effective in balancing performance and accuracy.

Submitted 25 July, 2020; originally announced July 2020.

arXiv:2007.02072 [pdf, other]

Quo Vadis, Skeleton Action Recognition ?

Authors: Pranay Gupta, Anirudh Thatipelli, Aditya Aggarwal, Shubh Maheshwari, Neel Trivedi, Sourav Das, Ravi Kiran Sarvadevabhatla

Abstract: In this paper, we study current and upcoming frontiers across the landscape of skeleton-based human action recognition. To study skeleton-action recognition in the wild, we introduce Skeletics-152, a curated and 3-D pose-annotated subset of RGB videos sourced from Kinetics-700, a large-scale action dataset. We extend our study to include out-of-context actions by introducing Skeleton-Mimetics, a dataset derived from the recently introduced Mimetics dataset. We also introduce Metaphorics, a dataset with caption-style annotated YouTube videos of the popular social game Dumb Charades and interpretative dance performances. We benchmark state-of-the-art models on the NTU-120 dataset and provide multi-layered assessment of the results. The results from benchmarking the top performers of NTU-120 on the newly introduced datasets reveal the challenges and domain gap induced by actions in the wild. Overall, our work characterizes the strengths and limitations of existing approaches and datasets. Via the introduced datasets, our work enables new frontiers for human action recognition.

Submitted 7 April, 2021; v1 submitted 4 July, 2020; originally announced July 2020.

Comments: To appear in International Journal of Computer Vision (IJCV). Project page: https://skeleton.iiit.ac.in/

arXiv:2004.12232 [pdf, other]

Reconstruct, Rasterize and Backprop: Dense shape and pose estimation from a single image

Authors: Aniket Pokale, Aditya Aggarwal, K. Madhava Krishna

Abstract: This paper presents a new system to obtain dense object reconstructions along with 6-DoF poses from a single image. Geared towards high fidelity reconstruction, several recent approaches leverage implicit surface representations and deep neural networks to estimate a 3D mesh of an object, given a single image. However, all such approaches recover only the shape of an object; the reconstruction is often in a canonical frame, unsuitable for downstream robotics tasks. To this end, we leverage recent advances in differentiable rendering (in particular, rasterization) to close the loop with 3D reconstruction in camera frame. We demonstrate that our approach, dubbed reconstruct, rasterize and backprop (RRB), achieves significantly lower pose estimation errors compared to prior art, and is able to recover dense object shapes and poses from imagery. We further extend our results to an (offline) setup, where we demonstrate a dense monocular object-centric egomotion estimation system.

Submitted 25 April, 2020; originally announced April 2020.

Comments: 8 pages, 5 figures, 2 tables

arXiv:2004.05865 [pdf, other]

Detecting and Characterizing Extremist Reviewer Groups in Online Product Reviews

Authors: Viresh Gupta, Aayush Aggarwal, Tanmoy Chakraborty

Abstract: Online marketplaces often witness opinion spam in the form of reviews. People are often hired to target specific brands for promoting or impeding them by writing highly positive or negative reviews. This often is done collectively in groups. Although some previous studies attempted to identify and analyze such opinion spam groups, little has been explored to spot those groups who target a brand as a whole, instead of just products. In this paper, we collected reviews from the Amazon product review site and manually labelled a set of 923 candidate reviewer groups. The groups are extracted using frequent itemset mining over brand similarities such that users are clustered together if they have mutually reviewed (products of) a lot of brands. We hypothesize that the nature of the reviewer groups is dependent on 8 features specific to a (group, brand) pair. We develop a feature-based supervised model to classify candidate groups as extremist entities. We run multiple classifiers for the task of classifying a group based on the reviews written by the users of that group, to determine if the group shows signs of extremity. A 3-layer Perceptron based classifier turns out to be the best classifier. We further study the behaviours of such groups in detail to understand the dynamics of brand-level opinion fraud better. These behaviours include consistency in ratings, review sentiment, verified purchase, review dates and helpful votes received on reviews. Surprisingly, we observe that there are a lot of verified reviewers showing extreme sentiment, which on further investigation leads to ways to circumvent existing mechanisms in place to prevent unofficial incentives on Amazon.

Submitted 13 April, 2020; originally announced April 2020.

Comments: 6 figures, 5 tables, Accepted in IEEE Transactions on Computational Social Systems

arXiv:1912.10413 [pdf]

doi 10.33969/AIS.2019.11009

Hiding Data in Images Using Cryptography and Deep Neural Network

Authors: Kartik Sharma, Ashutosh Aggarwal, Tanay Singhania, Deepak Gupta, Ashish Khanna

Abstract: Steganography is the art of obscuring data inside another quotidian file of similar or varying types. Hiding data has always been of significant importance to digital forensics. Previously, steganography has been combined with cryptography and with neural networks separately. This research, by contrast, combines steganography and cryptography with neural networks to hide an image inside another container image of the same or larger size. Although the cryptographic technique used is quite simple, it is effective when convoluted with deep neural nets. Other steganography techniques hide data efficiently, but in a uniform pattern, which makes them less secure. This method targets both challenges and makes data hiding secure and non-uniform.

Submitted 1 January, 2020; v1 submitted 22 December, 2019; originally announced December 2019.

Comments: 20 pages, 9 figures, 5 tables

arXiv:1911.11974 [pdf, other]

On the Minimal Set of Inputs Required for Efficient Neuro-Evolved Foraging

Authors: John Erickson, Abhinav Aggarwal, Melanie E. Moses

Abstract: In this paper, we perform an ablation study of \neatfa, a neuro-evolved foraging algorithm that has recently been shown to forage efficiently under different resource distributions. Through selective disabling of input signals, we identify a \emph{sufficiently} minimal set of input features that contribute the most towards determining search trajectories which favor high resource collection rates. Our experiments reveal that, independent of how the resources are distributed in the arena, the signals involved in imparting the controller the ability to switch from searching of resources to transporting them back to the nest are the most critical. Additionally, we find that pheromones play a key role in boosting performance of the controller by providing signals for informed locomotion in search for unforaged resources.

Submitted 27 November, 2019; originally announced November 2019.

Comments: Presented at BDA 2019 (co-located with PODC 2019)

arXiv:1911.11973 [pdf, other]

A Most Irrational Foraging Algorithm

Authors: Abhinav Aggarwal, William F. Vining, Diksha Gupta, Jared Saia, Melanie E. Moses

Abstract: We present a foraging algorithm, GoldenFA, in which search direction is chosen based on the Golden Ratio. We show both theoretically and empirically that GoldenFA is more efficient for a single searcher than a comparable algorithm where search direction is chosen uniformly at random. Moreover, we give a variant of our algorithm that parallelizes linearly with the number of searchers.

Submitted 27 November, 2019; originally announced November 2019.

Comments: Presented at BDA 2019 (co-located with PODC 2019)

arXiv:1909.09771 [pdf, ps, other]

Multithreaded Filtering Preconditioner for Diffusion Equation on Structured Grid

Authors: Abhinav Aggarwal, Shivam Kakkar, Pawan Kumar

Abstract: A parallel and nested version of a frequency filtering preconditioner is proposed for linear systems corresponding to the diffusion equation on a structured grid. The proposed preconditioner is found to be robust with respect to jumps in the diffusion coefficients. The storage requirement for the preconditioner is O(N), where N is the number of rows of the matrix; hence, a fairly large problem of more than 42 million unknowns has been solved on a quad core machine with 64GB RAM. The parallelism is achieved using twisted factorization and SIMD operations. The preconditioner achieves a speedup of 3.3 times on a quad core processor clocked at 4.2 GHz, and compared to a well known algebraic multigrid method, it is significantly faster in both setup and solve times for diffusion equations with jumps.

Submitted 20 September, 2021; v1 submitted 21 September, 2019; originally announced September 2019.

Comments: 9 pages

arXiv:1903.00087 [pdf, other]

doi 10.1007/978-981-13-7403-6_31

Broad Neural Network for Change Detection in Aerial Images

Authors: Shailesh Shrivastava, Alakh Aggarwal, Pratik Chattopadhyay

Abstract: A change detection system takes as input two images of a region captured at two different times, and predicts which pixels in the region have undergone change over the time period. Since pixel-based analysis can be erroneous due to noise, illumination difference and other factors, contextual information is usually used to determine the class of a pixel (changed or not). This contextual information is taken into account by considering a pixel of the difference image along with its neighborhood. With the help of ground truth information, the labeled patterns are generated. Finally, Broad Learning classifier is used to get prediction about the class of each pixel. Results show that Broad Learning can classify the data set with a significantly higher F-Score than that of Multilayer Perceptron. Performance comparison has also been made with other popular classifiers, namely Multilayer Perceptron and Random Forest.

Submitted 20 July, 2019; v1 submitted 28 February, 2019; originally announced March 2019.

Comments: Accepted at: IEMGraph (International Conference on Emerging Technology in Modelling and Graphics) 2018. Date of conference: 6-7 September, 2018. Location of conference: Kolkata, India

Journal ref: Advances in Intelligent Systems and Computing book series (AISC, volume 937), Springer, Singapore, 2019

arXiv:1808.02113 [pdf, other]

Paying Attention to Attention: Highlighting Influential Samples in Sequential Analysis

Authors: Cynthia Freeman, Jonathan Merriman, Abhinav Aggarwal, Ian Beaver, Abdullah Mueen

Abstract: In (Yang et al. 2016), a hierarchical attention network (HAN) is created for document classification. The attention layer can be used to visualize text influential in classifying the document, thereby explaining the model's prediction. We successfully applied HAN to a sequential analysis task in the form of real-time monitoring of turn taking in conversations. However, we discovered instances where the attention weights were uniform at the stopping point (indicating all turns were equivalently influential to the classifier), preventing meaningful visualization for real-time human review or classifier improvement. We observed that attention weights for turns fluctuated as the conversations progressed, indicating turns had varying influence based on conversation state. Leveraging this observation, we develop a method to create more informative real-time visuals (as confirmed by human reviewers) in cases of uniform attention weights using the changes in turn importance as a conversation progresses over time.

Submitted 6 August, 2018; originally announced August 2018.

arXiv:1802.03625 [pdf, other]

The Follower Count Fallacy: Detecting Twitter Users with Manipulated Follower Count

Authors: Anupama Aggarwal, Saravana Kumar, Kushagra Bhargava, Ponnurangam Kumaraguru

Abstract: Online Social Networks (OSN) are increasingly being used as a platform for effective communication, to engage with other users, and to create social worth via the number of likes, followers and shares. Such metrics and crowd-sourced ratings give the OSN user a sense of social reputation which she tries to maintain and boost to be more influential. Users artificially bolster their social reputation via black-market web services. In this work, we identify users who manipulate their projected follower count using an unsupervised local neighborhood detection method. We identify a neighborhood of the user based on a robust set of features which reflect user similarity in terms of the expected follower count. We show that follower count estimation using our method has 84.2% accuracy with a low error rate. In addition, we estimate the follower count of the user under suspicion by finding its neighborhood drawn from a large random sample of Twitter. We show that our method is highly tolerant to synthetic manipulation of followers. Using the deviation of predicted follower count from the displayed count, we are also able to detect customers with a high precision of 98.62%.

Submitted 10 February, 2018; originally announced February 2018.

Comments: Accepted at ACM SAC'18

arXiv:1802.01548 [pdf, other]

Regularized Evolution for Image Classifier Architecture Search

Authors: Esteban Real, Alok Aggarwal, Yanping Huang, Quoc V Le

Abstract: The effort devoted to hand-crafting neural network image classifiers has motivated the use of architecture search to discover them automatically. Although evolutionary algorithms have been repeatedly applied to neural network topologies, the image classifiers thus discovered have remained inferior to human-crafted ones. Here, we evolve an image classifier---AmoebaNet-A---that surpasses hand-designs for the first time. To do this, we modify the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes. Matching size, AmoebaNet-A has comparable accuracy to current state-of-the-art ImageNet models discovered with more complex architecture-search methods. Scaled to larger size, AmoebaNet-A sets a new state-of-the-art 83.9% top-1 / 96.6% top-5 ImageNet accuracy. In a controlled comparison against a well known reinforcement learning algorithm, we give evidence that evolution can obtain results faster with the same hardware, especially at the earlier stages of the search. This is relevant when fewer compute resources are available. Evolution is, thus, a simple method to effectively discover high-quality architectures.

Submitted 16 February, 2019; v1 submitted 5 February, 2018; originally announced February 2018.

Comments: Accepted for publication at AAAI 2019, the Thirty-Third AAAI Conference on Artificial Intelligence

ACM Class: I.2.6; I.5.1; I.5.2
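
The age-based tournament selection mentioned in the abstract above can be illustrated with a small sketch. This is only a toy under stated assumptions, not the authors' implementation: the function names (`regularized_evolution`, `onemax`, `flip_one_bit`, `random_bits`), the bit-string search space, and all parameter values are hypothetical stand-ins for an architecture search space and a trained model's accuracy. The one idea taken from the abstract is that each cycle discards the oldest individual rather than the worst one, which favors younger genotypes.

```python
import collections
import random


def regularized_evolution(fitness, mutate, random_individual,
                          population_size=20, sample_size=5,
                          cycles=200, seed=0):
    """Toy tournament selection with an age property (hypothetical sketch)."""
    rng = random.Random(seed)
    population = collections.deque()  # left end = oldest individual
    history = []                      # every individual ever evaluated

    # Fill the initial population with random individuals.
    while len(population) < population_size:
        individual = random_individual(rng)
        entry = (individual, fitness(individual))
        population.append(entry)
        history.append(entry)

    for _ in range(cycles):
        # Tournament: draw a random sample and pick its fittest member.
        sample = rng.sample(list(population), sample_size)
        parent = max(sample, key=lambda pair: pair[1])

        # The child is a mutated copy of the tournament winner.
        child = mutate(parent[0], rng)
        entry = (child, fitness(child))
        population.append(entry)
        history.append(entry)

        # Age property: remove the OLDEST individual, not the worst one.
        population.popleft()

    best = max(history, key=lambda pair: pair[1])
    return best, population


# Hypothetical toy problem: maximize the number of ones in a bit string.
def onemax(bits):
    return sum(bits)


def flip_one_bit(bits, rng):
    i = rng.randrange(len(bits))
    return bits[:i] + (1 - bits[i],) + bits[i + 1:]


def random_bits(rng, n=16):
    return tuple(rng.randint(0, 1) for _ in range(n))


best, final_population = regularized_evolution(onemax, flip_one_bit, random_bits)
print("best individual:", best[0], "fitness:", best[1])
```

Because removal depends on age alone, a lucky but poorly generalizing individual cannot stay in the population indefinitely; it survives only through its descendants.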

arXiv:1710.04142 [pdf, other]

Bollywood Movie Corpus for Text, Images and Videos

Authors: Nishtha Madaan, Sameep Mehta, Mayank Saxena, Aditi Aggarwal, Taneea S Agrawaal, Vrinda Malhotra

Abstract: In the past few years, several data-sets have been released for text and images. We present an approach to create the data-set for use in detecting and removing gender bias from text. We also include a set of challenges we have faced while creating this corpus. In this work, we have worked with movie data from Wikipedia plots and movie trailers from YouTube. Our Bollywood Movie corpus contains 4000 movies extracted from Wikipedia and 880 trailers extracted from YouTube which were released from 1970-2017. The corpus contains csv files with the following data about each movie - Wikipedia title of movie, cast, plot text, co-referenced plot text, soundtrack information, link to movie poster, caption of movie poster, number of males in poster, number of females in poster. In addition to that, corresponding to each cast member the following data is available - cast name, cast gender, cast verbs, cast adjectives, cast relations, cast centrality, cast mentions. We present some preliminary results on the task of bias removal which suggest that the data-set is quite useful for performing such tasks.

Submitted 11 October, 2017; originally announced October 2017.

arXiv:1710.04117 [pdf, other]

Analyzing Gender Stereotyping in Bollywood Movies

Authors: Nishtha Madaan, Sameep Mehta, Taneea S Agrawaal, Vrinda Malhotra, Aditi Aggarwal, Mayank Saxena

Abstract: The presence of gender stereotypes in many aspects of society is a well-known phenomenon. In this paper, we focus on studying such stereotypes and bias in the Hindi movie industry (Bollywood). We analyze movie plots and posters for all movies released since 1970. The gender bias is detected by semantic modeling of plots at inter-sentence and intra-sentence level. Different features like occupation, introduction of cast in text, associated actions and descriptions are captured to show the pervasiveness of gender bias and stereotype in movies. We derive a semantic graph and compute centrality of each character and observe similar bias there. We also show that such bias is not applicable for movie posters where females get equal importance even though their character has little or no impact on the movie plot. Furthermore, we explore the movie trailers to estimate on-screen time for males and females and also study the portrayal of emotions by gender in them. The silver lining is that our system was able to identify 30 movies over the last 3 years where such stereotypes were broken.

Submitted 11 October, 2017; originally announced October 2017.

arXiv:1709.04569 [pdf, other]

REMOTEGATE: Incentive-Compatible Remote Configuration of Security Gateways

Authors: Abhinav Aggarwal, Mahdi Zamani, Mihai Christodorescu

Abstract: Imagine that a malicious hacker is trying to attack a server over the Internet and the server wants to block the attack packets as close to their point of origin as possible. However, the security gateway ahead of the source of attack is untrusted. How can the server block the attack packets through this gateway? In this paper, we introduce REMOTEGATE, a trustworthy mechanism for allowing any party (server) on the Internet to configure a security gateway owned by a second party, at a certain agreed upon reward that the former pays to the latter for its service. We take an interactive incentive-compatible approach, for the case when both the server and the gateway are rational, to devise a protocol that will allow the server to help the security gateway generate and deploy a policy rule that filters the attack packets before they reach the server. The server will reward the gateway only when the latter can successfully verify that it has generated and deployed the correct rule for the issue. This mechanism will enable an Internet-scale approach to improving security and privacy, backed by digital payment incentives.

Submitted 13 September, 2017; originally announced September 2017.

Comments: Working manuscript

arXiv:1708.05640 [pdf, ps, other]

A similarity criterion for sequential programs using truth-preserving partial functions

Authors: Abhinav Aggarwal

Abstract: The execution of sequential programs allows them to be represented using mathematical functions formed by the composition of statements following one after the other. Each such statement is in itself a partial function, which allows only inputs satisfying a particular Boolean condition to carry forward the execution and hence, the composition of such functions (as a result of sequential execution of the statements) strengthens the valid set of input state variables for the program to complete its execution and halt successfully. With this thought in mind, this paper tries to study a particular class of partial functions, which tend to preserve the truth of two given Boolean conditions whenever the state variables satisfying one are mapped through such functions into a domain of state variables satisfying the other. The existence of such maps allows us to study isomorphism between different programs, based not only on their structural characteristics (e.g. the kind of programming constructs used and the overall input-output transformation), but also the nature of computation performed on seemingly different inputs. Consequently, we can now relate programs which perform a given type of computation, like a loop counting down indefinitely, without caring about the input sets they work on individually or the set of statements each program contains.

Submitted 18 August, 2017; originally announced August 2017.

Comments: Submitted as term paper in 2014

arXiv:1708.04668 [pdf, other]

Beating the Multiplicative Weights Update Algorithm

Authors: Abhinav Aggarwal, José Abel Castellanos Joo, Diksha Gupta

Abstract: Multiplicative weights update algorithms have been used extensively in designing iterative algorithms for many computational tasks. The core idea is to maintain a distribution over a set of experts and update this distribution in an online fashion based on the parameters of the underlying optimization problem. In this report, we study the behavior of a special MWU algorithm used for generating a global coin flip in the presence of an adversary that tampers with the experts' advice. Specifically, we focus our attention on two adversarial strategies: (1) non-adaptive, in which the adversary chooses a fixed set of experts a priori and corrupts their advice in each round; and (2) adaptive, in which this set is chosen as the rounds of the algorithm progress. We formulate these adversarial strategies as being greedy in terms of trying to maximize the share of the corrupted experts in the final weighted advice the MWU computes and provide the underlying optimization problem that needs to be solved to achieve this goal. We provide empirical results to show that in the presence of either of the above adversaries, the MWU algorithm takes $\mathcal{O}(n)$ rounds in expectation to produce the desired output. This result compares well with the current state of the art of $\mathcal{O}(n^3)$ for the general Byzantine consensus problem. Finally, we briefly discuss the extension of these adversarial strategies for a general MWU algorithm and provide an outline for the framework in that setting.

Submitted 15 August, 2017; originally announced August 2017.

Comments: Work done as part of UNM CS506 term paper
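
The core idea stated in the abstract above, maintaining a distribution over experts and updating it online, can be sketched with the textbook multiplicative weights update rule. This is an illustration under assumptions, not the special coin-flip variant the report studies: the function name `multiplicative_weights`, the learning rate `eta`, and the loss values are hypothetical, and losses are assumed to lie in [0, 1] with `eta` at most 1/2.

```python
def multiplicative_weights(loss_rounds, eta=0.5):
    """Textbook MWU sketch: returns the final distribution over experts.

    loss_rounds is a list of rounds; each round is a list holding one
    loss in [0, 1] per expert (hypothetical input format).
    """
    n = len(loss_rounds[0])
    weights = [1.0] * n  # start from uniform weights
    for losses in loss_rounds:
        # Shrink each expert's weight in proportion to its loss this round.
        weights = [w * (1.0 - eta * loss) for w, loss in zip(weights, losses)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical example: expert 0 is always right, experts 1 and 2 always wrong.
rounds = [[0.0, 1.0, 1.0]] * 5
dist = multiplicative_weights(rounds)
print(dist)
```

After a few rounds nearly all of the probability mass sits on the consistently correct expert, which is the behavior an adversary corrupting expert advice would try to exploit or delay.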

arXiv:1705.00145 [pdf, other]

Replica Placement on Bounded Treewidth Graphs

Authors: Anshul Aggarwal, Venkatesan T. Chakaravarthy, Neelima Gupta, Yogish Sabharwal, Sachin Sharma, Sonika Thakral

Abstract: We consider the replica placement problem: given a graph with clients and nodes, place replicas on a minimum set of nodes to serve all the clients; each client is associated with a request and maximum distance that it can travel to get served and there is a maximum limit (capacity) on the amount of request a replica can serve. The problem falls under the general framework of capacitated set covering. It admits an $O(\log n)$-approximation and it is NP-hard to approximate within a factor of $o(\log n)$. We study the problem in terms of the treewidth $t$ of the graph and present an $O(t)$-approximation algorithm.

Submitted 10 September, 2017; v1 submitted 29 April, 2017; originally announced May 2017.

Comments: An abridged version of this paper is to appear in the proceedings of WADS'17

arXiv:1612.05943 [pdf, other]

Distributed Computing with Channel Noise

Authors: Abhinav Aggarwal, Varsha Dani, Thomas P. Hayes, Jared Saia

Abstract: A group of $n$ users want to run a distributed protocol $π$ over a network where communication occurs via private point-to-point channels. Unfortunately, an adversary, who knows $π$, is able to maliciously flip bits on the channels. Can we efficiently simulate $π$ in the presence of such an adversary? We show that this is possible, even when $L$, the number of bits sent in $π$, and $T$, the number of bits flipped by the adversary are not known in advance. In particular, we show how to create a robust version of $π$ that 1) fails with probability at most $δ$, for any $δ>0$; and 2) sends $\tilde{O}(L + T)$ bits, where the $\tilde{O}$ notation hides a $\log (nL/δ)$ term multiplying $L$. Additionally, we show how to improve this result when the average message size $α$ is not constant. In particular, we give an algorithm that sends $O( L (1 + (1/α) \log (n L/δ)) + T)$ bits. This algorithm is adaptive in that it does not require a priori knowledge of $α$. We note that if $α$ is $Ω\left( \log (n L/δ) \right)$, then this improved algorithm sends only $O(L+T)$ bits, and is therefore within a constant factor of optimal.

Submitted 24 July, 2017; v1 submitted 18 December, 2016; originally announced December 2016.

Comments: 29 pages, 6 figures

arXiv:1612.00766 [pdf, other]

I Spy with My Little Eye: Analysis and Detection of Spying Browser Extensions

Authors: Anupama Aggarwal, Bimal Viswanath, Saravana Kumar, Ayush Shah, Liang Zhang, Ponnurangam Kumaraguru

Abstract: Several studies have been conducted on understanding third-party user tracking on the web. However, web trackers can only track users on sites where they are embedded by the publisher, thus obtaining a fragmented view of a user's online footprint. In this work, we investigate a different form of user tracking, where browser extensions are repurposed to capture the complete online activities of a user and communicate the collected sensitive information to a third-party domain. We conduct an empirical study of spying browser extensions on the Chrome Web Store. First, we present an in-depth analysis of the spying behavior of these extensions. We observe that these extensions steal a variety of sensitive user information, such as the complete browsing history (e.g., the sequence of web traversals), online social network (OSN) access tokens, IP address, and user geolocation. Second, we investigate the potential for automatically detecting spying extensions by applying machine learning schemes. We show that using a Recurrent Neural Network (RNN), the sequences of browser API calls can be a robust feature, outperforming hand-crafted features (used in prior work on malicious extensions) to detect spying extensions. Our RNN based detection scheme achieves a high precision (90.02%) and recall (93.31%) in detecting spying extensions.

Submitted 3 May, 2018; v1 submitted 2 December, 2016; originally announced December 2016.

Showing 1–50 of 64 results for author: Aggarwal, A