GC-Bench: A Benchmark Framework for Graph Condensation with New Insights

Authors: Shengbo Gong, Juntong Ni, Noveen Sachdeva, Carl Yang, Wei Jin

Abstract: Graph condensation (GC) is an emerging technique designed to learn a significantly smaller graph that retains the essential information of the original graph. This condensed graph has shown promise in accelerating graph neural networks while preserving performance comparable to that achieved with the original, larger graphs. Additionally, this technique facilitates downstream applications such as neural architecture search and enhances our understanding of redundancy in large graphs. Despite the rapid development of GC methods, a systematic evaluation framework, which is necessary to clarify the critical designs for particular evaluative aspects, remains absent. Furthermore, several meaningful questions have not been investigated, such as whether GC inherently preserves certain graph properties and offers robustness even without targeted design efforts. In this paper, we introduce GC-Bench, a comprehensive framework to evaluate recent GC methods across multiple dimensions and to generate new insights. Our experimental findings provide deeper insights into the GC process and the characteristics of condensed graphs, guiding future efforts in enhancing performance and exploring new applications. Our code is available at https://github.com/Emory-Melody/GraphSlim/tree/main/benchmark.

Submitted 24 June, 2024; originally announced June 2024.

Comments: 9 pages

arXiv:2402.09668 [pdf, other]

How to Train Data-Efficient LLMs

Authors: Noveen Sachdeva, Benjamin Coleman, Wang-Cheng Kang, Jianmo Ni, Lichan Hong, Ed H. Chi, James Caverlee, Julian McAuley, Derek Zhiyuan Cheng

Abstract: The training of large language models (LLMs) is expensive. In this paper, we study data-efficient approaches for pre-training LLMs, i.e., techniques that aim to optimize the Pareto frontier of model quality and training resource/data consumption. We seek to understand the tradeoffs associated with data selection routines based on (i) expensive-to-compute data-quality estimates, and (ii) maximization of coverage and diversity-based measures in the feature space. Our first technique, Ask-LLM, leverages the zero-shot reasoning capabilities of instruction-tuned LLMs to directly assess the quality of a training example. To target coverage, we propose Density sampling, which models the data distribution to select a diverse sample. In our comparison of 19 samplers, involving hundreds of evaluation tasks and pre-training runs, we find that Ask-LLM and Density are the best methods in their respective categories. Coverage sampling can recover the performance of the full data, while models trained on Ask-LLM data consistently outperform full-data training -- even when we reject 90% of the original dataset, while converging up to 70% faster.

Submitted 14 February, 2024; originally announced February 2024.

Comments: Under review. 44 pages, 30 figures

arXiv:2310.15433 [pdf, other]

Off-Policy Evaluation for Large Action Spaces via Policy Convolution

Authors: Noveen Sachdeva, Lequn Wang, Dawen Liang, Nathan Kallus, Julian McAuley

Abstract: Developing accurate off-policy estimators is crucial for both evaluating and optimizing for new policies. The main challenge in off-policy estimation is the distribution shift between the logging policy that generates data and the target policy that we aim to evaluate. Typically, techniques for correcting distribution shift involve some form of importance sampling. This approach results in unbiased value estimation but often comes with the trade-off of high variance, even in the simpler case of one-step contextual bandits. Furthermore, importance sampling relies on the common support assumption, which becomes impractical when the action space is large. To address these challenges, we introduce the Policy Convolution (PC) family of estimators. These methods leverage latent structure within actions -- made available through action embeddings -- to strategically convolve the logging and target policies. This convolution introduces a unique bias-variance trade-off, which can be controlled by adjusting the amount of convolution. Our experiments on synthetic and benchmark datasets demonstrate remarkable mean squared error (MSE) improvements when using PC, especially when either the action space or policy mismatch becomes large, with gains of up to 5 - 6 orders of magnitude over existing estimators.

Submitted 23 October, 2023; originally announced October 2023.

Comments: Under review. 36 pages, 31 figures

arXiv:2310.11266 [pdf]

Emulating Human Cognitive Processes for Expert-Level Medical Question-Answering with Large Language Models

Authors: Khushboo Verma, Marina Moore, Stephanie Wottrich, Karla Robles López, Nishant Aggarwal, Zeel Bhatt, Aagamjit Singh, Bradford Unroe, Salah Basheer, Nitish Sachdeva, Prinka Arora, Harmanjeet Kaur, Tanupreet Kaur, Tevon Hood, Anahi Marquez, Tushar Varshney, Nanfu Deng, Azaan Ramani, Pawanraj Ishwara, Maimoona Saeed, Tatiana López Velarde Peña, Bryan Barksdale, Sushovan Guha, Satwant Kumar

Abstract: In response to the pressing need for advanced clinical problem-solving tools in healthcare, we introduce BooksMed, a novel framework based on a Large Language Model (LLM). BooksMed uniquely emulates human cognitive processes to deliver evidence-based and reliable responses, utilizing the GRADE (Grading of Recommendations, Assessment, Development, and Evaluations) framework to effectively quantify evidence strength. For clinical decision-making to be appropriately assessed, an evaluation metric that is clinically aligned and validated is required. As a solution, we present ExpertMedQA, a multispecialty clinical benchmark comprised of open-ended, expert-level clinical questions, and validated by a diverse group of medical professionals. By demanding an in-depth understanding and critical appraisal of up-to-date clinical literature, ExpertMedQA rigorously evaluates LLM performance. BooksMed outperforms existing state-of-the-art models Med-PaLM 2, Almanac, and ChatGPT in a variety of medical scenarios. Therefore, a framework that mimics human cognitive stages could be a useful tool for providing reliable and evidence-based responses to clinical inquiries.

Submitted 17 October, 2023; originally announced October 2023.

arXiv:2310.09983 [pdf, other]

Farzi Data: Autoregressive Data Distillation

Authors: Noveen Sachdeva, Zexue He, Wang-Cheng Kang, Jianmo Ni, Derek Zhiyuan Cheng, Julian McAuley

Abstract: We study data distillation for auto-regressive machine learning tasks, where the input and output have a strict left-to-right causal structure. More specifically, we propose Farzi, which summarizes an event sequence dataset into a small number of synthetic sequences -- Farzi Data -- which are optimized to maintain (if not improve) model performance compared to training on the full dataset. Under the hood, Farzi conducts memory-efficient data distillation by (i) deriving efficient reverse-mode differentiation of the Adam optimizer by leveraging Hessian-Vector Products; and (ii) factorizing the high-dimensional discrete event-space into a latent-space which provably promotes implicit regularization. Empirically, for sequential recommendation and language modeling tasks, we are able to achieve 98-120% of downstream full-data performance when training state-of-the-art models on Farzi Data of size as little as 0.1% of the original dataset. Notably, being able to train better models with significantly less data sheds light on the design of future large auto-regressive models, and opens up new opportunities to further scale up model and data sizes.

Submitted 15 October, 2023; originally announced October 2023.

Comments: Under review. 23 pages, 9 figures

arXiv:2301.04272 [pdf, other]

Data Distillation: A Survey

Authors: Noveen Sachdeva, Julian McAuley

Abstract: The popularity of deep learning has led to the curation of a vast number of massive and multifarious datasets. Despite having close-to-human performance on individual tasks, training parameter-hungry models on large datasets poses multi-faceted problems such as (a) high model-training time; (b) slow research iteration; and (c) poor eco-sustainability. As an alternative, data distillation approaches aim to synthesize terse data summaries, which can serve as effective drop-in replacements of the original dataset for scenarios like model training, inference, architecture search, etc. In this survey, we present a formal framework for data distillation, along with providing a detailed taxonomy of existing approaches. Additionally, we cover data distillation approaches for different data modalities, namely images, graphs, and user-item interactions (recommender systems), while also identifying current challenges and future research directions.

Submitted 26 September, 2023; v1 submitted 10 January, 2023; originally announced January 2023.

Comments: Accepted at TMLR '23. 21 pages, 4 figures

arXiv:2206.02626 [pdf, other]

Infinite Recommendation Networks: A Data-Centric Approach

Authors: Noveen Sachdeva, Mehak Preet Dhaliwal, Carole-Jean Wu, Julian McAuley

Abstract: We leverage the Neural Tangent Kernel and its equivalence to training infinitely-wide neural networks to devise $\infty$-AE: an autoencoder with infinitely-wide bottleneck layers. The outcome is a highly expressive yet simplistic recommendation model with a single hyper-parameter and a closed-form solution. Leveraging $\infty$-AE's simplicity, we also develop Distill-CF for synthesizing tiny, high-fidelity data summaries which distill the most important knowledge from the extremely large and sparse user-item interaction matrix for efficient and accurate subsequent data-usage like model training, inference, architecture search, etc. This takes a data-centric approach to recommendation, where we aim to improve the quality of logged user-feedback data for subsequent modeling, independent of the learning algorithm. We particularly utilize the concept of differentiable Gumbel-sampling to handle the inherent data heterogeneity, sparsity, and semi-structuredness, while being scalable to datasets with hundreds of millions of user-item interactions. Both of our proposed approaches significantly outperform their respective state-of-the-art and when used together, we observe 96-105% of $\infty$-AE's performance on the full dataset with as little as 0.1% of the original dataset size, leading us to explore the counter-intuitive question: Is more data what you need for better recommendation?

Submitted 12 October, 2022; v1 submitted 2 June, 2022; originally announced June 2022.

Comments: Published at NeurIPS '22. $\infty$-AE code available at https://github.com/noveens/infinite_ae_cf and Distill-CF code available at https://github.com/noveens/distill_cf

arXiv:2201.04768 [pdf, other]

doi 10.1145/3488560.3498439

On Sampling Collaborative Filtering Datasets

Authors: Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

Abstract: We study the practical consequences of dataset sampling strategies on the ranking performance of recommendation algorithms. Recommender systems are generally trained and evaluated on samples of larger datasets. Samples are often taken in a naive or ad-hoc fashion: e.g. by sampling a dataset randomly or by selecting users or items with many interactions. As we demonstrate, commonly-used data sampling schemes can have significant consequences on algorithm performance. Following this observation, this paper makes three main contributions: (1) characterizing the effect of sampling on algorithm performance, in terms of algorithm and dataset characteristics (e.g. sparsity characteristics, sequential dynamics, etc.); (2) designing SVP-CF, which is a data-specific sampling strategy, that aims to preserve the relative performance of models after sampling, and is especially suited to long-tailed interaction data; and (3) developing an oracle, Data-Genie, which can suggest the sampling scheme that is most likely to preserve model performance for a given dataset. The main benefit of Data-Genie is that it will allow recommender system practitioners to quickly prototype and compare various approaches, while remaining confident that algorithm performance will be preserved, once the algorithm is retrained and deployed on the complete data. Detailed experiments show that using Data-Genie, we can discard up to 5x more data than any sampling strategy with the same level of performance.

Submitted 12 January, 2022; originally announced January 2022.

Comments: 9 pages, 4 figures, accepted for publication at WSDM '22. arXiv admin note: substantial text overlap with arXiv:2107.04984

arXiv:2108.00261 [pdf, other]

doi 10.1145/3442381.3449815

ECLARE: Extreme Classification with Label Graph Correlations

Authors: Anshul Mittal, Noveen Sachdeva, Sheshansh Agrawal, Sumeet Agarwal, Purushottam Kar, Manik Varma

Abstract: Deep extreme classification (XC) seeks to train deep architectures that can tag a data point with its most relevant subset of labels from an extremely large label set. The core utility of XC comes from predicting labels that are rarely seen during training. Such rare labels hold the key to personalized recommendations that can delight and surprise a user. However, the large number of rare labels and small amount of training data per rare label offer significant statistical and computational challenges. State-of-the-art deep XC methods attempt to remedy this by incorporating textual descriptions of labels but do not adequately address the problem. This paper presents ECLARE, a scalable deep learning architecture that incorporates not only label text, but also label correlations, to offer accurate real-time predictions within a few milliseconds. Core contributions of ECLARE include a frugal architecture and scalable techniques to train deep models along with label correlation graphs at the scale of millions of labels. In particular, ECLARE offers predictions that are 2 to 14% more accurate on both publicly available benchmark datasets as well as proprietary datasets for a related products recommendation task sourced from the Bing search engine. Code for ECLARE is available at https://github.com/Extreme-classification/ECLARE.

Submitted 31 July, 2021; originally announced August 2021.

ACM Class: F.2.2; I.2.7

Journal ref: The Web Conference 2021

arXiv:2107.04984 [pdf, other]

SVP-CF: Selection via Proxy for Collaborative Filtering Data

Authors: Noveen Sachdeva, Carole-Jean Wu, Julian McAuley

Abstract: We study the practical consequences of dataset sampling strategies on the performance of recommendation algorithms. Recommender systems are generally trained and evaluated on samples of larger datasets. Samples are often taken in a naive or ad-hoc fashion: e.g. by sampling a dataset randomly or by selecting users or items with many interactions. As we demonstrate, commonly-used data sampling schemes can have significant consequences on algorithm performance -- masking performance deficiencies in algorithms or altering the relative performance of algorithms, as compared to models trained on the complete dataset. Following this observation, this paper makes the following main contributions: (1) characterizing the effect of sampling on algorithm performance, in terms of algorithm and dataset characteristics (e.g. sparsity characteristics, sequential dynamics, etc.); and (2) designing SVP-CF, which is a data-specific sampling strategy, that aims to preserve the relative performance of models after sampling, and is especially suited to long-tail interaction data. Detailed experiments show that SVP-CF is more accurate than commonly used sampling schemes in retaining the relative ranking of different recommendation algorithms.

Submitted 11 July, 2021; originally announced July 2021.

Comments: 11 pages, 3 figures, accepted at the SubSetML workshop at ICML '21 (Link: https://sites.google.com/view/icml-2021-subsetml/home)

arXiv:2010.11704 [pdf, other]

doi 10.1007/s11701-020-01149-5

Using Conditional Generative Adversarial Networks to Reduce the Effects of Latency in Robotic Telesurgery

Authors: Neil Sachdeva, Misha Klopukh, Rachel St. Clair, William Hahn

Abstract: The introduction of surgical robots brought about advancements in surgical procedures. The applications of remote telesurgery range from building medical clinics in underprivileged areas, to placing robots abroad in military hot-spots where accessibility and diversity of medical experience may be limited. Poor wireless connectivity may result in a prolonged delay, referred to as latency, between a surgeon's input and the action a robot takes. In surgery, any micro-delay can injure a patient severely and in some cases, result in fatality. One way to increase safety is to mitigate the effects of latency using deep learning aided computer vision. While the current surgical robots use calibrated sensors to measure the position of the arms and tools, in this work we present a purely optical approach that provides a measurement of the tool position in relation to the patient's tissues. This research aimed to produce a neural network that allowed a robot to detect its own mechanical manipulator arms. A conditional generative adversarial network (cGAN) was trained on 1107 frames of mock gastrointestinal robotic surgery data from the 2015 EndoVis Instrument Challenge and corresponding hand-drawn labels for each frame. When run on new testing data, the network generated near-perfect labels of the input images which were visually consistent with the hand-drawn labels and was able to do this in 299 milliseconds. These accurately generated labels can then be used as simplified identifiers for the robot to track its own controlled tools.
These results show potential for conditional GANs as a reaction mechanism such that the robot can detect when its arms move outside the operating area within a patient. This system allows for more accurate monitoring of the position of surgical instruments in relation to the patient's tissue, increasing safety measures that are integral to successful telesurgery systems.

Submitted 7 October, 2020; originally announced October 2020.

Comments: 6 pages with 5 figures and 1 table. J Robotic Surg (2020)

ACM Class: I.4.6; I.2.6; J.3

arXiv:2006.09438 [pdf, other]

doi 10.1145/3394486.3403139

Off-policy Bandits with Deficient Support

Authors: Noveen Sachdeva, Yi Su, Thorsten Joachims

Abstract: Learning effective contextual-bandit policies from past actions of a deployed system is highly desirable in many settings (e.g. voice assistants, recommendation, search), since it enables the reuse of large amounts of log data. State-of-the-art methods for such off-policy learning, however, are based on inverse propensity score (IPS) weighting. A key theoretical requirement of IPS weighting is that the policy that logged the data has "full support", which typically translates into requiring non-zero probability for any action in any context. Unfortunately, many real-world systems produce support-deficient data, especially when the action space is large, and we show how existing methods can fail catastrophically. To overcome this gap between theory and applications, we identify three approaches that provide various guarantees for IPS-based learning despite the inherent limitations of support-deficient data: restricting the action space, reward extrapolation, and restricting the policy space. We systematically analyze the statistical and computational properties of these three approaches, and we empirically evaluate their effectiveness. In addition to providing the first systematic analysis of support-deficiency in contextual-bandit learning, we conclude with recommendations that provide practical guidance.

Submitted 16 June, 2020; originally announced June 2020.

Comments: 11 pages, 6 figures. Accepted for publication at KDD '20 (Research track)

arXiv:2005.12210 [pdf, other]

doi 10.1145/3397271.3401281

How Useful are Reviews for Recommendation? A Critical Review and Potential Improvements

Authors: Noveen Sachdeva, Julian McAuley

Abstract: We investigate a growing body of work that seeks to improve recommender systems through the use of review text. Generally, these papers argue that since reviews 'explain' users' opinions, they ought to be useful to infer the underlying dimensions that predict ratings or purchases. Schemes to incorporate reviews range from simple regularizers to neural network approaches. Our initial findings reveal several discrepancies in reported results, partly due to (e.g.) copying results across papers despite changes in experimental settings or data pre-processing. First, we attempt a comprehensive analysis to resolve these ambiguities. Further investigation calls for discussion on a much larger problem about the "importance" of user reviews for recommendation. Through a wide range of experiments, we observe several cases where state-of-the-art methods fail to outperform existing baselines, especially as we deviate from a few narrowly-defined settings where reviews are useful. We conclude by providing hypotheses for our observations, that seek to characterize under what conditions reviews are likely to be helpful. Through this work, we aim to evaluate the direction in which the field is progressing and encourage robust empirical evaluation.

Submitted 25 May, 2020; originally announced May 2020.

Comments: 4 pages, 3 figures. Accepted for publication at SIGIR '20

arXiv:1911.05013 [pdf, other]

doi 10.1109/ICASSP.2019.8683538

EDUQA: Educational Domain Question Answering System using Conceptual Network Mapping

Authors: Abhishek Agarwal, Nikhil Sachdeva, Raj Kamal Yadav, Vishaal Udandarao, Vrinda Mittal, Anubha Gupta, Abhinav Mathur

Abstract: Most of the existing question answering models can be largely grouped into two categories: i) open domain question answering models that answer generic questions and use a large-scale knowledge base along with targeted web-corpus retrieval and ii) closed domain question answering models that address a focused questioning area and use complex deep learning models. Both of the above models derive answers through textual comprehension methods. Due to their inability to capture the pedagogical meaning of textual content, these models are not appropriately suited to the educational field for pedagogy. In this paper, we propose an on-the-fly conceptual network model that incorporates educational semantics. The proposed model preserves correlations between conceptual entities by applying intelligent indexing algorithms on the concept network so as to improve answer generation. This model can be utilized for building interactive conversational agents for aiding classroom learning.

Submitted 12 November, 2019; originally announced November 2019.

Comments: Published in the 44th International Conference on Acoustics, Speech, and Signal Processing (ICASSP) 2019

Journal ref: IEEE ICASSP (2019) 8137-8141

arXiv:1811.09975 [pdf, other]

Sequential Variational Autoencoders for Collaborative Filtering

Authors: Noveen Sachdeva, Giuseppe Manco, Ettore Ritacco, Vikram Pudi

Abstract: Variational autoencoders were proven successful in domains such as computer vision and speech processing. Their adoption for modeling user preferences is still unexplored, although recently it is starting to gain attention in the current literature. In this work, we propose a model which extends variational autoencoders by exploiting the rich information present in the past preference history. We introduce a recurrent version of the VAE, where instead of passing a subset of the whole history regardless of temporal dependencies, we rather pass the consumption sequence subset through a recurrent neural network. At each time-step of the RNN, the sequence is fed through a series of fully-connected layers, the output of which models the probability distribution of the most likely future preferences. We show that handling temporal information is crucial for improving the accuracy of the VAE: in fact, our model beats the current state-of-the-art by valuable margins because of its ability to capture temporal dependencies among the user-consumption sequence using the recurrent encoder, while still keeping the fundamentals of variational autoencoders intact.

Submitted 25 November, 2018; originally announced November 2018.

Comments: 9 pages, 6 figures, 2 tables, WSDM2019

MSC Class: 68T05

arXiv:1811.08203 [pdf, other]

doi 10.1145/3240323.3240397

Attentive Neural Architecture Incorporating Song Features For Music Recommendation

Authors: Noveen Sachdeva, Kartik Gupta, Vikram Pudi

Abstract: Recommender Systems are an integral part of music sharing platforms. Often the aim of these systems is to increase the time the user spends on the platform, and hence they have a high commercial value. The systems which aim at increasing the average time a user spends on the platform often need to recommend songs which the user might want to listen to next at each point in time. This is different from recommendation systems which try to predict the item which might be of interest to the user at some point in the user lifetime but not necessarily in the very near future. Prediction of the next song the user might like requires some kind of modeling of the user interests at the given point of time. Attentive neural networks have been exploiting the sequence in which the items were selected by the user to model the implicit short-term interests of the user for the task of next item prediction; however, we feel that the features of the songs occurring in the sequence could also convey some important information about the short-term user interest which the items alone cannot. In this direction, we propose a novel attentive neural architecture which, in addition to the sequence of items selected by the user, uses the features of these items to better learn the user's short-term preferences and recommend the next song to the user.

Submitted 20 November, 2018; originally announced November 2018.

Comments: Accepted as a paper at the 12th ACM Conference on Recommender Systems (RecSys 18)

Journal ref: 12th ACM Conference on Recommender Systems (RecSys '18). ACM (2018) 417-421

arXiv:1608.00905 [pdf, other]

PicHunt: Social Media Image Retrieval for Improved Law Enforcement

Authors: Sonal Goel, Niharika Sachdeva, Ponnurangam Kumaraguru, A V Subramanyam, Divam Gupta

Abstract: First responders are increasingly using social media to identify and reduce crime for well-being and safety of the society. Images shared on social media hurting religious, political, communal and other sentiments of people, often instigate violence and create law & order situations in society. This results in the need for first responders to inspect the spread of such images and users propagating them on social media. In this paper, we present a comparison between different hand-crafted features and a Convolutional Neural Network (CNN) model to retrieve similar images, which outperforms state-of-art hand-crafted features. We propose an Open-Source-Intelligent (OSINT) real-time image search system, robust to retrieve modified images that allows first responders to analyze the current spread of images, sentiments floating and details of users propagating such content. The system also aids officials to save time of manually analyzing the content by reducing the search space on an average by 67%.

Submitted 15 September, 2016; v1 submitted 2 August, 2016; originally announced August 2016.

arXiv:1509.08205 [pdf, other]

Characterising Behavior and Emotions on Social Media for Safety: Exploring Online Communication between Police and Citizens

Authors: Niharika Sachdeva, Ponnurangam Kumaraguru

Abstract: Increased use of social media by police to connect with citizens has encouraged researchers to study different aspects of information exchange (e.g. type of information, credibility and propagation) during emergency and crisis situation. Research studies lack understanding of human behavior such as engagement, emotions and social interaction between citizen and police department on social media. Several social media studies explore and show technological implications of human behavioral aspects in various contexts such as workplace interaction and depression in young mothers. In this paper, we study online interactions between citizens and Indian police in context of day-to-day policing, including safety concerns, advisories, etc. Indian police departments use Facebook to issue advisories, send alerts and receive citizen complaints and suggestions regarding safety issues and day-to-day policing. We explore how citizens express their emotions and social support on Facebook. Our work discusses technological implications of behavioral aspects on social well being of citizens.

Submitted 28 September, 2015; originally announced September 2015.

ACM Class: H.5.3

arXiv:1410.3942 [pdf, other]

Privacy4ICTD in India: Exploring Perceptions, Attitudes and Awareness about ICT Use

Authors: Ponnurangam Kumaraguru, Niharika Sachdeva

Abstract: Several ICT studies give anecdotal evidences showing privacy to be an area of concern that can influence adoption of technology in the developing world. However, in-depth understanding of end users' privacy attitudes and awareness is largely unexplored in developing countries such as India. We conducted a survey with 10,427 Indian citizens to bring forth various insights on privacy expectations and perceptions of end users. Our study explores end-users' privacy expectations on three ICT platforms - mobile phones, OSN (Online Social Network), and government projects. Our results, though preliminary, show that users disproportionately consider financial details as personal information in comparison to medical records. Users heavily use mobile phones to store personal details and show high trust in mobile service providers for protecting the private data. However, users show concerns that mobile service provider may allow improper access of their personal information to third parties and government. We also find that female participants in the study were marginally more conscious of their privacy than males. To the best of our knowledge, this work presents the largest privacy study which benchmarks privacy perceptions among Indian citizens. Understanding users' privacy perceptions can help improve technology adoption and develop policies and laws for improving technology experience and enabling development for a better life in India.

Submitted 15 October, 2014; originally announced October 2014.

arXiv:1403.2042 [pdf, other]

Online Social Media and Police in India: Behavior, Perceptions, Challenges

Authors: Niharika Sachdeva, Ponnurangam Kumaraguru

Abstract: Police agencies across the globe are increasingly using Online Social Media (OSM) to acquire intelligence and connect with citizens. Developed nations have well thought of strategies to use OSM for policing. However, developing nations like India are exploring and evolving OSM as a policing solution. India, in recent years, experienced many events where rumors and fake content on OSM instigated communal violence. In contrast to traditional media (e.g. television and print media) used by Indian police departments, OSM offers velocity, variety, veracity and large volume of information. These introduce new challenges for police like platforms selection, secure usage strategy, developing trust, handling offensive comments, and security / privacy implication of information shared through OSM. Success of police initiatives on OSM to maintain law and order depends both on their understanding of OSM and citizen's acceptance / participation on these platforms. This study provides multidimensional understanding of behavior, perceptions, interactions, and expectation regarding policing through OSM. First, we examined recent updates from four different police pages- Delhi, Bangalore, Uttar Pradesh and Chennai to comprehend various dimensions of police interaction with citizens on OSM. Second, we conducted 20 interviews with IPS officers (Indian Police Service) and 17 interviews with citizens to understand decision rationales and expectation gaps between two stakeholders (police and citizens); this was followed up with 445 policemen surveys and 204 citizen surveys. We also present differences between police expectations of Indian and police departments in developed countries.

Submitted 9 March, 2014; originally announced March 2014.

arXiv:1310.1540 [pdf, other]

Three-Way Dissection of a Game-CAPTCHA: Automated Attacks, Relay Attacks, and Usability

Authors: Manar Mohamed, Niharika Sachdeva, Michael Georgescu, Song Gao, Nitesh Saxena, Chengcui Zhang, Ponnurangam Kumaraguru, Paul C. van Oorschot, Wei-Bang Chen

Abstract: Existing captcha solutions on the Internet are a major source of user frustration. Game captchas are an interesting and, to date, little-studied approach claiming to make captcha solving a fun activity for the users. One broad form of such captchas -- called Dynamic Cognitive Game (DCG) captchas -- challenge the user to perform a game-like cognitive task interacting with a series of dynamic images. We pursue a comprehensive analysis of a representative category of DCG captchas. We formalize, design and implement such captchas, and dissect them across: (1) fully automated attacks, (2) human-solver relay attacks, and (3) usability. Our results suggest that the studied DCG captchas exhibit high usability and, unlike other known captchas, offer some resistance to relay attacks, but they are also vulnerable to our novel dictionary-based automated attack.

Submitted 6 October, 2013; originally announced October 2013.

Comments: 16 pages, 10 figures

arXiv:1306.0195 [pdf, other]

ChaMAILeon: Exploring the Usability of a Privacy Preserving Email Sharing System

Authors: Prateek Dewan, Niharika Sachdeva, Mayank Gupta, Ponnurangam Kumaraguru

Abstract: While passwords, by definition, are meant to be secret, recent trends have witnessed an increasing number of people sharing their email passwords with friends, colleagues, and significant others. However, leading websites like Google advise their users not to share their passwords with anyone, to avoid security and privacy breaches. To understand users' general password sharing behavior and practices, we conducted an online survey with 209 Indian participants and found that 64.35% of the participants felt a need to share their email passwords. Further, about 77% of the participants said that they would want to use a system which could provide them access control features, to maintain their privacy while sharing emails. To address the privacy concerns of users who need to share emails, we propose ChaMAILeon, a system which enables users to share their email passwords while maintaining their privacy. ChaMAILeon allows users to create multiple passwords for their email account. Each such password corresponds to a different set of access control rules, and gives a different view of the same email account. We conducted a controlled experiment with 30 participants to evaluate the usability of the system. Each participant was required to perform 5 tasks. Each task corresponded to different access control rules, which the participant was required to set, for a dummy email account. We found that, with a reasonable number of multiple attempts, all 30 participants were able to perform all 5 tasks given to them. The system usability score was found out to be 75.42. Moreover, 56.6% of the participants said that they would like to use ChaMAILeon frequently.

Submitted 2 June, 2013; originally announced June 2013.

Comments: 12 pages without references and appendices

Showing 1–22 of 22 results for author: Sachdeva, N