Showing 1–50 of 57 results for author: Sanner, S

  1. arXiv:2408.10946

    cs.AI

    Large Language Model Driven Recommendation

    Authors: Anton Korikov, Scott Sanner, Yashar Deldjoo, Zhankui He, Julian McAuley, Arnau Ramisa, Rene Vidal, Mahesh Sathiamoorthy, Atoosa Kasrizadeh, Silvia Milano, Francesco Ricci

    Abstract: While previous chapters focused on recommendation systems (RSs) based on standardized, non-verbal user feedback such as purchases, views, and clicks -- the advent of LLMs has unlocked the use of natural language (NL) interactions for recommendation. This chapter discusses how LLMs' abilities for general NL reasoning present novel opportunities to build highly personalized RSs -- which can effectiv…

    Submitted 20 August, 2024; originally announced August 2024.

  2. arXiv:2408.00878

    cs.IR

    Multi-Aspect Reviewed-Item Retrieval via LLM Query Decomposition and Aspect Fusion

    Authors: Anton Korikov, George Saad, Ethan Baron, Mustafa Khan, Manav Shah, Scott Sanner

    Abstract: While user-generated product reviews often contain large quantities of information, their utility in addressing natural language product queries has been limited, with a key challenge being the need to aggregate information from multiple low-level sources (reviews) to a higher item level during retrieval. Existing methods for reviewed-item retrieval (RIR) typically take a late fusion (LF) approach…

    Submitted 1 August, 2024; originally announced August 2024.

  3. Retrieval-Augmented Conversational Recommendation with Prompt-based Semi-Structured Natural Language State Tracking

    Authors: Sara Kemper, Justin Cui, Kai Dicarlantonio, Kathy Lin, Danjie Tang, Anton Korikov, Scott Sanner

    Abstract: Conversational recommendation (ConvRec) systems must understand rich and diverse natural language (NL) expressions of user preferences and intents, often communicated in an indirect manner (e.g., "I'm watching my weight"). Such complex utterances make retrieving relevant items challenging, especially if only using often incomplete or out-of-date metadata. Fortunately, many domains feature rich ite…

    Submitted 25 May, 2024; originally announced June 2024.

  4. Bayesian Optimization with LLM-Based Acquisition Functions for Natural Language Preference Elicitation

    Authors: David Eric Austin, Anton Korikov, Armin Toroghi, Scott Sanner

    Abstract: Designing preference elicitation (PE) methodologies that can quickly ascertain a user's top item preferences in a cold-start setting is a key challenge for building effective and personalized conversational recommendation (ConvRec) systems. While large language models (LLMs) enable fully natural language (NL) PE dialogues, we hypothesize that monolithic LLM NL-PE approaches lack the multi-turn, de…

    Submitted 19 August, 2024; v1 submitted 1 May, 2024; originally announced May 2024.

  5. arXiv:2404.00579

    cs.IR cs.AI

    A Review of Modern Recommender Systems Using Generative Models (Gen-RecSys)

    Authors: Yashar Deldjoo, Zhankui He, Julian McAuley, Anton Korikov, Scott Sanner, Arnau Ramisa, René Vidal, Maheswaran Sathiamoorthy, Atoosa Kasirzadeh, Silvia Milano

    Abstract: Traditional recommender systems (RS) typically use user-item rating histories as their main data source. However, deep generative models now have the capability to model and sample from complex data distributions, including user-item interactions, text, images, and videos, enabling novel recommendation tasks. This comprehensive, multidisciplinary survey connects key advancements in RS using Genera…

    Submitted 4 July, 2024; v1 submitted 31 March, 2024; originally announced April 2024.

    Comments: This survey accompanies a tutorial presented at ACM KDD'24

  6. arXiv:2403.01395

    cs.CL

    CR-LT-KGQA: A Knowledge Graph Question Answering Dataset Requiring Commonsense Reasoning and Long-Tail Knowledge

    Authors: Willis Guo, Armin Toroghi, Scott Sanner

    Abstract: Knowledge graph question answering (KGQA) is a well-established field that seeks to provide factual answers to natural language (NL) questions by leveraging knowledge graphs (KGs). However, existing KGQA datasets suffer from two significant limitations: (1) no existing KGQA dataset requires commonsense reasoning to arrive at an answer and (2) existing KGQA datasets focus on popular entities for wh…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 7 pages

    ACM Class: I.2.4, I.2.7

  7. arXiv:2403.01390

    cs.CL

    Right for Right Reasons: Large Language Models for Verifiable Commonsense Knowledge Graph Question Answering

    Authors: Armin Toroghi, Willis Guo, Mohammad Mahdi Abdollah Pour, Scott Sanner

    Abstract: Knowledge Graph Question Answering (KGQA) methods seek to answer Natural Language questions using the relational information stored in Knowledge Graphs (KGs). With the recent advancements of Large Language Models (LLMs) and their remarkable reasoning abilities, there is a growing trend to leverage them for KGQA. However, existing methodologies have only focused on answering factual questions, e.g.…

    Submitted 2 March, 2024; originally announced March 2024.

    Comments: 8 pages

    ACM Class: I.2.7

  8. arXiv:2401.12243

    math.OC cs.LG cs.RO cs.SC eess.SY

    Constraint-Generation Policy Optimization (CGPO): Nonlinear Programming for Policy Optimization in Mixed Discrete-Continuous MDPs

    Authors: Michael Gimelfarb, Ayal Taitler, Scott Sanner

    Abstract: We propose Constraint-Generation Policy Optimization (CGPO) for optimizing policy parameters within compact and interpretable policy classes for mixed discrete-continuous Markov Decision Processes (DC-MDPs). CGPO is not only able to provide bounded policy error guarantees over an infinite range of initial states for many DC-MDPs with expressive nonlinear dynamics, but it can also provably derive o…

    Submitted 20 January, 2024; originally announced January 2024.

  9. arXiv:2309.02530

    cs.LG stat.ML

    Diffusion on the Probability Simplex

    Authors: Griffin Floto, Thorsteinn Jonsson, Mihai Nica, Scott Sanner, Eric Zhengyu Zhu

    Abstract: Diffusion models learn to reverse the progressive noising of a data distribution to create a generative model. However, the desired continuous nature of the noising process can be at odds with discrete data. To deal with this tension between continuous and discrete objects, we propose a method of performing diffusion on the probability simplex. Using the probability simplex naturally creates an in…

    Submitted 11 September, 2023; v1 submitted 5 September, 2023; originally announced September 2023.

  10. Self-Supervised Contrastive BERT Fine-tuning for Fusion-based Reviewed-Item Retrieval

    Authors: Mohammad Mahdi Abdollah Pour, Parsa Farinneya, Armin Toroghi, Anton Korikov, Ali Pesaranghader, Touqir Sajed, Manasa Bharadwaj, Borislav Mavrin, Scott Sanner

    Abstract: As natural language interfaces enable users to express increasingly complex natural language queries, there is a parallel explosion of user review content that can allow users to better find items such as restaurants, books, or movies that match these expressive queries. While Neural Information Retrieval (IR) methods have provided state-of-the-art results for matching queries to documents, they h…

    Submitted 1 August, 2023; originally announced August 2023.

    Journal ref: European Conference on Information Retrieval, pages 3–17, 2023, Springer

  11. arXiv:2307.14225

    cs.IR cs.LG

    Large Language Models are Competitive Near Cold-start Recommenders for Language- and Item-based Preferences

    Authors: Scott Sanner, Krisztian Balog, Filip Radlinski, Ben Wedin, Lucas Dixon

    Abstract: Traditional recommender systems leverage users' item preference history to recommend novel content that users may like. However, modern dialog interfaces that allow users to express language-based preferences offer a fundamentally different modality for preference input. Inspired by recent successes of prompting paradigms for large language models (LLMs), we study their use for making recommendati…

    Submitted 26 July, 2023; originally announced July 2023.

    Comments: To appear at RecSys'23

  12. arXiv:2306.08505

    cs.CL cs.LG

    DiffuDetox: A Mixed Diffusion Model for Text Detoxification

    Authors: Griffin Floto, Mohammad Mahdi Abdollah Pour, Parsa Farinneya, Zhenwei Tang, Ali Pesaranghader, Manasa Bharadwaj, Scott Sanner

    Abstract: Text detoxification is a conditional text generation task aiming to remove offensive content from toxic text. It is highly useful for online forums and social media, where offensive content is frequently encountered. Intuitively, there are diverse ways to detoxify sentences while preserving their meanings, and we can select from detoxified sentences before displaying text to users. Conditional dif…

    Submitted 14 June, 2023; originally announced June 2023.

    Comments: 7 pages, 1 figure, ACL findings 2023

  13. Bayesian Knowledge-driven Critiquing with Indirect Evidence

    Authors: Armin Toroghi, Griffin Floto, Zhenwei Tang, Scott Sanner

    Abstract: Conversational recommender systems (CRS) enhance the expressivity and personalization of recommendations through multiple turns of user-system interaction. Critiquing is a well-known paradigm for CRS that allows users to iteratively refine recommendations by providing feedback about attributes of recommended items. While existing critiquing methodologies utilize direct attributes of items to addre…

    Submitted 8 June, 2023; originally announced June 2023.

  14. Revisiting Random Forests in a Comparative Evaluation of Graph Convolutional Neural Network Variants for Traffic Prediction

    Authors: Ta Jiun Ting, Xiaocan Li, Scott Sanner, Baher Abdulhai

    Abstract: Traffic prediction is a spatiotemporal predictive task that plays an essential role in intelligent transportation systems. Today, graph convolutional neural networks (GCNNs) have become the prevailing models in the traffic prediction literature since they excel at extracting spatial correlations. In this work, we classify the components of successful GCNN prediction models and analyze the effects…

    Submitted 29 May, 2023; originally announced May 2023.

    Journal ref: The International Conference on Intelligent Transportation Systems 2021

  15. arXiv:2305.19291

    cs.LG cs.AI eess.SY

    Perimeter Control Using Deep Reinforcement Learning: A Model-free Approach towards Homogeneous Flow Rate Optimization

    Authors: Xiaocan Li, Ray Coden Mercurius, Ayal Taitler, Xiaoyu Wang, Mohammad Noaeen, Scott Sanner, Baher Abdulhai

    Abstract: Perimeter control maintains high traffic efficiency within protected regions by controlling transfer flows among regions to ensure that their traffic densities are below critical values. Existing approaches can be categorized as either model-based or model-free, depending on whether they rely on network transmission models (NTMs) and macroscopic fundamental diagrams (MFDs). Although model-based ap…

    Submitted 29 May, 2023; originally announced May 2023.

  16. arXiv:2305.18354

    cs.CL cs.AI

    LLMs and the Abstraction and Reasoning Corpus: Successes, Failures, and the Importance of Object-based Representations

    Authors: Yudong Xu, Wenhao Li, Pashootan Vaezipoor, Scott Sanner, Elias B. Khalil

    Abstract: Can a Large Language Model (LLM) solve simple abstract reasoning problems? We explore this broad question through a systematic analysis of GPT on the Abstraction and Reasoning Corpus (ARC), a representative benchmark of abstract reasoning ability from limited examples in which solutions require some "core knowledge" of concepts such as objects, goal states, counting, and basic geometry. GPT-4 solv…

    Submitted 14 February, 2024; v1 submitted 26 May, 2023; originally announced May 2023.

    Comments: 26 pages, 15 figures, published in Transactions on Machine Learning Research (TMLR)

  17. arXiv:2305.04364

    cs.LG stat.ML

    A Generalized Framework for Predictive Clustering and Optimization

    Authors: Aravinth Chembu, Scott Sanner

    Abstract: Clustering is a powerful and extensively used data science tool. While clustering is generally thought of as an unsupervised learning technique, there are also supervised variations such as Spath's clusterwise regression that attempt to find clusters of data that yield low regression error on a supervised target. We believe that clusterwise regression is just a single vertex of a largely unexplore…

    Submitted 7 May, 2023; originally announced May 2023.

    Comments: 23 pages, 5 figures

  18. LogicRec: Recommendation with Users' Logical Requirements

    Authors: Zhenwei Tang, Griffin Floto, Armin Toroghi, Shichao Pei, Xiangliang Zhang, Scott Sanner

    Abstract: Users may demand recommendations with highly personalized requirements involving logical operations, e.g., the intersection of two requirements, where such requirements naturally form structured logical queries on knowledge graphs (KGs). To date, existing recommender systems lack the capability to tackle users' complex logical requirements. In this work, we formulate the problem of recommendation…

    Submitted 23 April, 2023; originally announced April 2023.

    Comments: SIGIR 2023

  19. arXiv:2304.03081

    cs.LG cs.AI

    Safe MDP Planning by Learning Temporal Patterns of Undesirable Trajectories and Averting Negative Side Effects

    Authors: Siow Meng Low, Akshat Kumar, Scott Sanner

    Abstract: In safe MDP planning, a cost function based on the current state and action is often used to specify safety aspects. In the real world, often the state representation used may lack sufficient fidelity to specify such safety constraints. Operating based on an incomplete model can often produce unintended negative side effects (NSEs). To address these challenges, first, we associate safety signals w…

    Submitted 6 April, 2023; originally announced April 2023.

  20. arXiv:2211.14426

    eess.SY cs.AI

    A Critical Review of Traffic Signal Control and A Novel Unified View of Reinforcement Learning and Model Predictive Control Approaches for Adaptive Traffic Signal Control

    Authors: Xiaoyu Wang, Scott Sanner, Baher Abdulhai

    Abstract: Recent years have witnessed substantial growth in adaptive traffic signal control (ATSC) methodologies that improve transportation network efficiency, especially in branches leveraging artificial intelligence based optimization and control algorithms such as reinforcement learning as well as conventional model predictive control. However, lack of cross-domain analysis and comparison of the effecti…

    Submitted 25 November, 2022; originally announced November 2022.

    Comments: 32 pages, 19 figures. This is a draft chapter/article. The final version is available in Handbook on Artificial Intelligence in Transport, edited by Hussein Dia, forthcoming 2023, Edward Elgar Publishing Ltd

  21. arXiv:2211.05939

    cs.AI

    pyRDDLGym: From RDDL to Gym Environments

    Authors: Ayal Taitler, Michael Gimelfarb, Jihwan Jeong, Sriram Gopalakrishnan, Martin Mladenov, Xiaotian Liu, Scott Sanner

    Abstract: We present pyRDDLGym, a Python framework for auto-generation of OpenAI Gym environments from RDDL declarative descriptions. The discrete time step evolution of variables in RDDL is described by conditional probability functions, which fits naturally into the Gym step scheme. Furthermore, since RDDL is a lifted description, the modification and scaling up of environments to support multiple entities…

    Submitted 5 February, 2024; v1 submitted 10 November, 2022; originally announced November 2022.

  22. arXiv:2211.04591

    cs.LG cs.AI cs.CL

    Learning to Follow Instructions in Text-Based Games

    Authors: Mathieu Tuli, Andrew C. Li, Pashootan Vaezipoor, Toryn Q. Klassen, Scott Sanner, Sheila A. McIlraith

    Abstract: Text-based games present a unique class of sequential decision making problem in which agents interact with a partially observable, simulated environment via actions and observations conveyed through natural language. Such observations typically include instructions that, in a reinforcement learning (RL) setting, can directly or indirectly guide a player towards completing reward-worthy tasks. In…

    Submitted 8 November, 2022; originally announced November 2022.

    Comments: NeurIPS 2022

  23. arXiv:2210.09880

    cs.AI

    Graphs, Constraints, and Search for the Abstraction and Reasoning Corpus

    Authors: Yudong Xu, Elias B. Khalil, Scott Sanner

    Abstract: The Abstraction and Reasoning Corpus (ARC) aims at benchmarking the performance of general artificial intelligence algorithms. The ARC's focus on broad generalization and few-shot learning has made it difficult to solve using pure machine learning. A more promising approach has been to perform program synthesis within an appropriately designed Domain Specific Language (DSL). However, these too hav…

    Submitted 1 December, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

    Comments: 9 pages, 5 figures, to be published in AAAI-23

  24. arXiv:2210.03802

    cs.LG

    Conservative Bayesian Model-Based Value Expansion for Offline Policy Optimization

    Authors: Jihwan Jeong, Xiaoyu Wang, Michael Gimelfarb, Hyunwoo Kim, Baher Abdulhai, Scott Sanner

    Abstract: Offline reinforcement learning (RL) addresses the problem of learning a performant policy from a fixed batch of data collected by following some behavior policy. Model-based approaches are particularly appealing in the offline setting since they can extract more learning signals from the logged dataset by learning a model of the environment. However, the performance of existing model-based approac…

    Submitted 3 March, 2023; v1 submitted 7 October, 2022; originally announced October 2022.

  25. arXiv:2203.12679

    cs.AI cs.LG

    Sample-efficient Iterative Lower Bound Optimization of Deep Reactive Policies for Planning in Continuous MDPs

    Authors: Siow Meng Low, Akshat Kumar, Scott Sanner

    Abstract: Recent advances in deep learning have enabled optimization of deep reactive policies (DRPs) for continuous MDP planning by encoding a parametric policy as a deep neural network and exploiting automatic differentiation in an end-to-end model-based gradient descent framework. This approach has proven effective for optimizing DRPs in nonlinear continuous MDPs, but it requires a large number of sample…

    Submitted 23 March, 2022; originally announced March 2022.

  26. TransCAM: Transformer Attention-based CAM Refinement for Weakly Supervised Semantic Segmentation

    Authors: Ruiwen Li, Zheda Mai, Chiheb Trabelsi, Zhibo Zhang, Jongseong Jang, Scott Sanner

    Abstract: Weakly supervised semantic segmentation (WSSS) with only image-level supervision is a challenging task. Most existing methods exploit Class Activation Maps (CAM) to generate pixel-level pseudo labels for supervised training. However, due to the local receptive field of Convolutional Neural Networks (CNNs), CAM applied to CNNs often suffers from partial activation -- highlighting the most discriminati…

    Submitted 14 March, 2022; originally announced March 2022.

    Journal ref: Journal of Visual Communication and Image Representation 2023

  27. arXiv:2201.06224

    cs.IR cs.AI cs.CL cs.HC

    Unintended Bias in Language Model-driven Conversational Recommendation

    Authors: Tianshu Shen, Jiaru Li, Mohamed Reda Bouadjenek, Zheda Mai, Scott Sanner

    Abstract: Conversational Recommendation Systems (CRSs) have recently started to leverage pretrained language models (LM) such as BERT for their ability to semantically interpret a wide range of preference statement variations. However, pretrained LMs are well-known to be prone to intrinsic biases in their training data, which may be exacerbated by biases embedded in domain-specific language data (e.g., user…

    Submitted 18 January, 2022; v1 submitted 17 January, 2022; originally announced January 2022.

    Comments: 12 pages, 7 figures

  28. arXiv:2111.14271

    cs.CV cs.AI cs.LG

    ExCon: Explanation-driven Supervised Contrastive Learning for Image Classification

    Authors: Zhibo Zhang, Jongseong Jang, Chiheb Trabelsi, Ruiwen Li, Scott Sanner, Yeonjeong Jeong, Dongsub Shim

    Abstract: Contrastive learning has led to substantial improvements in the quality of learned embedding representations for tasks such as image classification. However, a key drawback of existing contrastive augmentation methods is that they may lead to the modification of the image content which can yield undesired alterations of its semantics. This can affect the performance of the model on downstream task…

    Submitted 17 April, 2022; v1 submitted 28 November, 2021; originally announced November 2021.

  29. arXiv:2110.01794

    cs.LG

    Multi-axis Attentive Prediction for Sparse EventData: An Application to Crime Prediction

    Authors: Yi Sui, Ga Wu, Scott Sanner

    Abstract: Spatiotemporal prediction of event data is a challenging task with a long history of research. While recent work in spatiotemporal prediction has leveraged deep sequential models that substantially improve over classical approaches, these models are prone to overfitting when the observation is extremely sparse, as in the task of crime event prediction. To overcome these sparsity issues, we present…

    Submitted 4 October, 2021; originally announced October 2021.

  30. arXiv:2108.00633

    cs.AI

    Planning with Learned Binarized Neural Networks Benchmarks for MaxSAT Evaluation 2021

    Authors: Buser Say, Scott Sanner, Jo Devriendt, Jakob Nordström, Peter J. Stuckey

    Abstract: This document provides a brief introduction to the learned automated planning problem in which the state transition function takes the form of a binarized neural network (BNN), presents a general MaxSAT encoding for this problem, and describes the four domains (Navigation, Inventory Control, System Administrator, and Cellda) that are submitted as benchmarks for MaxSAT Evaluation 2021.

    Submitted 2 August, 2021; originally announced August 2021.

  31. arXiv:2106.07260

    cs.LG

    RAPTOR: End-to-end Risk-Aware MDP Planning and Policy Learning by Backpropagation

    Authors: Noah Patton, Jihwan Jeong, Michael Gimelfarb, Scott Sanner

    Abstract: Planning provides a framework for optimizing sequential decisions in complex environments. Recent advances in efficient planning in deterministic or stochastic high-dimensional domains with continuous action spaces leverage backpropagation through a model of the environment to directly optimize actions. However, existing methods typically do not take risk into account when optimizing in stochastic do…

    Submitted 14 June, 2021; originally announced June 2021.

  32. arXiv:2105.14162

    cs.LG cs.AI cs.CV

    EDDA: Explanation-driven Data Augmentation to Improve Explanation Faithfulness

    Authors: Ruiwen Li, Zhibo Zhang, Jiani Li, Chiheb Trabelsi, Scott Sanner, Jongseong Jang, Yeonjeong Jeong, Dongsub Shim

    Abstract: Recent years have seen the introduction of a range of methods for post-hoc explainability of image classifier predictions. However, these post-hoc explanations may not always be faithful to classifier predictions, which poses a significant challenge when attempting to debug models based on such explanations. To this end, we seek a methodology that can improve the faithfulness of an explanation met…

    Submitted 24 September, 2021; v1 submitted 28 May, 2021; originally announced May 2021.

  33. arXiv:2105.14127

    cs.LG cs.AI cs.RO

    Risk-Aware Transfer in Reinforcement Learning using Successor Features

    Authors: Michael Gimelfarb, André Barreto, Scott Sanner, Chi-Guhn Lee

    Abstract: Sample efficiency and risk-awareness are central to the development of practical reinforcement learning (RL) for complex decision-making. The former can be addressed by transfer learning and the latter by optimizing some utility function of the return. However, the problem of transferring skills in a risk-aware manner is not well-understood. In this paper, we address the problem of risk-aware poli…

    Submitted 28 May, 2021; originally announced May 2021.

  34. arXiv:2103.13885

    cs.LG cs.AI cs.CV

    Supervised Contrastive Replay: Revisiting the Nearest Class Mean Classifier in Online Class-Incremental Continual Learning

    Authors: Zheda Mai, Ruiwen Li, Hyunwoo Kim, Scott Sanner

    Abstract: Online class-incremental continual learning (CL) studies the problem of learning new classes continually from an online non-stationary data stream, intending to adapt to new data while mitigating catastrophic forgetting. While memory replay has shown promising results, the recency bias in online learning caused by the commonly used Softmax classifier remains an unsolved challenge. Although the Nea…

    Submitted 15 September, 2021; v1 submitted 22 March, 2021; originally announced March 2021.

    Comments: In Workshop on Continual Learning in Computer Vision at CVPR 2021

  35. arXiv:2101.10423

    cs.LG cs.CV

    Online Continual Learning in Image Classification: An Empirical Survey

    Authors: Zheda Mai, Ruiwen Li, Jihwan Jeong, David Quispe, Hyunwoo Kim, Scott Sanner

    Abstract: Online continual learning for image classification studies the problem of learning to classify images from an online stream of data and tasks, where tasks may include new classes (class incremental) or data nonstationarity (domain incremental). One of the key challenges of continual learning is to avoid catastrophic forgetting (CF), i.e., forgetting old tasks in the presence of more recent tasks.…

    Submitted 4 October, 2021; v1 submitted 25 January, 2021; originally announced January 2021.

    Comments: Accepted for publication in Elsevier's Neurocomputing journal. Code available at https://github.com/RaptorMai/online-continual-learning

  36. arXiv:2010.12803

    cs.IR cs.LG

    Attentive Autoencoders for Multifaceted Preference Learning in One-class Collaborative Filtering

    Authors: Zheda Mai, Ga Wu, Kai Luo, Scott Sanner

    Abstract: Most existing One-Class Collaborative Filtering (OC-CF) algorithms estimate a user's preference as a latent vector by encoding their historical interactions. However, users often show diverse interests, which significantly increases the learning difficulty. In order to capture multifaceted user preferences, existing recommender systems either increase the encoding complexity or extend the latent r…

    Submitted 24 October, 2020; originally announced October 2020.

    Comments: Accepted at ICDMW 2020

  37. arXiv:2009.00093

    cs.LG cs.CV stat.ML

    Online Class-Incremental Continual Learning with Adversarial Shapley Value

    Authors: Dongsub Shim, Zheda Mai, Jihwan Jeong, Scott Sanner, Hyunwoo Kim, Jongseong Jang

    Abstract: As image-based deep learning becomes pervasive on every device, from cell phones to smart watches, there is a growing need to develop methods that continually learn from data while minimizing memory footprint and power consumption. While memory replay techniques have shown exceptional promise for this task of continual learning, the best method for selecting which buffered images to replay is stil…

    Submitted 22 March, 2021; v1 submitted 31 August, 2020; originally announced September 2020.

    Comments: Proceedings of the 35th AAAI Conference on Artificial Intelligence (AAAI-21)

  38. arXiv:2008.01246

    cs.IR

    Noise Contrastive Estimation for Autoencoding-based One-Class Collaborative Filtering

    Authors: Jin Peng Zhou, Ga Wu, Zheda Mai, Scott Sanner

    Abstract: One-class collaborative filtering (OC-CF) is a common class of recommendation problem where only the positive class is explicitly observed (e.g., purchases, clicks). Autoencoder based recommenders such as AutoRec and variants demonstrate strong performance on many OC-CF benchmarks, but also empirically suffer from a strong popularity bias. While a careful choice of negative samples in the OC-CF se…

    Submitted 5 August, 2020; v1 submitted 3 August, 2020; originally announced August 2020.

    Comments: 10 pages, 7 figures

  39. arXiv:2007.05683

    cs.LG cs.AI cs.CV stat.ML

    Batch-level Experience Replay with Review for Continual Learning

    Authors: Zheda Mai, Hyunwoo Kim, Jihwan Jeong, Scott Sanner

    Abstract: Continual learning is a branch of deep learning that seeks to strike a balance between learning stability and plasticity. The CVPR 2020 CLVision Continual Learning for Computer Vision challenge is dedicated to evaluating and advancing the current state-of-the-art continual learning methods using the CORe50 dataset with three different continual learning scenarios. This paper presents our approach,…

    Submitted 11 July, 2020; originally announced July 2020.

  40. arXiv:2007.00869

    cs.LG cs.RO stat.ML

    ε-BMC: A Bayesian Ensemble Approach to Epsilon-Greedy Exploration in Model-Free Reinforcement Learning

    Authors: Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

    Abstract: Resolving the exploration-exploitation trade-off remains a fundamental problem in the design and implementation of reinforcement learning (RL) algorithms. In this paper, we focus on model-free RL using the epsilon-greedy exploration policy, which, despite its simplicity, remains one of the most frequently used forms of exploration. However, a key limitation of this policy is the specification of…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: Published in UAI 2019

  41. arXiv:2006.05725

    cs.LG cs.NE cs.RO stat.ML

    Bayesian Experience Reuse for Learning from Multiple Demonstrators

    Authors: Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

    Abstract: Learning from demonstrations (LfD) improves the exploration efficiency of a learning agent by incorporating demonstrations from experts. However, demonstration data can often come from multiple experts with conflicting goals, making it difficult to incorporate safely and effectively in online settings. We address this problem in the static and dynamic optimization settings by modelling the uncerta…

    Submitted 10 June, 2020; originally announced June 2020.

    Comments: 15 pages, 7 figures

  42. arXiv:2003.00203

    cs.LG cs.NE cs.RO eess.SY

    Contextual Policy Transfer in Reinforcement Learning Domains via Deep Mixtures-of-Experts

    Authors: Michael Gimelfarb, Scott Sanner, Chi-Guhn Lee

    Abstract: In reinforcement learning, agents that consider the context, or current state, when selecting source policies for transfer have been shown to outperform context-free approaches. However, none of the existing approaches transfer knowledge contextually from model-based learners to a model-free learner. This could be useful, for instance, when source policies are intentionally learned on diverse simu…

    Submitted 10 June, 2020; v1 submitted 29 February, 2020; originally announced March 2020.

    Comments: Updated experiment for Lander domain (fixed a bug in the UCB baseline); minor editing and formatting, fixing typos; new template; 15 pages, 6 figures

  43. arXiv:1904.10403  [pdf, other]

    cs.IR

    Optimizing Search API Queries for Twitter Topic Classifiers Using a Maximum Set Coverage Approach

    Authors: Kasra Safari, Scott Sanner

    Abstract: Twitter has grown to become an important platform to access immediate information about major events and dynamic topics. As one example, recent work has shown that classifiers trained to detect topical content on Twitter can generalize well beyond the training data. Since access to Twitter data is hidden behind a limited search API, it is impossible (for most users) to apply these classifiers dire…

    Submitted 25 January, 2020; v1 submitted 23 April, 2019; originally announced April 2019.

  44. arXiv:1904.09366  [pdf, other]

    cs.AI

    Reward Potentials for Planning with Learned Neural Network Transition Models

    Authors: Buser Say, Scott Sanner, Sylvie Thiébaux

    Abstract: Optimal planning with respect to learned neural network (NN) models in continuous action and state spaces using mixed-integer linear programming (MILP) is a challenging task for branch-and-bound solvers due to the poor linear relaxation of the underlying MILP model. For a given set of features, potential heuristics provide an efficient framework for computing bounds on cost (reward) functions. In…

    Submitted 26 July, 2019; v1 submitted 19 April, 2019; originally announced April 2019.

    Comments: To appear in the proceedings of the 25th International Conference on Principles and Practice of Constraint Programming

  45. arXiv:1904.02873  [pdf, other]

    cs.AI cs.LG

    Scalable Planning with Deep Neural Network Learned Transition Models

    Authors: Ga Wu, Buser Say, Scott Sanner

    Abstract: In many real-world planning problems with factored, mixed discrete and continuous state and action spaces such as Reservoir Control, Heating, Ventilation, and Air Conditioning, and Navigation domains, it is difficult to obtain a model of the complex nonlinear dynamics that govern state evolution. However, the ubiquity of modern sensors allows us to collect large quantities of data from each of thes…

    Submitted 14 July, 2020; v1 submitted 5 April, 2019; originally announced April 2019.

    Comments: 36 pages

  46. arXiv:1811.10433  [pdf, other]

    cs.AI

    Compact and Efficient Encodings for Planning in Factored State and Action Spaces with Learned Binarized Neural Network Transition Models

    Authors: Buser Say, Scott Sanner

    Abstract: In this paper, we leverage the efficiency of Binarized Neural Networks (BNNs) to learn complex state transition models of planning domains with discretized factored state and action spaces. In order to directly exploit this transition structure for planning, we present two novel compilations of the learned factored planning problem with BNNs based on reductions to Weighted Partial Maximum Boolean…

    Submitted 6 March, 2020; v1 submitted 26 November, 2018; originally announced November 2018.

  47. arXiv:1811.00697  [pdf, other]

    cs.IR

    Noise Contrastive Estimation for Scalable Linear Models for One-Class Collaborative Filtering

    Authors: Ga Wu, Maksims Volkovs, Chee Loong Soon, Scott Sanner, Himanshu Rai

    Abstract: Previous highly scalable one-class collaborative filtering methods such as Projected Linear Recommendation (PLRec) have advocated using fast randomized SVD to embed items into a latent space, followed by linear regression methods to learn personalized recommendation models per user. Unfortunately, naive SVD embedding methods often exhibit a popularity bias that skews the ability to accurately embe…

    Submitted 1 November, 2018; originally announced November 2018.

    Comments: 8 pages

  48. arXiv:1809.00060  [pdf, other]

    cs.IR cs.CV

    Aesthetic Features for Personalized Photo Recommendation

    Authors: Yu Qing Zhou, Ga Wu, Scott Sanner, Putra Manggala

    Abstract: Many photography websites such as Flickr, 500px, Unsplash, and Adobe Behance are used by amateur and professional photography enthusiasts. Unlike content-based image search, such users of photography websites are not just looking for photos with certain content, but more generally for photos with a certain photographic "aesthetic". In this context, we explore personalized photo recommendation and…

    Submitted 31 August, 2018; originally announced September 2018.

    Comments: In Proceedings of the Late-Breaking Results track part of the Twelfth ACM Conference on Recommender Systems, Vancouver, BC, Canada, October 6, 2018, 2 pages

  49. arXiv:1805.07785  [pdf, other]

    stat.ML cs.CV cs.LG

    Conditional Inference in Pre-trained Variational Autoencoders via Cross-coding

    Authors: Ga Wu, Justin Domke, Scott Sanner

    Abstract: Variational Autoencoders (VAEs) are a popular generative model, but one in which conditional inference can be challenging. If the decomposition into query and evidence variables is fixed, conditional VAEs provide an attractive solution. To support arbitrary queries, one is generally reduced to Markov Chain Monte Carlo sampling methods that can suffer from long mixing times. In this paper, we propo…

    Submitted 3 October, 2018; v1 submitted 20 May, 2018; originally announced May 2018.

    Comments: 8 pages main content, 4 pages appendix

  50. arXiv:1704.07511  [pdf, ps, other]

    cs.LG

    Scalable Planning with Tensorflow for Hybrid Nonlinear Domains

    Authors: Ga Wu, Buser Say, Scott Sanner

    Abstract: Given recent deep learning results that demonstrate the ability to effectively optimize high-dimensional non-convex functions with gradient descent optimization on GPUs, we ask in this paper whether symbolic gradient optimization tools such as Tensorflow can be effective for planning in hybrid (mixed discrete and continuous) nonlinear domains with high dimensional state and action spaces? To this…

    Submitted 4 November, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: 9 pages