Showing 1–36 of 36 results for author: Sun, C

Searching in archive stat.
  1. arXiv:2402.07087  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    Self-Correcting Self-Consuming Loops for Generative Model Training

    Authors: Nate Gillman, Michael Freeman, Daksh Aggarwal, Chia-Hong Hsu, Calvin Luo, Yonglong Tian, Chen Sun

    Abstract: As synthetic data becomes higher quality and proliferates on the internet, machine learning models are increasingly trained on a mix of human- and machine-generated data. Despite the successful stories of using synthetic data for representation learning, using synthetic data for generative model training creates "self-consuming loops" which may lead to training instability or even collapse, unless…

    Submitted 10 June, 2024; v1 submitted 10 February, 2024; originally announced February 2024.

    Comments: Camera ready version (ICML 2024). Code at https://nategillman.com/sc-sc.html

  2. arXiv:2312.12191  [pdf, other]

    cs.LG cs.AI stat.ML

    CUDC: A Curiosity-Driven Unsupervised Data Collection Method with Adaptive Temporal Distances for Offline Reinforcement Learning

    Authors: Chenyu Sun, Hangwei Qian, Chunyan Miao

    Abstract: Offline reinforcement learning (RL) aims to learn an effective policy from a pre-collected dataset. Most existing works are to develop sophisticated learning algorithms, with less emphasis on improving the data collection process. Moreover, it is even challenging to extend the single-task setting and collect a task-agnostic dataset that allows an agent to perform multiple downstream tasks. In this…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: Accepted at AAAI-24

  3. arXiv:2312.11927  [pdf, other]

    cs.LG cs.SI stat.ME

    Empowering Dual-Level Graph Self-Supervised Pretraining with Motif Discovery

    Authors: Pengwei Yan, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Tianqianjin Lin, Changlong Sun, Xiaozhong Liu

    Abstract: While self-supervised graph pretraining techniques have shown promising results in various domains, their application still experiences challenges of limited topology learning, human knowledge dependency, and incompetent multi-level interactions. To address these issues, we propose a novel solution, Dual-level Graph self-supervised Pretraining with Motif discovery (DGPM), which introduces a unique…

    Submitted 19 December, 2023; originally announced December 2023.

    Comments: 14 pages, 6 figures, accepted by AAAI'24

  4. arXiv:2312.05757  [pdf, ps, other]

    cs.LG cs.AI cs.DL cs.SI stat.ME

    Towards Human-like Perception: Learning Structural Causal Model in Heterogeneous Graph

    Authors: Tianqianjin Lin, Kaisong Song, Zhuoren Jiang, Yangyang Kang, Weikang Yuan, Xurui Li, Changlong Sun, Cui Huang, Xiaozhong Liu

    Abstract: Heterogeneous graph neural networks have become popular in various domains. However, their generalizability and interpretability are limited due to the discrepancy between their inherent inference flows and human reasoning logic or underlying causal relationships for the learning problem. This study introduces a novel solution, HG-SCM (Heterogeneous Graph as Structural Causal Model). It can mimic…

    Submitted 9 December, 2023; originally announced December 2023.

    Comments: 28 pages, 10 figures, 6 tables, accepted by Information Processing & Management

    Journal ref: Information Processing & Management, 60 (2024) 1-21

  5. arXiv:2312.00308  [pdf, other]

    cs.CV eess.IV stat.AP

    A knowledge-based data-driven (KBDD) framework for all-day identification of cloud types using satellite remote sensing

    Authors: Longfeng Nie, Yuntian Chen, Mengge Du, Changqi Sun, Dongxiao Zhang

    Abstract: Cloud types, as a type of meteorological data, are of particular significance for evaluating changes in rainfall, heatwaves, water resources, floods and droughts, food security and vegetation cover, as well as land use. In order to effectively utilize high-resolution geostationary observations, a knowledge-based data-driven (KBDD) framework for all-day identification of cloud types based on spectr…

    Submitted 30 November, 2023; originally announced December 2023.

  6. arXiv:2310.02423  [pdf, other]

    cs.LG stat.ML

    Delta-AI: Local objectives for amortized inference in sparse graphical models

    Authors: Jean-Pierre Falet, Hae Beom Lee, Nikolay Malkin, Chen Sun, Dragos Secrieru, Thomas Jiralerspong, Dinghuai Zhang, Guillaume Lajoie, Yoshua Bengio

    Abstract: We present a new algorithm for amortized inference in sparse probabilistic graphical models (PGMs), which we call $Δ$-amortized inference ($Δ$-AI). Our approach is based on the observation that when the sampling of variables in a PGM is seen as a sequence of actions taken by an agent, sparsity of the PGM enables local credit assignment in the agent's policy learning objective. This yields a local…

    Submitted 13 March, 2024; v1 submitted 3 October, 2023; originally announced October 2023.

    Comments: ICLR 2024; 19 pages, code: https://github.com/GFNOrg/Delta-AI/

  7. arXiv:2310.00817  [pdf, other]

    stat.ML cs.LG

    Learning to Make Adherence-Aware Advice

    Authors: Guanting Chen, Xiaocheng Li, Chunlin Sun, Hanzhao Wang

    Abstract: As artificial intelligence (AI) systems play an increasingly prominent role in human decision-making, challenges surface in the realm of human-AI interactions. One challenge arises from the suboptimal AI policies due to the inadequate consideration of humans disregarding AI recommendations, as well as the need for AI to provide advice selectively when it is most pertinent. This paper presents a se…

    Submitted 20 March, 2024; v1 submitted 1 October, 2023; originally announced October 2023.

  8. arXiv:2307.04402  [pdf]

    stat.ME eess.SY

    Moving pattern-based modeling using a new type of interval ARX model

    Authors: Changping Sun

    Abstract: In this paper, firstly, to overcome the shortcoming of the traditional ARX model, a new operator between an interval number and a real matrix is defined, and then it is applied to the traditional ARX model to get a new type of structure interval ARX model that can deal with interval data, which is defined as interval ARX model (IARX). Secondly, the IARX model is applied to moving pattern-based modeling.…

    Submitted 12 July, 2023; v1 submitted 10 July, 2023; originally announced July 2023.

  9. arXiv:2301.11260  [pdf, ps, other]

    cs.LG math.OC stat.ML

    Maximum Optimality Margin: A Unified Approach for Contextual Linear Programming and Inverse Linear Programming

    Authors: Chunlin Sun, Shang Liu, Xiaocheng Li

    Abstract: In this paper, we study the predict-then-optimize problem where the output of a machine learning prediction task is used as the input of some downstream optimization problem, say, the objective coefficient vector of a linear program. The problem is also known as predictive analytics or contextual linear programming. The existing approaches largely suffer from either (i) optimization intractability…

    Submitted 28 May, 2023; v1 submitted 26 January, 2023; originally announced January 2023.

    Comments: to be published in ICML 2023

  10. arXiv:2205.00943  [pdf, other]

    cs.LG cs.AI cs.CV stat.ML

    CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning

    Authors: Chenyu Sun, Hangwei Qian, Chunyan Miao

    Abstract: In reinforcement learning (RL), it is challenging to learn directly from high-dimensional observations, where data augmentation has recently been shown to remedy this via encoding invariances from raw pixels. Nevertheless, we empirically find that not all samples are equally important and hence simply injecting more augmented inputs may instead cause instability in Q-learning. In this paper, we ap…

    Submitted 3 May, 2022; v1 submitted 2 May, 2022; originally announced May 2022.

    Comments: Full paper with supplementary material, accepted by IJCAI 2022. Acknowledgements and affiliations are updated

  11. arXiv:2203.04511  [pdf, other]

    cs.LG stat.AP

    Revealing the Excitation Causality between Climate and Political Violence via a Neural Forward-Intensity Poisson Process

    Authors: Schyler C. Sun, Bailu Jin, Zhuangkun Wei, Weisi Guo

    Abstract: The causal mechanism between climate and political violence is fraught with complex mechanisms. Current quantitative causal models rely on one or more assumptions: (1) the climate drivers persistently generate conflict, (2) the causal mechanisms have a linear relationship with the conflict generation parameter, and/or (3) there is sufficient data to inform the prior distribution. Yet, we know conf…

    Submitted 8 March, 2022; originally announced March 2022.

  12. arXiv:2201.13259  [pdf, other]

    cs.LG stat.ML

    Trajectory balance: Improved credit assignment in GFlowNets

    Authors: Nikolay Malkin, Moksh Jain, Emmanuel Bengio, Chen Sun, Yoshua Bengio

    Abstract: Generative flow networks (GFlowNets) are a method for learning a stochastic policy for generating compositional objects, such as graphs or strings, from a given unnormalized density by sequences of actions, where many possible action sequences may lead to the same object. We find previously proposed learning objectives for GFlowNets, flow matching and detailed balance, which are analogous to tempo…

    Submitted 4 October, 2023; v1 submitted 31 January, 2022; originally announced January 2022.

    Comments: NeurIPS 2022; see footnotes for code; v3 fixes minor errata

  13. arXiv:2110.05428  [pdf, other]

    stat.ML cs.AI cs.LG

    Learning Temporally Causal Latent Processes from General Temporal Data

    Authors: Weiran Yao, Yuewen Sun, Alex Ho, Changyin Sun, Kun Zhang

    Abstract: Our goal is to recover time-delayed latent causal variables and identify their relations from measured temporal data. Estimating causally-related latent variables from observations is particularly challenging as the latent variables are not uniquely recoverable in the most general case. In this work, we consider both a nonparametric, nonstationary setting and a parametric setting for the latent pr…

    Submitted 8 February, 2022; v1 submitted 11 October, 2021; originally announced October 2021.

    Comments: ICLR 2022: https://openreview.net/forum?id=RDlLMjLJXdq

  14. arXiv:2104.14281  [pdf]

    cs.CY cs.LG stat.AP

    Leveraging Online Shopping Behaviors as a Proxy for Personal Lifestyle Choices: New Insights into Chronic Disease Prevention Literacy

    Authors: Yongzhen Wang, Xiaozhong Liu, Katy Börner, Jun Lin, Yingnan Ju, Changlong Sun, Luo Si

    Abstract: Objective: Ubiquitous internet access is reshaping the way we live, but it is accompanied by unprecedented challenges in preventing chronic diseases that are usually planted by long exposure to unhealthy lifestyles. This paper proposes leveraging online shopping behaviors as a proxy for personal lifestyle choices to improve chronic disease prevention literacy, targeted for times when e-commerce us…

    Submitted 9 March, 2022; v1 submitted 29 April, 2021; originally announced April 2021.

    Comments: 58 pages with appendices, 5 figures, 17 tables

  15. arXiv:2010.12493  [pdf, other]

    cs.LG stat.ML

    A Review of Deep Learning Methods for Irregularly Sampled Medical Time Series Data

    Authors: Chenxi Sun, Shenda Hong, Moxian Song, Hongyan Li

    Abstract: Irregularly sampled time series (ISTS) data has irregular temporal intervals between observations and different sampling rates between sequences. ISTS commonly appears in healthcare, economics, and geoscience. Especially in the medical environment, the widely used Electronic Health Records (EHRs) have abundant typical irregularly sampled medical time series (ISMTS) data. Developing deep learning m…

    Submitted 26 October, 2020; v1 submitted 23 October, 2020; originally announced October 2020.

    Comments: 19 pages, 7 figures

  16. arXiv:2008.11922  [pdf, other]

    cs.IR cs.LG stat.ML

    Time-based Sequence Model for Personalization and Recommendation Systems

    Authors: Tigran Ishkhanov, Maxim Naumov, Xianjie Chen, Yan Zhu, Yuan Zhong, Alisson Gusatti Azzolini, Chonglin Sun, Frank Jiang, Andrey Malevich, Liang Xiong

    Abstract: In this paper we develop a novel recommendation model that explicitly incorporates time information. The model relies on an embedding layer and TSL attention-like mechanism with inner products in different vector spaces, that can be thought of as a modification of multi-headed attention. This mechanism allows the model to efficiently treat sequences of user behavior of different length. We study t…

    Submitted 27 August, 2020; originally announced August 2020.

    Comments: 17 pages, 7 figures

    MSC Class: 68T05 ACM Class: I.2.6; I.5.0; H.3.3; H.3.4

  17. arXiv:2006.11419  [pdf, other]

    cs.LG cs.AI stat.ML

    FISAR: Forward Invariant Safe Reinforcement Learning with a Deep Neural Network-Based Optimizer

    Authors: Chuangchuang Sun, Dong-Ki Kim, Jonathan P. How

    Abstract: This paper investigates reinforcement learning with constraints, which are indispensable in safety-critical environments. To drive the constraint violation to decrease monotonically, we take the constraints as Lyapunov functions and impose new linear constraints on the policy parameters' updating dynamics. As a result, the original safety set can be forward-invariant. However, because the new guarant…

    Submitted 5 May, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

    Comments: Accepted to ICML 2020 Workshop Theoretical Foundations of RL; Accepted to ICRA 2021

  18. arXiv:2006.06057  [pdf, other]

    cs.LG cs.AI stat.ML

    Scalable Partial Explainability in Neural Networks via Flexible Activation Functions

    Authors: Schyler C. Sun, Chen Li, Zhuangkun Wei, Antonios Tsourdos, Weisi Guo

    Abstract: Achieving transparency in black-box deep learning algorithms is still an open challenge. High dimensional features and decisions given by deep neural networks (NN) require new algorithms and methods to expose its mechanisms. Current state-of-the-art NN interpretation methods (e.g. Saliency maps, DeepLIFT, LIME, etc.) focus more on the direct relationship between NN outputs and inputs rather than t…

    Submitted 10 June, 2020; originally announced June 2020.

  19. arXiv:2006.05859  [pdf]

    econ.EM econ.GN stat.AP

    Trading Privacy for the Greater Social Good: How Did America React During COVID-19?

    Authors: Anindya Ghose, Beibei Li, Meghanath Macha, Chenshuo Sun, Natasha Ying Zhang Foutz

    Abstract: Digital contact tracing and analysis of social distancing from smartphone location data are two prime examples of non-therapeutic interventions used in many countries to mitigate the impact of the COVID-19 pandemic. While many understand the importance of trading personal privacy for the public good, others have been alarmed at the potential for surveillance via measures enabled through location t…

    Submitted 10 June, 2020; originally announced June 2020.

  20. arXiv:2005.04259  [pdf, other]

    cs.CV cs.LG stat.ML

    VectorNet: Encoding HD Maps and Agent Dynamics from Vectorized Representation

    Authors: Jiyang Gao, Chen Sun, Hang Zhao, Yi Shen, Dragomir Anguelov, Congcong Li, Cordelia Schmid

    Abstract: Behavior prediction in dynamic, multi-agent systems is an important problem in the context of self-driving cars, due to the complex representations and interactions of road components, including moving agents (e.g. pedestrians and vehicles) and road context information (e.g. lanes, traffic lights). This paper introduces VectorNet, a hierarchical graph neural network that first exploits the spatial…

    Submitted 8 May, 2020; originally announced May 2020.

    Comments: CVPR 2020

  21. arXiv:2004.13970  [pdf, other]

    cs.LG stat.ML

    Directed Graph Convolutional Network

    Authors: Zekun Tong, Yuxuan Liang, Changsheng Sun, David S. Rosenblum, Andrew Lim

    Abstract: Graph Convolutional Networks (GCNs) have been widely used due to their outstanding performance in processing graph-structured data. However, the undirected graphs limit their application scope. In this paper, we extend spectral-based graph convolution to directed graphs by using first- and second-order proximity, which can not only retain the connection properties of the directed graph, but also e…

    Submitted 29 April, 2020; originally announced April 2020.

  22. arXiv:2003.04994  [pdf, ps, other]

    cs.CL cs.LG stat.ML

    Masking Orchestration: Multi-task Pretraining for Multi-role Dialogue Representation Learning

    Authors: Tianyi Wang, Yating Zhang, Xiaozhong Liu, Changlong Sun, Qiong Zhang

    Abstract: Multi-role dialogue understanding comprises a wide range of diverse tasks such as question answering, act classification, dialogue summarization etc. While dialogue corpora are abundantly available, labeled data, for specific learning tasks, can be highly scarce and expensive. In this work, we investigate dialogue context representation learning with various types of unsupervised pretraining tasks wh…

    Submitted 26 February, 2020; originally announced March 2020.

    Comments: 8 pages, 4 figures, AAAI2020

  23. arXiv:2003.03609  [pdf, other]

    cs.LG stat.ML

    RCC-Dual-GAN: An Efficient Approach for Outlier Detection with Few Identified Anomalies

    Authors: Zhe Li, Chunhua Sun, Chunli Liu, Xiayu Chen, Meng Wang, Yezheng Liu

    Abstract: Outlier detection is an important task in data mining and many technologies have been explored in various applications. However, due to the default assumption that outliers are non-concentrated, unsupervised outlier detection may not correctly detect group anomalies with higher density levels. As for the supervised outlier detection, although high detection rates and optimal parameters can usually…

    Submitted 7 March, 2020; originally announced March 2020.

  24. arXiv:2001.07631  [pdf, other]

    cs.LG cs.CV stat.ML

    HRFA: High-Resolution Feature-based Attack

    Authors: Zhixing Ye, Sizhe Chen, Peidong Zhang, Chengjin Sun, Xiaolin Huang

    Abstract: Adversarial attacks have long been developed for revealing the vulnerability of Deep Neural Networks (DNNs) by adding imperceptible perturbations to the input. Most methods generate perturbations like normal noise, which is not interpretable and without semantic meaning. In this paper, we propose High-Resolution Feature-based Attack (HRFA), yielding authentic adversarial examples with up to…

    Submitted 22 October, 2020; v1 submitted 21 January, 2020; originally announced January 2020.

  25. arXiv:2001.06325  [pdf, other]

    cs.LG stat.ML

    Universal Adversarial Attack on Attention and the Resulting Dataset DAmageNet

    Authors: Sizhe Chen, Zhengbao He, Chengjin Sun, Jie Yang, Xiaolin Huang

    Abstract: Adversarial attacks on deep neural networks (DNNs) have been found for several years. However, the existing adversarial attacks have high success rates only when the information of the victim DNN is well-known or could be estimated by the structure similarity or massive queries. In this paper, we propose to Attack on Attention (AoA), a semantic property commonly shared by DNNs. AoA enjoys a signif…

    Submitted 21 October, 2020; v1 submitted 16 January, 2020; originally announced January 2020.

    Comments: accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

  26. arXiv:2001.00784  [pdf, other]

    cs.LG stat.ML

    Optimizing Wireless Systems Using Unsupervised and Reinforced-Unsupervised Deep Learning

    Authors: Dong Liu, Chengjian Sun, Chenyang Yang, Lajos Hanzo

    Abstract: Resource allocation and transceivers in wireless networks are usually designed by solving optimization problems subject to specific constraints, which can be formulated as variable or functional optimization. If the objective and constraint functions of a variable optimization problem can be derived, standard numerical algorithms can be applied for finding the optimal solution, which however incur…

    Submitted 3 January, 2020; originally announced January 2020.

    Comments: To appear in IEEE Network Magazine

  27. arXiv:1912.07160  [pdf, other]

    cs.LG cs.CV eess.IV stat.ML

    DAmageNet: A Universal Adversarial Dataset

    Authors: Sizhe Chen, Xiaolin Huang, Zhengbao He, Chengjin Sun

    Abstract: It is now well known that deep neural networks (DNNs) are vulnerable to adversarial attack. Adversarial samples are similar to the clean ones, but are able to cheat the attacked DNN to produce incorrect predictions in high confidence. But most of the existing adversarial attacks have high success rate only when the information of the attacked DNN is well-known or could be estimated by massive quer…

    Submitted 15 December, 2019; originally announced December 2019.

  28. arXiv:1911.03183  [pdf, other]

    cs.LG cs.CR cs.DC stat.ML

    Privacy-Preserving Generalized Linear Models using Distributed Block Coordinate Descent

    Authors: Erik-Jan van Kesteren, Chang Sun, Daniel L. Oberski, Michel Dumontier, Lianne Ippel

    Abstract: Combining data from varied sources has considerable potential for knowledge discovery: collaborating data parties can mine data in an expanded feature space, allowing them to explore a larger range of scientific questions. However, data sharing among different parties is highly restricted by legal conditions, ethical concerns, and / or data volume. Fueled by these concerns, the fields of cryptogra…

    Submitted 8 November, 2019; originally announced November 2019.

    Comments: Fully reproducible code for all results and images can be found at https://github.com/vankesteren/privacy-preserving-glm, and the software package can be found at https://github.com/vankesteren/privreg

  29. arXiv:1907.12706  [pdf, other]

    cs.LG eess.SP stat.ML

    Model-Free Unsupervised Learning for Optimization Problems with Constraints

    Authors: Chengjian Sun, Dong Liu, Chenyang Yang

    Abstract: In many optimization problems in wireless communications, the expressions of objective function or constraints are hard or even impossible to derive, which makes the solutions difficult to find. In this paper, we propose a model-free learning framework to solve constrained optimization problems without the supervision of the optimal solution. Neural networks are used respectively for parameterizin…

    Submitted 29 July, 2019; originally announced July 2019.

    Comments: Submitted to Asia-Pacific Conference on Communications (APCC)

  30. arXiv:1906.05743  [pdf, other]

    cs.LG cs.CV stat.ML

    Learning Video Representations using Contrastive Bidirectional Transformer

    Authors: Chen Sun, Fabien Baradel, Kevin Murphy, Cordelia Schmid

    Abstract: This paper proposes a self-supervised learning approach for video features that results in significantly improved performance on downstream tasks (such as video classification, captioning and segmentation) compared to existing methods. Our method extends the BERT model for text sequences to the case of sequences of real-valued feature vectors, by replacing the softmax loss with noise contrastive e…

    Submitted 27 September, 2019; v1 submitted 13 June, 2019; originally announced June 2019.

  31. arXiv:1905.13014  [pdf, ps, other]

    cs.NI eess.SP stat.ML

    Unsupervised Deep Learning for Ultra-reliable and Low-latency Communications

    Authors: Chengjian Sun, Chenyang Yang

    Abstract: In this paper, we study how to solve resource allocation problems in ultra-reliable and low-latency communications by unsupervised deep learning, which often yield functional optimization problems with quality-of-service (QoS) constraints. We take a joint power and bandwidth allocation problem as an example, which minimizes the total bandwidth required to guarantee the QoS of each user in terms of…

    Submitted 5 June, 2019; v1 submitted 25 April, 2019; originally announced May 2019.

    Comments: 6 pages, 1 figure, submitted to IEEE for possible publication. arXiv admin note: text overlap with arXiv:1905.11017

  32. arXiv:1905.11017  [pdf, ps, other]

    cs.LG eess.SP stat.ML

    Learning to Optimize with Unsupervised Learning: Training Deep Neural Networks for URLLC

    Authors: Chengjian Sun, Chenyang Yang

    Abstract: Learning the optimized solution as a function of environmental parameters is effective in solving numerical optimization in real time for time-sensitive applications. Existing works of learning to optimize train deep neural networks (DNN) with labels, and the learnt solution are inaccurate, which cannot be employed to ensure the stringent quality of service. In this paper, we propose a framework t…

    Submitted 27 May, 2019; originally announced May 2019.

    Comments: 7 pages, 1 figure, submitted to IEEE for possible publication

  33. arXiv:1905.06744  [pdf, other]

    eess.SP cs.LG stat.ML

    Forecasting Wireless Demand with Extreme Values using Feature Embedding in Gaussian Processes

    Authors: Chengyao Sun, Weisi Guo

    Abstract: Wireless traffic prediction is a fundamental enabler to proactive network optimisation in beyond 5G. Forecasting extreme demand spikes and troughs due to traffic mobility is essential to avoiding outages and improving energy efficiency. Current state-of-the-art deep learning forecasting methods predominantly focus on overall forecast performance and do not offer probabilistic uncertainty quantific…

    Submitted 1 November, 2019; v1 submitted 15 May, 2019; originally announced May 2019.

  34. arXiv:1902.09641  [pdf, other]

    cs.LG cs.CV stat.ML

    Stochastic Prediction of Multi-Agent Interactions from Partial Observations

    Authors: Chen Sun, Per Karlsson, Jiajun Wu, Joshua B Tenenbaum, Kevin Murphy

    Abstract: We present a method that learns to integrate temporal information, from a learned dynamics model, with ambiguous visual information, from a learned vision model, in the context of interacting agents. Our method is based on a graph-structured variational recurrent neural network (Graph-VRNN), which is trained end-to-end to infer the current state of the (partially observed) world, as well as to for…

    Submitted 25 February, 2019; originally announced February 2019.

    Comments: ICLR 2019 camera ready

  35. arXiv:1812.02035  [pdf, other]

    cs.LG stat.ML

    Stochastic Model Pruning via Weight Dropping Away and Back

    Authors: Haipeng Jia, Xueshuang Xiang, Da Fan, Meiyu Huang, Changhao Sun, Yang He

    Abstract: Deep neural networks have dramatically achieved great success on a variety of challenging tasks. However, most successful DNNs have an extremely complex structure, leading to extensive research on model compression. As a significant area of progress in model compression, traditional gradual pruning approaches involve an iterative prune-retrain procedure and may suffer from two critical issues: loca…

    Submitted 9 April, 2020; v1 submitted 5 December, 2018; originally announced December 2018.

  36. arXiv:1809.00594  [pdf, other]

    cs.LG cs.CV stat.ML

    Adversarial Attack Type I: Cheat Classifiers by Significant Changes

    Authors: Sanli Tang, Xiaolin Huang, Mingjian Chen, Chengjin Sun, Jie Yang

    Abstract: Despite the great success of deep neural networks, the adversarial attack can cheat some well-trained classifiers by small permutations. In this paper, we propose another type of adversarial attack that can cheat classifiers by significant changes. For example, we can significantly change a face but well-trained neural networks still recognize the adversarial and the original example as the same p…

    Submitted 22 July, 2019; v1 submitted 3 September, 2018; originally announced September 2018.