-
Optimizing VarLiNGAM for Scalable and Efficient Time Series Causal Discovery
Authors:
Ziyang Jiao,
Ce Guo,
Wayne Luk
Abstract:
Causal discovery is designed to identify causal relationships in data, a task that has become increasingly complex due to the computational demands of traditional methods such as VarLiNGAM, which combines Vector Autoregressive Model with Linear Non-Gaussian Acyclic Model for time series data.
This study is dedicated to optimising causal discovery specifically for time series data, which is common in practical applications. Time series causal discovery is particularly challenging due to the need to account for temporal dependencies and potential time lag effects. By designing a specialised dataset generator and reducing the computational complexity of the VarLiNGAM model from \( O(m^3 \cdot n) \) to \( O(m^3 + m^2 \cdot n) \), this study significantly improves the feasibility of processing large datasets. The proposed methods have been validated on advanced computational platforms and tested across simulated, real-world, and large-scale datasets, showcasing enhanced efficiency and performance. The optimised algorithm achieved 7 to 13 times speedup compared with the original algorithm and around 4.5 times speedup compared with the GPU-accelerated version on large-scale datasets with feature sizes between 200 and 400.
Our methods aim to push the boundaries of current causal discovery capabilities, making them more robust, scalable, and applicable to real-world scenarios, thus facilitating breakthroughs in various fields such as healthcare and finance.
Submitted 9 September, 2024;
originally announced September 2024.
-
ROSE: Register Assisted General Time Series Forecasting with Decomposed Frequency Learning
Authors:
Yihang Wang,
Yuying Qiu,
Peng Chen,
Kai Zhao,
Yang Shu,
Zhongwen Rao,
Lujia Pan,
Bin Yang,
Chenjuan Guo
Abstract:
With the increasing collection of time series data from various domains, there arises a strong demand for general time series forecasting models pre-trained on a large number of time-series datasets to support a variety of downstream prediction tasks. Enabling general time series forecasting faces two challenges: how to obtain unified representations from multi-domain time series data, and how to capture domain-specific features from time series data across various domains for adaptive transfer in downstream tasks. To address these challenges, we propose a Register Assisted General Time Series Forecasting Model with Decomposed Frequency Learning (ROSE), a novel pre-trained model for time series forecasting. ROSE employs Decomposed Frequency Learning for the pre-training task, which decomposes coupled semantic and periodic information in time series with frequency-based masking and reconstruction to obtain unified representations across domains. We also equip ROSE with a Time Series Register, which learns to generate a register codebook to capture domain-specific representations during pre-training and enhances domain-adaptive transfer by selecting related register tokens on downstream tasks. After pre-training on large-scale time series data, ROSE achieves state-of-the-art forecasting performance on 8 real-world benchmarks. Remarkably, even in few-shot scenarios, it demonstrates competitive or superior performance compared to existing methods trained with full data.
Submitted 24 May, 2024;
originally announced May 2024.
-
Guarantees of confidentiality via Hammersley-Chapman-Robbins bounds
Authors:
Kamalika Chaudhuri,
Chuan Guo,
Laurens van der Maaten,
Saeed Mahloujifar,
Mark Tygert
Abstract:
Protecting privacy during inference with deep neural networks is possible by adding noise to the activations in the last layers prior to the final classifiers or other task-specific layers. The activations in such layers are known as "features" (or, less commonly, as "embeddings" or "feature embeddings"). The added noise helps prevent reconstruction of the inputs from the noisy features. Lower bounding the variance of every possible unbiased estimator of the inputs quantifies the confidentiality arising from such added noise. Convenient, computationally tractable bounds are available from classic inequalities of Hammersley and of Chapman and Robbins -- the HCR bounds. Numerical experiments indicate that the HCR bounds are on the precipice of being effectual for small neural nets with the data sets, "MNIST" and "CIFAR-10," which contain 10 classes each for image classification. The HCR bounds appear to be insufficient on their own to guarantee confidentiality of the inputs to inference with standard deep neural nets, "ResNet-18" and "Swin-T," pre-trained on the data set, "ImageNet-1000," which contains 1000 classes. Supplementing the addition of noise to features with other methods for providing confidentiality may be warranted in the case of ImageNet. In all cases, the results reported here limit consideration to amounts of added noise that incur little degradation in the accuracy of classification from the noisy features. Thus, the added noise enhances confidentiality without much reduction in the accuracy on the task of image classification.
Submitted 17 June, 2024; v1 submitted 3 April, 2024;
originally announced April 2024.
-
Iterative missing value imputation based on feature importance
Authors:
Cong Guo,
Chun Liu,
Wei Yang
Abstract:
Many datasets suffer from missing values due to various reasons, which not only increases the processing difficulty of related tasks but also reduces the accuracy of classification. To address this problem, the mainstream approach is to use missing value imputation to complete the dataset. Existing imputation methods estimate the missing parts based on the observed values in the original feature space, and they treat all features as equally important during data completion, while in fact different features have different importance. Therefore, we have designed an imputation method that considers feature importance. This algorithm iteratively performs matrix completion and feature importance learning, and specifically, matrix completion is based on a filling loss that incorporates feature importance. Our experimental analysis involves three types of datasets: synthetic datasets with different noisy features and missing values, real-world datasets with artificially generated missing values, and real-world datasets originally containing missing values. The results on these datasets consistently show that the proposed method outperforms five existing imputation algorithms. To the best of our knowledge, this is the first work that considers feature importance in the imputation model.
Submitted 14 November, 2023;
originally announced November 2023.
-
Inexact iterative numerical linear algebra for neural network-based spectral estimation and rare-event prediction
Authors:
John Strahan,
Spencer C. Guo,
Chatipat Lorpaiboon,
Aaron R. Dinner,
Jonathan Weare
Abstract:
Understanding dynamics in complex systems is challenging because there are many degrees of freedom, and those that are most important for describing events of interest are often not obvious. The leading eigenfunctions of the transition operator are useful for visualization, and they can provide an efficient basis for computing statistics such as the likelihood and average time of events (predictions). Here we develop inexact iterative linear algebra methods for computing these eigenfunctions (spectral estimation) and making predictions from a data set of short trajectories sampled at finite intervals. We demonstrate the methods on a low-dimensional model that facilitates visualization and a high-dimensional model of a biomolecular system. Implications for the prediction problem in reinforcement learning are discussed.
Submitted 20 July, 2023; v1 submitted 22 March, 2023;
originally announced March 2023.
-
Origins of Low-dimensional Adversarial Perturbations
Authors:
Elvis Dohmatob,
Chuan Guo,
Morgane Goibert
Abstract:
In this paper, we initiate a rigorous study of the phenomenon of low-dimensional adversarial perturbations (LDAPs) in classification. Unlike the classical setting, these perturbations are limited to a subspace of dimension $k$ which is much smaller than the dimension $d$ of the feature space. The case $k=1$ corresponds to so-called universal adversarial perturbations (UAPs; Moosavi-Dezfooli et al., 2017). First, we consider binary classifiers under generic regularity conditions (including ReLU networks) and compute analytical lower bounds for the fooling rate of any subspace. These bounds explicitly highlight the dependence of the fooling rate on the pointwise margin of the model (i.e., the ratio of the output to the $L_2$ norm of its gradient at a test point), and on the alignment of the given subspace with the gradients of the model w.r.t. inputs. Our results provide a rigorous explanation for the recent success of heuristic methods for efficiently generating low-dimensional adversarial perturbations. Finally, we show that if a decision region is compact, then it admits a universal adversarial perturbation with $L_2$ norm which is $\sqrt{d}$ times smaller than the typical $L_2$ norm of a data point. Our theoretical results are confirmed by experiments on both synthetic and real data.
Submitted 4 July, 2022; v1 submitted 25 March, 2022;
originally announced March 2022.
-
Model Averaging based Semiparametric Modelling for Conditional Quantile Prediction
Authors:
Chaohui Guo,
Wenyang Zhang
Abstract:
In real data analysis, the underlying model is usually unknown, so the modelling strategy plays a key role in the success of data analysis. Stimulated by the idea of model averaging, we propose a novel semiparametric modelling strategy for conditional quantile prediction, without assuming the underlying model is any specific parametric or semiparametric model. Thanks to the optimality of the weights selected by cross-validation, the proposed modelling strategy results in more accurate prediction than that based on some commonly used semiparametric models, such as the varying coefficient models and additive models. Asymptotic properties of the proposed modelling strategy, together with its estimation procedure, are established. Intensive simulation studies are conducted to demonstrate how well the proposed method works, compared with its alternatives under various circumstances. The results show the proposed method indeed leads to more accurate predictions than its alternatives. Finally, the proposed modelling strategy together with its prediction procedure is applied to the Boston housing data, which results in more accurate predictions of the quantiles of the house prices than those based on some commonly used alternative methods and therefore presents a more accurate picture of the housing market in Boston.
Submitted 18 March, 2022;
originally announced March 2022.
-
Online Adaptation to Label Distribution Shift
Authors:
Ruihan Wu,
Chuan Guo,
Yi Su,
Kilian Q. Weinberger
Abstract:
Machine learning models often encounter distribution shifts when deployed in the real world. In this paper, we focus on adaptation to label distribution shift in the online setting, where the test-time label distribution is continually changing and the model must dynamically adapt to it without observing the true label. Leveraging a novel analysis, we show that the lack of true label does not hinder estimation of the expected test loss, which enables the reduction of online label shift adaptation to conventional online learning. Informed by this observation, we propose adaptation algorithms inspired by classical online learning techniques such as Follow The Leader (FTL) and Online Gradient Descent (OGD) and derive their regret bounds. We empirically verify our findings under both simulated and real world label distribution shifts and show that OGD is particularly effective and robust to a variety of challenging label shift scenarios.
Submitted 5 January, 2022; v1 submitted 9 July, 2021;
originally announced July 2021.
-
An Analysis of Alternating Direction Method of Multipliers for Feed-forward Neural Networks
Authors:
Seyedeh Niusha Alavi Foumani,
Ce Guo,
Wayne Luk
Abstract:
In this work, we present a hardware-compatible neural network training algorithm that uses the alternating direction method of multipliers (ADMM) and iterative least-squares methods. The motivation behind this approach was to devise a method of training neural networks that is scalable and can be parallelised. These characteristics make this algorithm suitable for hardware implementation. We have achieved 6.9\% and 6.8\% better accuracy compared to SGD and Adam respectively, with a four-layer neural network with hidden size of 28 on the HIGGS dataset. Likewise, we observed 21.0\% and 2.2\% accuracy improvement compared to SGD and Adam respectively, on the IRIS dataset with a three-layer neural network with hidden size of 8. At the same time, the method avoids matrix inversion, which is challenging for hardware implementation. We assessed the impact of avoiding matrix inversion on ADMM accuracy and observed that we can safely replace matrix inversion with iterative least-squares methods and maintain the desired performance. Also, the computational complexity of the implemented method is polynomial in the dimensions of the input dataset and the hidden size of the network.
Submitted 6 September, 2020;
originally announced September 2020.
-
An Intelligent CNN-VAE Text Representation Technology Based on Text Semantics for Comprehensive Big Data
Authors:
Genggeng Liu,
Canyang Guo,
Lin Xie,
Wenxi Liu,
Naixue Xiong,
Guolong Chen
Abstract:
In the era of big data, the large amount of text data generated by the Internet has given rise to a variety of text representation methods. In natural language processing (NLP), text representation transforms text into vectors that can be processed by computer without losing the original semantic information. However, these methods have difficulty effectively extracting the semantic features among words and distinguishing polysemy in language. Therefore, a text feature representation model based on convolutional neural network (CNN) and variational autoencoder (VAE) is proposed to extract the text features and apply the obtained text feature representation to text classification tasks. CNN is used to extract the features of the text vector to capture the semantics among words, and VAE is introduced to make the text feature space more consistent with a Gaussian distribution. In addition, the output of the improved word2vec model is employed as the input of the proposed model to distinguish different meanings of the same word in different contexts. The experimental results show that the proposed model outperforms in k-nearest neighbor (KNN), random forest (RF) and support vector machine (SVM) classification algorithms.
Submitted 28 August, 2020;
originally announced August 2020.
-
Software Engineering Event Modeling using Relative Time in Temporal Knowledge Graphs
Authors:
Kian Ahrabian,
Daniel Tarlow,
Hehuimin Cheng,
Jin L. C. Guo
Abstract:
We present a multi-relational temporal Knowledge Graph based on the daily interactions between artifacts in GitHub, one of the largest social coding platforms. Such representation enables posing many user-activity and project management questions as link prediction and time queries over the knowledge graph. In particular, we introduce two new datasets for i) interpolated time-conditioned link prediction and ii) extrapolated time-conditioned link/time prediction queries, each with distinguished properties. Our experiments on these datasets highlight the potential of adapting knowledge graphs to answer broad software engineering questions. They also reveal the unsatisfactory performance of existing temporal models on extrapolated queries and time prediction queries in general. To overcome these shortcomings, we introduce an extension to current temporal models using relative temporal information with regard to past events.
Submitted 12 July, 2020; v1 submitted 2 July, 2020;
originally announced July 2020.
-
Drug-Drug Interaction Prediction with Wasserstein Adversarial Autoencoder-based Knowledge Graph Embeddings
Authors:
Yuanfei Dai,
Chenhao Guo,
Wenzhong Guo,
Carsten Eickhoff
Abstract:
Interaction between pharmacological agents can trigger unexpected adverse events. Capturing richer and more comprehensive information about drug-drug interactions (DDI) is one of the key tasks in public health and drug development. Recently, several knowledge graph embedding approaches have received increasing attention in the DDI domain due to their capability of projecting drugs and interactions into a low-dimensional feature space for predicting links and classifying triplets. However, existing methods only apply a uniformly random mode to construct negative samples. As a consequence, these samples are often too simplistic to train an effective model. In this paper, we propose a new knowledge graph embedding framework by introducing adversarial autoencoders (AAE) based on Wasserstein distances and Gumbel-Softmax relaxation for drug-drug interactions tasks. In our framework, the autoencoder is employed to generate high-quality negative samples and the hidden vector of the autoencoder is regarded as a plausible drug candidate. Afterwards, the discriminator learns the embeddings of drugs and interactions based on both positive and negative triplets. Meanwhile, in order to solve vanishing gradient problems on the discrete representation--an inherent flaw in traditional generative models--we utilize the Gumbel-Softmax relaxation and the Wasserstein distance to train the embedding model steadily. We empirically evaluate our method on two tasks, link prediction and DDI classification. The experimental results show that our framework can attain significant improvements and noticeably outperform competitive baselines.
Submitted 15 October, 2020; v1 submitted 15 April, 2020;
originally announced April 2020.
-
On Hiding Neural Networks Inside Neural Networks
Authors:
Chuan Guo,
Ruihan Wu,
Kilian Q. Weinberger
Abstract:
Modern neural networks often contain significantly more parameters than the size of their training data. We show that this excess capacity provides an opportunity for embedding secret machine learning models within a trained neural network. Our novel framework hides the existence of a secret neural network with arbitrary desired functionality within a carrier network. We prove theoretically that the secret network's detection is computationally infeasible and demonstrate empirically that the carrier network does not compromise the secret network's disguise. Our paper introduces a previously unknown steganographic technique that can be exploited by adversaries if left unchecked.
Submitted 21 May, 2021; v1 submitted 24 February, 2020;
originally announced February 2020.
-
Secure multiparty computations in floating-point arithmetic
Authors:
Chuan Guo,
Awni Hannun,
Brian Knott,
Laurens van der Maaten,
Mark Tygert,
Ruiyu Zhu
Abstract:
Secure multiparty computations enable the distribution of so-called shares of sensitive data to multiple parties such that the multiple parties can effectively process the data while being unable to glean much information about the data (at least not without collusion among all parties to put back together all the shares). Thus, the parties may conspire to send all their processed results to a trusted third party (perhaps the data provider) at the conclusion of the computations, with only the trusted third party being able to view the final results. Secure multiparty computations for privacy-preserving machine-learning turn out to be possible using solely standard floating-point arithmetic, at least with a carefully controlled leakage of information less than the loss of accuracy due to roundoff, all backed by rigorous mathematical proofs of worst-case bounds on information loss and numerical stability in finite-precision arithmetic. Numerical examples illustrate the high performance attained on commodity off-the-shelf hardware for generalized linear models, including ordinary linear least-squares regression, binary and multinomial logistic regression, probit regression, and Poisson regression.
Submitted 9 January, 2020;
originally announced January 2020.
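The idea of distributing shares of a sensitive value and processing them in ordinary floating-point arithmetic can be illustrated with a minimal sketch. This is not the paper's protocol: the uniform masks, the `mask_scale` parameter, and the function names `make_shares` and `reconstruct` are assumptions chosen for illustration, and the sketch makes no claim about how much information the shares leak.

```python
import math
import random


def make_shares(x, n_parties=3, mask_scale=1e6, rng=None):
    """Split x into additive shares (hypothetical helper, not the paper's API).

    The first n_parties - 1 shares are random masks; the last share is chosen
    so that all shares sum back to x, up to floating-point roundoff.
    """
    rng = rng or random.Random(0)
    masks = [rng.uniform(-mask_scale, mask_scale) for _ in range(n_parties - 1)]
    last = x - math.fsum(masks)
    return masks + [last]


def reconstruct(shares):
    """Recombine shares; only a party holding every share can do this."""
    return math.fsum(shares)


secret = 3.25
public_weight = 2.0  # a public scalar that every party applies locally

shares = make_shares(secret)
# Each party scales its own share without seeing the others (a linear operation).
processed = [public_weight * s for s in shares]
result = reconstruct(processed)
print(result)  # close to public_weight * secret, up to roundoff
```

Because the masks are much larger than the secret, reconstruction is exact only up to roundoff at the scale of the masks, which is the kind of accuracy trade-off the abstract refers to.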
-
Certified Data Removal from Machine Learning Models
Authors:
Chuan Guo,
Tom Goldstein,
Awni Hannun,
Laurens van der Maaten
Abstract:
Good data stewardship requires removal of data at the request of the data's owner. This raises the question if and how a trained machine-learning model, which implicitly stores information about its training data, should be affected by such a removal request. Is it possible to "remove" data from a machine-learning model? We study this problem by defining certified removal: a very strong theoretical guarantee that a model from which data is removed cannot be distinguished from a model that never observed the data to begin with. We develop a certified-removal mechanism for linear classifiers and empirically study learning settings in which this mechanism is practical.
Submitted 7 November, 2023; v1 submitted 7 November, 2019;
originally announced November 2019.
-
A New Defense Against Adversarial Images: Turning a Weakness into a Strength
Authors:
Tao Yu,
Shengyuan Hu,
Chuan Guo,
Wei-Lun Chao,
Kilian Q. Weinberger
Abstract:
Natural images are virtually surrounded by low-density misclassified regions that can be efficiently discovered by gradient-guided search --- enabling the generation of adversarial images. While many techniques for detecting these attacks have been proposed, they are easily bypassed when the adversary has full knowledge of the detection mechanism and adapts the attack strategy accordingly. In this paper, we adopt a novel perspective and regard the omnipresence of adversarial perturbations as a strength rather than a weakness. We postulate that if an image has been tampered with, these adversarial directions either become harder to find with gradient methods or have substantially higher density than for natural images. We develop a practical test for this signature characteristic to successfully detect adversarial attacks, achieving unprecedented accuracy under the white-box setting where the adversary is given full knowledge of our detection mechanism.
Submitted 3 December, 2019; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Canonical Correlation Analysis (CCA) Based Multi-View Learning: An Overview
Authors:
Chenfeng Guo,
Dongrui Wu
Abstract:
Multi-view learning (MVL) is a strategy for fusing data from different sources or subsets. Canonical correlation analysis (CCA) is very important in MVL, whose main idea is to map data from different views onto a common space with maximum correlation. Traditional CCA can only be used to calculate the linear correlation of two views. Besides, it is unsupervised and the label information is wasted. Many nonlinear, supervised, or generalized extensions have been proposed to overcome these limitations. However, to our knowledge, there is no overview for these approaches. This paper provides an overview of many representative CCA-based MVL approaches.
Submitted 1 May, 2021; v1 submitted 2 July, 2019;
originally announced July 2019.
-
Simple Black-box Adversarial Attacks
Authors:
Chuan Guo,
Jacob R. Gardner,
Yurong You,
Andrew Gordon Wilson,
Kilian Q. Weinberger
Abstract:
We propose an intriguingly simple method for the construction of adversarial images in the black-box setting. In contrast to the white-box scenario, constructing black-box adversarial images has the additional constraint on query budget, and efficient attacks remain an open problem to date. With only the mild assumption of continuous-valued confidence scores, our highly query-efficient algorithm utilizes the following simple iterative principle: we randomly sample a vector from a predefined orthonormal basis and either add it to or subtract it from the target image. Despite its simplicity, the proposed method can be used for both untargeted and targeted attacks -- resulting in unprecedented query efficiency in both settings. We demonstrate the efficacy and efficiency of our algorithm on several real world settings including the Google Cloud Vision API. We argue that our proposed algorithm should serve as a strong baseline for future black-box attacks, in particular because it is extremely fast and its implementation requires less than 20 lines of PyTorch code.
Submitted 15 August, 2019; v1 submitted 17 May, 2019;
originally announced May 2019.
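The iterative principle stated in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: it uses NumPy rather than PyTorch, assumes the standard (per-coordinate) basis as the orthonormal basis, covers only the untargeted case, and the names `simple_blackbox_attack`, `toy_prob_fn`, `eps`, and `max_queries` are hypothetical.

```python
import numpy as np


def toy_prob_fn(x):
    """Hypothetical stand-in for a black-box model: a fixed 2-class softmax."""
    W = np.array([[1.0, -1.0, 0.5], [-1.0, 1.0, -0.5]])
    logits = W @ np.asarray(x, dtype=float).ravel()
    e = np.exp(logits - logits.max())
    return e / e.sum()


def simple_blackbox_attack(x, prob_fn, label, eps=0.2, max_queries=1000, seed=0):
    """Lower prob_fn(x)[label] using only confidence-score queries.

    Assumption: the orthonormal basis is the standard basis, so each step
    perturbs one coordinate by +eps or -eps.
    """
    rng = np.random.default_rng(seed)
    x_adv = np.asarray(x, dtype=float).copy()
    best = prob_fn(x_adv)[label]
    queries = 1
    # Visit basis vectors in random order, without replacement.
    for idx in rng.permutation(x_adv.size):
        if queries >= max_queries:
            break  # budget is checked once per basis vector
        for sign in (1.0, -1.0):
            candidate = x_adv.copy()
            candidate.flat[idx] += sign * eps
            p = prob_fn(candidate)[label]
            queries += 1
            if p < best:
                # Keep the step only if the confidence in `label` dropped.
                x_adv, best = candidate, p
                break
    return x_adv, best, queries


if __name__ == "__main__":
    x0 = np.array([1.0, 0.0, 0.5])
    x_adv, p_adv, n_queries = simple_blackbox_attack(x0, toy_prob_fn, label=0)
    print(toy_prob_fn(x0)[0], p_adv, n_queries)
```

Because each basis vector is used at most once, every coordinate of the perturbation is bounded by `eps` in this sketch.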
-
Adversarial Defense Through Network Profiling Based Path Extraction
Authors:
Yuxian Qiu,
Jingwen Leng,
Cong Guo,
Quan Chen,
Chao Li,
Minyi Guo,
Yuhao Zhu
Abstract:
Recently, researchers have started decomposing deep neural network models according to their semantics or functions. Recent work has shown the effectiveness of decomposed functional blocks for defending against adversarial attacks, which add small perturbations to the input image to fool the DNN models. This work proposes a profiling-based method to decompose DNN models into different functional blocks, which leads to the effective path as a new approach to exploring DNNs' internal organization. Specifically, the per-image effective paths can be aggregated into a class-level effective path, through which we observe that adversarial images activate effective paths different from those of normal images. We propose an effective path similarity-based method to detect adversarial images with an interpretable model, which achieves better accuracy and broader applicability than the state-of-the-art technique.
Submitted 9 May, 2019; v1 submitted 17 April, 2019;
originally announced April 2019.
-
Recurrent Multi-Graph Neural Networks for Travel Cost Prediction
Authors:
Jilin Hu,
Chenjuan Guo,
Bin Yang,
Christian S. Jensen,
Lu Chen
Abstract:
Origin-destination (OD) matrices are often used in urban planning, where a city is partitioned into regions and an element (i, j) in an OD matrix records the cost (e.g., travel time, fuel consumption, or travel speed) from region i to region j. In this paper, we partition a day into multiple intervals, e.g., 96 15-min intervals, and each interval is associated with an OD matrix that represents the costs in the interval. We consider sparse and stochastic OD matrices, where the elements represent stochastic rather than deterministic costs and some elements are missing due to lack of data between two regions. We solve the sparse, stochastic OD matrix forecasting problem: given a sequence of historical OD matrices that are sparse, we aim to predict future OD matrices with no empty elements. We propose a generic learning framework that deals with sparse matrices via matrix factorization and two graph convolutional neural networks, and captures temporal dynamics via a recurrent neural network. Empirical studies using two taxi datasets from different countries verify the effectiveness of the proposed framework.
Submitted 13 November, 2018;
originally announced November 2018.
-
Correlated Time Series Forecasting using Deep Neural Networks: A Summary of Results
Authors:
Razvan-Gabriel Cirstea,
Darius-Valer Micu,
Gabriel-Marcel Muresan,
Chenjuan Guo,
Bin Yang
Abstract:
Cyber-physical systems often consist of entities that interact with each other over time. Meanwhile, as part of the continued digitization of industrial processes, various sensor technologies are deployed that enable us to record time-varying attributes (a.k.a., time series) of such entities, thus producing correlated time series. To enable accurate forecasting on such correlated time series, this paper proposes two models that combine convolutional neural networks (CNNs) and recurrent neural networks (RNNs). The first model employs a CNN on each individual time series, combines the convoluted features, and finally applies an RNN on top of the combined features to enable forecasting. The second model adds auto-encoders to the individual CNNs, making it a multi-task learning model that provides accurate and robust forecasting. Experiments on two real-world correlated time series data sets suggest that the two proposed models are effective and outperform baselines in most settings.
This report extends the paper "Correlated Time Series Forecasting using Multi-Task Deep Neural Networks," to appear in ACM CIKM 2018, by providing additional experimental results.
Submitted 30 August, 2018; v1 submitted 29 August, 2018;
originally announced August 2018.
-
Feature Dimensionality Reduction for Video Affect Classification: A Comparative Study
Authors:
Chenfeng Guo,
Dongrui Wu
Abstract:
Affective computing has become a very important research area in human-machine interaction. However, affects are subjective, subtle, and uncertain. So, it is very difficult to obtain a large number of labeled training samples, compared with the number of possible features we could extract. Thus, dimensionality reduction is critical in affective computing. This paper presents our preliminary study on dimensionality reduction for affect classification. Five popular dimensionality reduction approaches are introduced and compared. Experiments on the DEAP dataset showed that no approach can universally outperform others, and performing classification using the raw features directly may not always be a bad choice.
Submitted 8 August, 2018;
originally announced August 2018.
-
An empirical study on evaluation metrics of generative adversarial networks
Authors:
Qiantong Xu,
Gao Huang,
Yang Yuan,
Chuan Guo,
Yu Sun,
Felix Wu,
Kilian Weinberger
Abstract:
Evaluating generative adversarial networks (GANs) is inherently challenging. In this paper, we revisit several representative sample-based evaluation metrics for GANs, and address the problem of how to evaluate the evaluation metrics. We start with a few necessary conditions for metrics to produce meaningful scores, such as distinguishing real from generated samples, identifying mode dropping and mode collapsing, and detecting overfitting. With a series of carefully designed experiments, we comprehensively investigate existing sample-based metrics and identify their strengths and limitations in practical settings. Based on these results, we observe that kernel Maximum Mean Discrepancy (MMD) and the 1-Nearest-Neighbor (1-NN) two-sample test seem to satisfy most of the desirable properties, provided that the distances between samples are computed in a suitable feature space. Our experiments also unveil interesting properties about the behavior of several popular GAN models, such as whether they are memorizing training samples, and how far they are from learning the target distribution.
Submitted 16 August, 2018; v1 submitted 19 June, 2018;
originally announced June 2018.
-
OMG - Emotion Challenge Solution
Authors:
Yuqi Cui,
Xiao Zhang,
Yang Wang,
Chenfeng Guo,
Dongrui Wu
Abstract:
This short paper describes our solution to the 2018 IEEE World Congress on Computational Intelligence One-Minute Gradual-Emotional Behavior Challenge, whose goal was to estimate continuous arousal and valence values from short videos. We designed four base regression models using visual and audio features, and then used a spectral approach to fuse them to obtain improved performance.
Submitted 30 April, 2018;
originally announced May 2018.