-
Compact Multi-level Sparse Neural Networks with Input Independent Dynamic Rerouting
Authors:
Minghai Qin,
Tianyun Zhang,
Fei Sun,
Yen-Kuang Chen,
Makan Fardad,
Yanzhi Wang,
Yuan Xie
Abstract:
Deep neural networks (DNNs) have been shown to provide superb performance in many real-life applications, but their large computation cost and storage requirement have prevented them from being deployed to many edge and internet-of-things (IoT) devices. Sparse deep neural networks, in which the majority of weight parameters are zero, can substantially reduce the computation complexity and memory consumption of the models. In real-use scenarios, devices may suffer from large fluctuations of the available computation and memory resources under different environments, and the quality of service (QoS) is difficult to maintain due to long-tail inferences with large latency. Facing these real-life challenges, we propose to train a sparse model that supports multiple sparsity levels. That is, a hierarchical structure of weights is satisfied such that the locations and the values of the non-zero parameters of the more-sparse sub-model are a subset of those of the less-sparse sub-model. In this way, one can dynamically select the appropriate sparsity level during inference, while the storage cost is capped by the least sparse sub-model. We have verified our methodologies on a variety of DNN models and tasks, including ResNet-50, PointNet++, GNMT, and graph attention networks. We obtain sparse sub-models with an average of 13.38% weights and 14.97% FLOPs, while the accuracies are as good as their dense counterparts. More-sparse sub-models with 5.38% weights and 4.47% of FLOPs, which are subsets of the less-sparse ones, can be obtained with only 3.25% relative accuracy loss.
Submitted 20 December, 2021;
originally announced December 2021.
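The nesting property described in the abstract above (the non-zero weights of the more-sparse sub-model form a subset of those of the less-sparse one) can be illustrated with a minimal sketch. It assumes plain magnitude ranking to choose which weights to keep; the paper trains the model to satisfy the hierarchy, so this is not the authors' procedure. The names `nested_masks` and `keep_fractions` are hypothetical.

```python
import numpy as np

def nested_masks(weights, keep_fractions):
    """Return one boolean mask per keep fraction, all cut from one magnitude ranking.

    Assumption: weights are ranked by absolute value. Because every mask is a
    prefix of the same ranking, a smaller mask is always a subset of a larger one.
    """
    flat = np.abs(weights).ravel()
    order = np.argsort(-flat, kind="stable")  # indices, largest magnitude first
    masks = []
    for frac in keep_fractions:
        k = int(round(frac * flat.size))
        mask = np.zeros(flat.size, dtype=bool)
        mask[order[:k]] = True
        masks.append(mask.reshape(weights.shape))
    return masks

rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
less_sparse_mask, more_sparse_mask = nested_masks(W, [0.25, 0.125])

# Only the less-sparse weights need to be stored; the more-sparse sub-model
# is obtained at inference time by applying the smaller mask to them.
W_less_sparse = W * less_sparse_mask
W_more_sparse = W_less_sparse * more_sparse_mask
```

Storing `W_less_sparse` once and selecting a mask at run time is what caps the storage cost at the least sparse sub-model.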
-
A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods
Authors:
Tianyun Zhang,
Xiaolong Ma,
Zheng Zhan,
Shanglin Zhou,
Minghai Qin,
Fei Sun,
Yen-Kuang Chen,
Caiwen Ding,
Makan Fardad,
Yanzhi Wang
Abstract:
To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i.e., static regularization-based pruning and dynamic regularization-based pruning. However, the former method currently suffers from either complex workloads or accuracy degradation, while the latter takes a long time to tune the parameters to achieve the desired pruning rate without accuracy loss. In this paper, we propose a unified DNN weight pruning framework with dynamically updated regularization terms bounded by the designated constraint, which can generate both non-structured sparsity and different kinds of structured sparsity. We also extend our method to an integrated framework for the combination of different DNN compression tasks.
Submitted 11 April, 2020;
originally announced April 2020.
-
Adversarial Attack Generation Empowered by Min-Max Optimization
Authors:
Jingkang Wang,
Tianyun Zhang,
Sijia Liu,
Pin-Yu Chen,
Jiacen Xu,
Makan Fardad,
Bo Li
Abstract:
The worst-case training principle that minimizes the maximal adversarial loss, also known as adversarial training (AT), has been shown to be a state-of-the-art approach for enhancing adversarial robustness. Nevertheless, min-max optimization beyond the purpose of AT has not been rigorously explored in the adversarial context. In this paper, we show how a general framework of min-max optimization over multiple domains can be leveraged to advance the design of different types of adversarial attacks. In particular, given a set of risk sources, minimizing the worst-case attack loss can be reformulated as a min-max problem by introducing domain weights that are maximized over the probability simplex of the domain set. We showcase this unified framework in three attack generation problems: attacking model ensembles, devising universal perturbation under multiple inputs, and crafting attacks resilient to data transformations. Extensive experiments demonstrate that our approach leads to substantial attack improvement over the existing heuristic strategies as well as robustness improvement over state-of-the-art defense methods trained to be robust against multiple perturbation types. Furthermore, we find that the self-adjusted domain weights learned from our min-max framework can provide a holistic tool to explain the difficulty level of attack across domains. Code is available at https://github.com/wangjksjtu/minmax-adv.
Submitted 1 November, 2021; v1 submitted 9 June, 2019;
originally announced June 2019.
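The abstract above mentions domain weights that are maximized over the probability simplex. A minimal sketch of one such update is given below, assuming a projected gradient-ascent step on the weighted sum of per-domain losses; the function names, the step size, and the loss values are hypothetical, and the sketch is not taken from the authors' released code. The projection itself is the standard sort-based Euclidean projection onto the simplex.

```python
import numpy as np

def project_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based method)."""
    v = np.asarray(v, dtype=float)
    n = v.size
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u)
    # Largest index rho (0-based) with u[rho] * (rho + 1) > css[rho] - 1
    rho = np.nonzero(u * np.arange(1, n + 1) > (css - 1))[0][-1]
    theta = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - theta, 0.0)

def update_domain_weights(w, losses, step=0.5):
    """One projected ascent step on sum_i w_i * losses_i.

    The gradient with respect to w is simply the loss vector, so domains with
    larger loss receive more weight after the step.
    """
    return project_simplex(w + step * np.asarray(losses, dtype=float))

# Hypothetical per-domain attack losses for three domains
w = np.full(3, 1.0 / 3.0)
losses = np.array([0.2, 1.0, 0.5])
w_new = update_domain_weights(w, losses)
```

In a full min-max loop this weight update would alternate with a descent step on the perturbation; only the weight side is sketched here.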
-
Progressive DNN Compression: A Key to Achieve Ultra-High Weight Pruning and Quantization Rates using ADMM
Authors:
Shaokai Ye,
Xiaoyu Feng,
Tianyun Zhang,
Xiaolong Ma,
Sheng Lin,
Zhengang Li,
Kaidi Xu,
Wujie Wen,
Sijia Liu,
Jian Tang,
Makan Fardad,
Xue Lin,
Yongpan Liu,
Yanzhi Wang
Abstract:
Weight pruning and weight quantization are two important categories of DNN model compression. Prior work on these techniques is mainly based on heuristics. A recent work developed a systematic framework of DNN weight pruning using the advanced optimization technique ADMM (Alternating Direction Methods of Multipliers), achieving one of the state-of-the-art results in weight pruning. In this work, we first extend this one-shot ADMM-based framework to guarantee solution feasibility and provide a fast convergence rate, and generalize it to weight quantization as well. We have further developed a multi-step, progressive DNN weight pruning and quantization framework, with dual benefits of (i) achieving further weight pruning/quantization thanks to the special property of ADMM regularization, and (ii) reducing the search space within each step. Extensive experimental results demonstrate the superior performance compared with prior work. Some highlights: (i) we achieve 246x, 36x, and 8x weight pruning on LeNet-5, AlexNet, and ResNet-50 models, respectively, with (almost) zero accuracy loss; (ii) even a significant 61x weight pruning in AlexNet (ImageNet) results in only minor degradation in actual accuracy compared with prior work; (iii) we are among the first to derive notable weight pruning results for ResNet and MobileNet models; (iv) we derive the first lossless, fully binarized (for all layers) LeNet-5 for MNIST and VGG-16 for CIFAR-10; and (v) we derive the first fully binarized (for all layers) ResNet for ImageNet with reasonable accuracy loss.
Submitted 29 March, 2019; v1 submitted 23 March, 2019;
originally announced March 2019.
-
Progressive Weight Pruning of Deep Neural Networks using ADMM
Authors:
Shaokai Ye,
Tianyun Zhang,
Kaiqi Zhang,
Jiayu Li,
Kaidi Xu,
Yunfei Yang,
Fuxun Yu,
Jian Tang,
Makan Fardad,
Sijia Liu,
Xiang Chen,
Xue Lin,
Yanzhi Wang
Abstract:
Deep neural networks (DNNs), although achieving human-level performance in many domains, have very large model sizes that hinder their broader application on edge computing devices. Extensive research work has been conducted on DNN model compression or pruning. However, most of the previous work took heuristic approaches. This work proposes a progressive weight pruning approach based on ADMM (Alternating Direction Method of Multipliers), a powerful technique to deal with non-convex optimization problems with potentially combinatorial constraints. Motivated by dynamic programming, the proposed method reaches an extremely high pruning rate by using partial prunings with moderate pruning rates. Therefore, it resolves the accuracy degradation and long convergence time problems when pursuing extremely high pruning ratios. It achieves up to 34 times pruning rate for the ImageNet dataset and 167 times pruning rate for the MNIST dataset, significantly higher than those reported in the literature. Under the same number of epochs, the proposed method also achieves faster convergence and higher compression rates. The codes and pruned DNN models are released at the link bit.ly/2zxdlss
Submitted 4 November, 2018; v1 submitted 16 October, 2018;
originally announced October 2018.
-
StructADMM: A Systematic, High-Efficiency Framework of Structured Weight Pruning for DNNs
Authors:
Tianyun Zhang,
Shaokai Ye,
Kaiqi Zhang,
Xiaolong Ma,
Ning Liu,
Linfeng Zhang,
Jian Tang,
Kaisheng Ma,
Xue Lin,
Makan Fardad,
Yanzhi Wang
Abstract:
Weight pruning methods of DNNs have been demonstrated to achieve a good model pruning rate without loss of accuracy, thereby alleviating the significant computation/storage requirements of large-scale DNNs. Structured weight pruning methods have been proposed to overcome the limitation of irregular network structure and demonstrated actual GPU acceleration. However, in prior work the pruning rate (degree of sparsity) and GPU acceleration are limited (to less than 50%) when accuracy needs to be maintained. In this work, we overcome these limitations by proposing a unified, systematic framework of structured weight pruning for DNNs. It is a framework that can be used to induce different types of structured sparsity, such as filter-wise, channel-wise, and shape-wise sparsity, as well as non-structured sparsity. The proposed framework incorporates stochastic gradient descent with ADMM, and can be understood as a dynamic regularization method in which the regularization target is analytically updated in each iteration. Without loss of accuracy on the AlexNet model, we achieve 2.58X and 3.65X average measured speedup on two GPUs, clearly outperforming the prior work. The average speedups reach 3.15X and 8.52X when allowing a moderate accuracy loss of 2%. In this case the model compression for convolutional layers is 15.0X, corresponding to 11.93X measured CPU speedup. Our experiments on ResNet model and on other data sets like UCF101 and CIFAR-10 demonstrate the consistently higher performance of our framework.
Submitted 26 March, 2019; v1 submitted 29 July, 2018;
originally announced July 2018.
-
A Systematic DNN Weight Pruning Framework using Alternating Direction Method of Multipliers
Authors:
Tianyun Zhang,
Shaokai Ye,
Kaiqi Zhang,
Jian Tang,
Wujie Wen,
Makan Fardad,
Yanzhi Wang
Abstract:
Weight pruning methods for deep neural networks (DNNs) have been investigated recently, but prior work in this area is mainly heuristic, iterative pruning, thereby lacking guarantees on the weight reduction ratio and convergence time. To mitigate these limitations, we present a systematic weight pruning framework of DNNs using the alternating direction method of multipliers (ADMM). We first formulate the weight pruning problem of DNNs as a nonconvex optimization problem with combinatorial constraints specifying the sparsity requirements, and then adopt the ADMM framework for systematic weight pruning. By using ADMM, the original nonconvex optimization problem is decomposed into two subproblems that are solved iteratively. One of these subproblems can be solved using stochastic gradient descent; the other can be solved analytically. In addition, our method achieves a fast convergence rate.
The weight pruning results are very promising and consistently outperform the prior work. On the LeNet-5 model for the MNIST data set, we achieve 71.2 times weight reduction without accuracy loss. On the AlexNet model for the ImageNet data set, we achieve 21 times weight reduction without accuracy loss. When we focus on the convolutional layer pruning for computation reductions, we can reduce the total computation by five times compared with the prior work (achieving a total of 13.4 times weight reduction in convolutional layers). Our models and codes are released at https://github.com/KaiqiZhang/admm-pruning
Submitted 25 July, 2018; v1 submitted 9 April, 2018;
originally announced April 2018.
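The two-subproblem split described in the abstract above can be sketched on a toy problem. The sketch assumes a least-squares loss in place of a DNN loss and plain gradient descent in place of stochastic gradient descent; the analytical step is the Euclidean projection onto vectors with at most `k` non-zeros, which keeps the `k` largest-magnitude entries. The names `project_top_k` and `admm_prune` and all parameter values are hypothetical and not taken from the released code.

```python
import numpy as np

def project_top_k(x, k):
    """Euclidean projection onto {z : at most k non-zeros}: keep the k largest magnitudes."""
    z = np.zeros_like(x)
    if k > 0:
        idx = np.argpartition(np.abs(x), -k)[-k:]
        z[idx] = x[idx]
    return z

def admm_prune(A, b, k, rho=1.0, outer_iters=50, inner_steps=20):
    """Toy ADMM for: minimize 0.5 * ||A w - b||^2 subject to w having at most k non-zeros."""
    n = A.shape[1]
    w, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    # Step size 1/L, where L bounds the curvature of the w-subproblem objective
    lr = 1.0 / (np.linalg.norm(A, 2) ** 2 + rho)
    for _ in range(outer_iters):
        # Subproblem 1: gradient steps on loss + (rho/2) * ||w - z + u||^2
        for _ in range(inner_steps):
            grad = A.T @ (A @ w - b) + rho * (w - z + u)
            w -= lr * grad
        # Subproblem 2: closed-form projection onto the sparsity constraint
        z = project_top_k(w + u, k)
        # Scaled dual variable update
        u += w - z
    return w, z

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 10))
w_true = np.zeros(10)
w_true[[1, 4, 7]] = [2.0, -3.0, 1.5]
b = A @ w_true
w, z = admm_prune(A, b, k=3)
```

The returned `z` always satisfies the sparsity constraint exactly, while `w` is the unconstrained iterate that ADMM pulls toward it.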
-
Systematic Weight Pruning of DNNs using Alternating Direction Method of Multipliers
Authors:
Tianyun Zhang,
Shaokai Ye,
Yipeng Zhang,
Yanzhi Wang,
Makan Fardad
Abstract:
We present a systematic weight pruning framework of deep neural networks (DNNs) using the alternating direction method of multipliers (ADMM). We first formulate the weight pruning problem of DNNs as a constrained nonconvex optimization problem, and then adopt the ADMM framework for systematic weight pruning. We show that ADMM is highly suitable for weight pruning due to the computational efficiency it offers. We achieve a much higher compression ratio compared with prior work while maintaining the same test accuracy, together with a faster convergence rate. Our models are released at https://github.com/KaiqiZhang/admm-pruning
Submitted 21 April, 2018; v1 submitted 15 February, 2018;
originally announced February 2018.
-
A Memristor-Based Optimization Framework for AI Applications
Authors:
Sijia Liu,
Yanzhi Wang,
Makan Fardad,
Pramod K. Varshney
Abstract:
Memristors have recently received significant attention as ubiquitous device-level components for building a novel generation of computing systems. These devices have many promising features, such as non-volatility, low power consumption, high density, and excellent scalability. The ability to control and modify biasing voltages at the two terminals of memristors makes them promising candidates to perform matrix-vector multiplications and solve systems of linear equations. In this article, we discuss how networks of memristors arranged in crossbar arrays can be used for efficiently solving optimization and machine learning problems. We introduce a new memristor-based optimization framework that combines the computational merit of memristor crossbars with the advantages of an operator splitting method, alternating direction method of multipliers (ADMM). Here, ADMM helps in splitting a complex optimization problem into subproblems that involve the solution of systems of linear equations. The capability of this framework is shown by applying it to linear programming, quadratic programming, and sparse optimization. In addition to ADMM, implementation of a customized power iteration (PI) method for eigenvalue/eigenvector computation using memristor crossbars is discussed. The memristor-based PI method can further be applied to principal component analysis (PCA). The use of memristor crossbars yields a significant speed-up in computation, and thus, we believe, has the potential to advance optimization and machine learning research in artificial intelligence (AI).
Submitted 18 October, 2017;
originally announced October 2017.
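The power iteration (PI) method mentioned in the abstract above relies only on repeated matrix-vector products, which is the operation a crossbar array is said to perform. A minimal software sketch of textbook power iteration applied to PCA follows; it runs in NumPy rather than on memristor hardware, it is not the customized variant from the article, and the function name, iteration count, and synthetic data are assumptions made for illustration.

```python
import numpy as np

def power_iteration(M, num_iters=500, seed=0):
    """Estimate the dominant eigenvalue and eigenvector of a symmetric matrix M."""
    rng = np.random.default_rng(seed)
    v = rng.normal(size=M.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(num_iters):
        v = M @ v                  # the matrix-vector product is the only heavy step
        v /= np.linalg.norm(v)
    eigenvalue = v @ M @ v         # Rayleigh quotient for the unit vector v
    return eigenvalue, v

# PCA use case: the first principal component is the dominant eigenvector
# of the sample covariance matrix of centered data.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3)) * np.array([3.0, 1.0, 0.3])  # synthetic data
X_centered = X - X.mean(axis=0)
C = X_centered.T @ X_centered / (X.shape[0] - 1)
top_variance, first_component = power_iteration(C)
```

Convergence speed depends on the gap between the two largest eigenvalues, so a fixed iteration count is a simplification; a tolerance-based stopping rule would be the usual refinement.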
-
Optimized Sensor Collaboration for Estimation of Temporally Correlated Parameters
Authors:
Sijia Liu,
Swarnendu Kar,
Makan Fardad,
Pramod K. Varshney
Abstract:
In this paper, we aim to design the optimal sensor collaboration strategy for the estimation of time-varying parameters, where collaboration refers to the act of sharing measurements with neighboring sensors prior to transmission to a fusion center. We begin by addressing the sensor collaboration problem for the estimation of uncorrelated parameters. We show that the resulting collaboration problem can be transformed into a special nonconvex optimization problem, where a difference of convex functions carries all the nonconvexity. This specific problem structure enables the use of a convex-concave procedure to obtain a near-optimal solution. When the parameters of interest are temporally correlated, a penalized version of the convex-concave procedure becomes well suited for designing the optimal collaboration scheme. In order to improve computational efficiency, we further propose a fast algorithm that scales gracefully with problem size via the alternating direction method of multipliers. Numerical results are provided to demonstrate the effectiveness of our approach and the impact of parameter correlation and temporal dynamics of sensor networks on estimation performance.
Submitted 25 August, 2016; v1 submitted 10 March, 2016;
originally announced March 2016.
-
Sparsity-Aware Sensor Collaboration for Linear Coherent Estimation
Authors:
Sijia Liu,
Swarnendu Kar,
Makan Fardad,
Pramod K. Varshney
Abstract:
In the context of distributed estimation, we consider the problem of sensor collaboration, which refers to the act of sharing measurements with neighboring sensors prior to transmission to a fusion center. While incorporating the cost of sensor collaboration, we aim to find optimal sparse collaboration schemes subject to a certain information or energy constraint. Two types of sensor collaboration problems are studied: minimum energy with an information constraint; and maximum information with an energy constraint. To solve the resulting sensor collaboration problems, we present tractable optimization formulations and propose efficient methods which render near-optimal solutions in numerical experiments. We also explore the situation in which there is a cost associated with the involvement of each sensor in the estimation scheme. In such situations, the participating sensors must be chosen judiciously. We introduce a unified framework to jointly design the optimal sensor selection and collaboration schemes. For a given estimation performance, we show empirically that there exists a trade-off between sensor selection and sensor collaboration.
Submitted 5 February, 2015; v1 submitted 27 August, 2014;
originally announced August 2014.
-
Algorithms for leader selection in stochastically forced consensus networks
Authors:
Fu Lin,
Makan Fardad,
Mihailo R. Jovanović
Abstract:
We are interested in assigning a pre-specified number of nodes as leaders in order to minimize the mean-square deviation from consensus in stochastically forced networks. This problem arises in several applications including control of vehicular formations and localization in sensor networks. For networks with leaders subject to noise, we show that the Boolean constraints (a node is either a leader or it is not) are the only source of nonconvexity. By relaxing these constraints to their convex hull we obtain a lower bound on the global optimal value. We also use a simple but efficient greedy algorithm to identify leaders and to compute an upper bound. For networks with leaders that perfectly follow their desired trajectories, we identify an additional source of nonconvexity in the form of a rank constraint. Removal of the rank constraint and relaxation of the Boolean constraints yields a semidefinite program for which we develop a customized algorithm well-suited for large networks. Several examples ranging from regular lattices to random graphs are provided to illustrate the effectiveness of the developed algorithms.
Submitted 29 May, 2013; v1 submitted 2 February, 2013;
originally announced February 2013.
-
Optimal Control of Vehicular Formations with Nearest Neighbor Interactions
Authors:
Fu Lin,
Makan Fardad,
Mihailo R. Jovanović
Abstract:
We consider the design of optimal localized feedback gains for one-dimensional formations in which vehicles only use information from their immediate neighbors. The control objective is to enhance coherence of the formation by making it behave like a rigid lattice. For the single-integrator model with symmetric gains, we establish convexity, implying that the globally optimal controller can be computed efficiently. We also identify a class of convex problems for double-integrators by restricting the controller to symmetric position and uniform diagonal velocity gains. To obtain the optimal non-symmetric gains for both the single- and the double-integrator models, we solve a parameterized family of optimal control problems ranging from an easily solvable problem to the problem of interest as the underlying parameter increases. When this parameter is kept small, we employ perturbation analysis to decouple the matrix equations that result from the optimality conditions, thereby rendering the unique optimal feedback gain. This solution is used to initialize a homotopy-based Newton's method to find the optimal localized gain. To investigate the performance of localized controllers, we examine how the coherence of large-scale stochastically forced formations scales with the number of vehicles. We establish several explicit scaling relationships and show that the best performance is achieved by a localized controller that is both non-symmetric and spatially-varying.
Submitted 17 December, 2011;
originally announced December 2011.