Search | arXiv e-print repository

A Complete Set of Quadratic Constraints for Repeated ReLU and Generalizations

Authors: Sahel Vahedi Noori, Bin Hu, Geir Dullerud, Peter Seiler

Abstract: This paper derives a complete set of quadratic constraints (QCs) for the repeated ReLU. The complete set of QCs is described by a collection of matrix copositivity conditions. We also show that only two functions satisfy all QCs in our complete set: the repeated ReLU and flipped ReLU. Thus our complete set of QCs bounds the repeated ReLU as tight as possible up to the sign invariance inherent in q… ▽ More This paper derives a complete set of quadratic constraints (QCs) for the repeated ReLU. The complete set of QCs is described by a collection of matrix copositivity conditions. We also show that only two functions satisfy all QCs in our complete set: the repeated ReLU and flipped ReLU. Thus our complete set of QCs bounds the repeated ReLU as tight as possible up to the sign invariance inherent in quadratic forms. We derive a similar complete set of incremental QCs for repeated ReLU, which can potentially lead to less conservative Lipschitz bounds for ReLU networks than the standard LipSDP approach. The basic constructions are also used to derive the complete sets of QCs for other piecewise linear activation functions such as leaky ReLU, MaxMin, and HouseHolder. Finally, we illustrate the use of the complete set of QCs to assess stability and performance for recurrent neural networks with ReLU activation functions. We rely on a standard copositivity relaxation to formulate the stability/performance condition as a semidefinite program. Simple examples are provided to illustrate that the complete sets of QCs and incremental QCs can yield less conservative bounds than existing sets. △ Less

Submitted 22 August, 2024; v1 submitted 9 July, 2024; originally announced July 2024.

arXiv:2405.05236 [pdf, ps, other]

Stability and Performance Analysis of Discrete-Time ReLU Recurrent Neural Networks

Authors: Sahel Vahedi Noori, Bin Hu, Geir Dullerud, Peter Seiler

Abstract: This paper presents sufficient conditions for the stability and $\ell_2$-gain performance of recurrent neural networks (RNNs) with ReLU activation functions. These conditions are derived by combining Lyapunov/dissipativity theory with Quadratic Constraints (QCs) satisfied by repeated ReLUs. We write a general class of QCs for repeated RELUs using known properties for the scalar ReLU. Our stability… ▽ More This paper presents sufficient conditions for the stability and $\ell_2$-gain performance of recurrent neural networks (RNNs) with ReLU activation functions. These conditions are derived by combining Lyapunov/dissipativity theory with Quadratic Constraints (QCs) satisfied by repeated ReLUs. We write a general class of QCs for repeated RELUs using known properties for the scalar ReLU. Our stability and performance condition uses these QCs along with a "lifted" representation for the ReLU RNN. We show that the positive homogeneity property satisfied by a scalar ReLU does not expand the class of QCs for the repeated ReLU. We present examples to demonstrate the stability / performance condition and study the effect of the lifting horizon. △ Less

Submitted 14 May, 2024; v1 submitted 8 May, 2024; originally announced May 2024.

arXiv:2404.07373 [pdf, other]

Synthesizing Neural Network Controllers with Closed-Loop Dissipativity Guarantees

Authors: Neelay Junnarkar, Murat Arcak, Peter Seiler

Abstract: In this paper, a method is presented to synthesize neural network controllers such that the feedback system of plant and controller is dissipative, certifying performance requirements such as L2 gain bounds. The class of plants considered is that of linear time-invariant (LTI) systems interconnected with an uncertainty, including nonlinearities treated as an uncertainty for convenience of analysis… ▽ More In this paper, a method is presented to synthesize neural network controllers such that the feedback system of plant and controller is dissipative, certifying performance requirements such as L2 gain bounds. The class of plants considered is that of linear time-invariant (LTI) systems interconnected with an uncertainty, including nonlinearities treated as an uncertainty for convenience of analysis. The uncertainty of the plant and the nonlinearities of the neural network are both described using integral quadratic constraints (IQCs). First, a dissipativity condition is derived for uncertain LTI systems. Second, this condition is used to construct a linear matrix inequality (LMI) which can be used to synthesize neural network controllers. Finally, this convex condition is used in a projection-based training method to synthesize neural network controllers with dissipativity guarantees. Numerical examples on an inverted pendulum and a flexible rod on a cart are provided to demonstrate the effectiveness of this approach. △ Less

Submitted 10 April, 2024; originally announced April 2024.

Comments: Submitted to the journal Automatica, 14 pages, 7 figures

arXiv:2404.03647 [pdf, other]

Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra

Authors: Darioush Kevian, Usman Syed, Xingang Guo, Aaron Havens, Geir Dullerud, Peter Seiler, Lianhui Qin, Bin Hu

Abstract: In this paper, we explore the capabilities of state-of-the-art large language models (LLMs) such as GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra in solving undergraduate-level control problems. Controls provides an interesting case study for LLM reasoning due to its combination of mathematical theory and engineering design. We introduce ControlBench, a benchmark dataset tailored to reflect the bread… ▽ More In this paper, we explore the capabilities of state-of-the-art large language models (LLMs) such as GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra in solving undergraduate-level control problems. Controls provides an interesting case study for LLM reasoning due to its combination of mathematical theory and engineering design. We introduce ControlBench, a benchmark dataset tailored to reflect the breadth, depth, and complexity of classical control design. We use this dataset to study and evaluate the problem-solving abilities of these LLMs in the context of control engineering. We present evaluations conducted by a panel of human experts, providing insights into the accuracy, reasoning, and explanatory prowess of LLMs in control engineering. Our analysis reveals the strengths and limitations of each LLM in the context of classical control, and our results imply that Claude 3 Opus has become the state-of-the-art LLM for solving undergraduate control problems. Our study serves as an initial step towards the broader goal of employing artificial general intelligence in control engineering. △ Less

Submitted 4 April, 2024; originally announced April 2024.

arXiv:2402.11654 [pdf, other]

Model-Free $μ$-Synthesis: A Nonsmooth Optimization Perspective

Authors: Darioush Keivan, Xingang Guo, Peter Seiler, Geir Dullerud, Bin Hu

Abstract: In this paper, we revisit model-free policy search on an important robust control benchmark, namely $μ$-synthesis. In the general output-feedback setting, there do not exist convex formulations for this problem, and hence global optimality guarantees are not expected. Apkarian (2011) presented a nonconvex nonsmooth policy optimization approach for this problem, and achieved state-of-the-art design… ▽ More In this paper, we revisit model-free policy search on an important robust control benchmark, namely $μ$-synthesis. In the general output-feedback setting, there do not exist convex formulations for this problem, and hence global optimality guarantees are not expected. Apkarian (2011) presented a nonconvex nonsmooth policy optimization approach for this problem, and achieved state-of-the-art design results via using subgradient-based policy search algorithms which generate update directions in a model-based manner. Despite the lack of convexity and global optimality guarantees, these subgradient-based policy search methods have led to impressive numerical results in practice. Built upon such a policy optimization persepctive, our paper extends these subgradient-based search methods to a model-free setting. Specifically, we examine the effectiveness of two model-free policy optimization strategies: the model-free non-derivative sampling method and the zeroth-order policy search with uniform smoothing. We performed an extensive numerical study to demonstrate that both methods consistently replicate the design outcomes achieved by their model-based counterparts. Additionally, we provide some theoretical justifications showing that convergence guarantees to stationary points can be established for our model-free $μ$-synthesis under some assumptions related to the coerciveness of the cost function. Overall, our results demonstrate that derivative-free policy optimization offers a competitive and viable approach for solving general output-feedback $μ$-synthesis problems in the model-free setting. △ Less

Submitted 18 February, 2024; originally announced February 2024.

Comments: Submitted to L4DC 2024

arXiv:2204.00122 [pdf, other]

doi 10.1109/CDC51059.2022.9992684

Synthesis of Stabilizing Recurrent Equilibrium Network Controllers

Authors: Neelay Junnarkar, He Yin, Fangda Gu, Murat Arcak, Peter Seiler

Abstract: We propose a parameterization of a nonlinear dynamic controller based on the recurrent equilibrium network, a generalization of the recurrent neural network. We derive constraints on the parameterization under which the controller guarantees exponential stability of a partially observed dynamical system with sector bounded nonlinearities. Finally, we present a method to synthesize this controller… ▽ More We propose a parameterization of a nonlinear dynamic controller based on the recurrent equilibrium network, a generalization of the recurrent neural network. We derive constraints on the parameterization under which the controller guarantees exponential stability of a partially observed dynamical system with sector bounded nonlinearities. Finally, we present a method to synthesize this controller using projected policy gradient methods to maximize a reward function with arbitrary structure. The projection step involves the solution of convex optimization problems. We demonstrate the proposed method with simulated examples of controlling nonlinear plants, including plants modeled with neural networks. △ Less

Submitted 12 September, 2022; v1 submitted 31 March, 2022; originally announced April 2022.

Comments: Submitted to IEEE CDC 2022. arXiv admin note: text overlap with arXiv:2109.03861

arXiv:2203.01910 [pdf, other]

doi 10.1109/LCSYS.2022.3183650

Efficient Data Structures for Exploiting Sparsity and Structure in Representation of Polynomial Optimization Problems: Implementation in SOSTOOLS

Authors: Declan Jagt, Sachin Shivakumar, Peter Seiler, Matthew Peet

Abstract: We present a new data structure for representation of polynomial variables in the parsing of sum-of-squares (SOS) programs. In SOS programs, the variables $s(x;Q)$ are polynomial in the independent variables $x$, but linear in the decision variables $Q$. Current SOS parsers, however, fail to exploit the semi-linear structure of the polynomial variables, treating the decision variables as independe… ▽ More We present a new data structure for representation of polynomial variables in the parsing of sum-of-squares (SOS) programs. In SOS programs, the variables $s(x;Q)$ are polynomial in the independent variables $x$, but linear in the decision variables $Q$. Current SOS parsers, however, fail to exploit the semi-linear structure of the polynomial variables, treating the decision variables as independent variables in their representation. This results in unnecessary overhead in storage and manipulation of the polynomial variables, prohibiting the parser from addressing larger-scale optimization problems. To eliminate this computational overhead, we introduce a new representation of polynomial variables, the "dpvar" structure, that is affine in the decision variables. We show that the complexity of operations on variables in the dpvar representation scales favorably with the number of decision variables. We further show that the required memory for storing polynomial variables is relatively small using the dpvar structure, particularly when exploiting the MATLAB sparse storage structure. Finally, we incorporate the dpvar data structure into SOSTOOLS 4.00, and test the performance of the parser for several polynomial optimization problems. △ Less

Submitted 2 September, 2022; v1 submitted 3 March, 2022; originally announced March 2022.

Journal ref: IEEE Control Systems Letters, vol. 6, pp. 3493-3498, 2022

arXiv:2201.00801 [pdf, other]

Revisiting PGD Attacks for Stability Analysis of Large-Scale Nonlinear Systems and Perception-Based Control

Authors: Aaron Havens, Darioush Keivan, Peter Seiler, Geir Dullerud, Bin Hu

Abstract: Many existing region-of-attraction (ROA) analysis tools find difficulty in addressing feedback systems with large-scale neural network (NN) policies and/or high-dimensional sensing modalities such as cameras. In this paper, we tailor the projected gradient descent (PGD) attack method developed in the adversarial learning community as a general-purpose ROA analysis tool for large-scale nonlinear sy… ▽ More Many existing region-of-attraction (ROA) analysis tools find difficulty in addressing feedback systems with large-scale neural network (NN) policies and/or high-dimensional sensing modalities such as cameras. In this paper, we tailor the projected gradient descent (PGD) attack method developed in the adversarial learning community as a general-purpose ROA analysis tool for large-scale nonlinear systems and end-to-end perception-based control. We show that the ROA analysis can be approximated as a constrained maximization problem whose goal is to find the worst-case initial condition which shifts the terminal state the most. Then we present two PGD-based iterative methods which can be used to solve the resultant constrained maximization problem. Our analysis is not based on Lyapunov theory, and hence requires minimum information of the problem structures. In the model-based setting, we show that the PGD updates can be efficiently performed using back-propagation. In the model-free setting (which is more relevant to ROA analysis of perception-based control), we propose a finite-difference PGD estimate which is general and only requires a black-box simulator for generating the trajectories of the closed-loop system given any initial state. We demonstrate the scalability and generality of our analysis tool on several numerical examples with large-scale NN policies and high-dimensional image observations. We believe that our proposed analysis serves as a meaningful initial step toward further understanding of closed-loop stability of large-scale nonlinear systems and perception-based control. △ Less

Submitted 3 January, 2022; originally announced January 2022.

Comments: Submitted to L4DC 2022

arXiv:2111.15537 [pdf, other]

Model-Free $μ$ Synthesis via Adversarial Reinforcement Learning

Authors: Darioush Keivan, Aaron Havens, Peter Seiler, Geir Dullerud, Bin Hu

Abstract: Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely $μ$ synthesis. We build a connection between robust adversarial RL and… ▽ More Motivated by the recent empirical success of policy-based reinforcement learning (RL), there has been a research trend studying the performance of policy-based RL methods on standard control benchmark problems. In this paper, we examine the effectiveness of policy-based RL methods on an important robust control problem, namely $μ$ synthesis. We build a connection between robust adversarial RL and $μ$ synthesis, and develop a model-free version of the well-known $DK$-iteration for solving state-feedback $μ$ synthesis with static $D$-scaling. In the proposed algorithm, the $K$ step mimics the classical central path algorithm via incorporating a recently-developed double-loop adversarial RL method as a subroutine, and the $D$ step is based on model-free finite difference approximation. Extensive numerical study is also presented to demonstrate the utility of our proposed model-free algorithm. Our study sheds new light on the connections between adversarial RL and robust control. △ Less

Submitted 8 June, 2022; v1 submitted 30 November, 2021; originally announced November 2021.

Comments: Accepted to ACC 2022

arXiv:2111.12906 [pdf, ps, other]

Robustness against Adversarial Attacks in Neural Networks using Incremental Dissipativity

Authors: Bernardo Aquino, Arash Rahnama, Peter Seiler, Lizhen Lin, Vijay Gupta

Abstract: Adversarial examples can easily degrade the classification performance in neural networks. Empirical methods for promoting robustness to such examples have been proposed, but often lack both analytical insights and formal guarantees. Recently, some robustness certificates have appeared in the literature based on system theoretic notions. This work proposes an incremental dissipativity-based robust… ▽ More Adversarial examples can easily degrade the classification performance in neural networks. Empirical methods for promoting robustness to such examples have been proposed, but often lack both analytical insights and formal guarantees. Recently, some robustness certificates have appeared in the literature based on system theoretic notions. This work proposes an incremental dissipativity-based robustness certificate for neural networks in the form of a linear matrix inequality for each layer. We also propose an equivalent spectral norm bound for this certificate which is scalable to neural networks with multiple layers. We demonstrate the improved performance against adversarial attacks on a feed-forward neural network trained on MNIST and an Alexnet trained using CIFAR-10. △ Less

Submitted 13 February, 2022; v1 submitted 24 November, 2021; originally announced November 2021.

arXiv:2109.03861 [pdf, other]

Recurrent Neural Network Controllers Synthesis with Stability Guarantees for Partially Observed Systems

Authors: Fangda Gu, He Yin, Laurent El Ghaoui, Murat Arcak, Peter Seiler, Ming Jin

Abstract: Neural network controllers have become popular in control tasks thanks to their flexibility and expressivity. Stability is a crucial property for safety-critical dynamical systems, while stabilization of partially observed systems, in many cases, requires controllers to retain and process long-term memories of the past. We consider the important class of recurrent neural networks (RNN) as dynamic… ▽ More Neural network controllers have become popular in control tasks thanks to their flexibility and expressivity. Stability is a crucial property for safety-critical dynamical systems, while stabilization of partially observed systems, in many cases, requires controllers to retain and process long-term memories of the past. We consider the important class of recurrent neural networks (RNN) as dynamic controllers for nonlinear uncertain partially-observed systems, and derive convex stability conditions based on integral quadratic constraints, S-lemma and sequential convexification. To ensure stability during the learning and control process, we propose a projected policy gradient method that iteratively enforces the stability conditions in the reparametrized space taking advantage of mild additional information on system dynamics. Numerical experiments show that our method learns stabilizing controllers while using fewer samples and achieving higher final performance compared with policy gradient. △ Less

Submitted 7 December, 2021; v1 submitted 8 September, 2021; originally announced September 2021.

arXiv:2011.01893 [pdf, other]

Iterative Best Response for Multi-Body Asset-Guarding Games

Authors: Emmanuel Sin, Murat Arcak, Douglas Philbrick, Peter Seiler

Abstract: We present a numerical approach to finding optimal trajectories for players in a multi-body, asset-guarding game with nonlinear dynamics and non-convex constraints. Using the Iterative Best Response (IBR) scheme, we solve for each player's optimal strategy assuming the other players' trajectories are known and fixed. Leveraging recent advances in Sequential Convex Programming (SCP), we use SCP as… ▽ More We present a numerical approach to finding optimal trajectories for players in a multi-body, asset-guarding game with nonlinear dynamics and non-convex constraints. Using the Iterative Best Response (IBR) scheme, we solve for each player's optimal strategy assuming the other players' trajectories are known and fixed. Leveraging recent advances in Sequential Convex Programming (SCP), we use SCP as a subroutine within the IBR algorithm to efficiently solve an approximation of each player's constrained trajectory optimization problem. We apply the approach to an asset-guarding game example involving multiple pursuers and a single evader (i.e., n-versus-1 engagements). Resulting evader trajectories are tested in simulation to verify successful evasion against pursuers using conventional intercept guidance laws. △ Less

Submitted 3 November, 2020; originally announced November 2020.

arXiv:2005.12226 [pdf, other]

Optimal assignment of collaborating agents in multi-body asset-guarding games

Authors: Emmanuel Sin, Murat Arcak, Andrew Packard, Douglas Philbrick, Peter Seiler

Abstract: We study a multi-body asset-guarding game in missile defense where teams of interceptor missiles collaborate to defend a non-manuevering asset against a group of threat missiles. We approach the problem in two steps. We first formulate an assignment problem where we optimally assign subsets of collaborating interceptors to each threat so that all threats are intercepted as far away from the asset… ▽ More We study a multi-body asset-guarding game in missile defense where teams of interceptor missiles collaborate to defend a non-manuevering asset against a group of threat missiles. We approach the problem in two steps. We first formulate an assignment problem where we optimally assign subsets of collaborating interceptors to each threat so that all threats are intercepted as far away from the asset as possible. We assume that each interceptor is controlled by a collaborative guidance law derived from linear quadratic dynamic games. Our results include a 6-DOF simulation of a 5-interceptor versus 3-threat missile engagement where each agent is modeled as a missile airframe controlled by an autopilot. Despite the assumption of linear dynamics in our collaborative guidance law and the unmodeled dynamics in the simulation environment (e.g., varying density and gravity), we show that the simulated trajectories match well with those predicted by our approach. Furthermore, we show that a more agile threat, with greater speed and acceleration, can be intercepted by inferior interceptors when they collaborate. We believe the concepts introduced in this paper may be applied in asymmetric missile defense scenarios, including defense against advanced cruise missiles and hypersonic vehicles. △ Less

Submitted 25 May, 2020; originally announced May 2020.

arXiv:2001.09467 [pdf, other]

Tractable Reinforcement Learning of Signal Temporal Logic Objectives

Authors: Harish Venkataraman, Derya Aksaray, Peter Seiler

Abstract: Signal temporal logic (STL) is an expressive language to specify time-bound real-world robotic tasks and safety specifications. Recently, there has been an interest in learning optimal policies to satisfy STL specifications via reinforcement learning (RL). Learning to satisfy STL specifications often needs a sufficient length of state history to compute reward and the next action. The need for his… ▽ More Signal temporal logic (STL) is an expressive language to specify time-bound real-world robotic tasks and safety specifications. Recently, there has been an interest in learning optimal policies to satisfy STL specifications via reinforcement learning (RL). Learning to satisfy STL specifications often needs a sufficient length of state history to compute reward and the next action. The need for history results in exponential state-space growth for the learning problem. Thus the learning problem becomes computationally intractable for most real-world applications. In this paper, we propose a compact means to capture state history in a new augmented state-space representation. An approximation to the objective (maximizing probability of satisfaction) is proposed and solved for in the new augmented state-space. We show the performance bound of the approximate solution and compare it with the solution of an existing technique via simulations. △ Less

Submitted 17 February, 2020; v1 submitted 26 January, 2020; originally announced January 2020.

Comments: Github code repository: https://github.com/kumaa001/Tractable_RL_for_STL_Objectives. arXiv admin note: text overlap with arXiv:1609.07409

arXiv:1310.4716 [pdf, ps, other]

SOSTOOLS Version 4.00 Sum of Squares Optimization Toolbox for MATLAB

Authors: Antonis Papachristodoulou, James Anderson, Giorgio Valmorbida, Stephen Prajna, Pete Seiler, Pablo Parrilo, Matthew M. Peet, Declan Jagt

Abstract: The release of SOSTOOLS v4.00 comes as we approach the 20th anniversary of the original release of SOSTOOLS v1.00 back in April, 2002. SOSTOOLS was originally envisioned as a flexible tool for parsing and solving polynomial optimization problems, using the SOS tightening of polynomial positivity constraints, and capable of adapting to the ever-evolving fauna of applications of SOS. There are now a… ▽ More The release of SOSTOOLS v4.00 comes as we approach the 20th anniversary of the original release of SOSTOOLS v1.00 back in April, 2002. SOSTOOLS was originally envisioned as a flexible tool for parsing and solving polynomial optimization problems, using the SOS tightening of polynomial positivity constraints, and capable of adapting to the ever-evolving fauna of applications of SOS. There are now a variety of SOS programming parsers beyond SOSTOOLS, including YALMIP, Gloptipoly, SumOfSquares, and others. We hope SOSTOOLS remains the most intuitive, robust and adaptable toolbox for SOS programming. Recent progress in Semidefinite programming has opened up new possibilities for solving large Sum of Squares programming problems, and we hope the next decade will be one where SOS methods will find wide application in different areas. In SOSTOOLS v4.00, we implement a parsing approach that reduces the computational and memory requirements of the parser below that of the SDP solver itself. We have re-developed the internal structure of our polynomial decision variables. Specifically, polynomial and SOS variable declarations made using sossosvar, sospolyvar, sosmatrixvar, etc now return a new polynomial structure, dpvar. This new polynomial structure, is documented in the enclosed dpvar guide, and isolates the scalar SDP decision variables in the SOS program from the independent variables used to construct the SOS program. As a result, the complexity of the parser scales almost linearly in the number of decision variables. As a result of these changes, almost all users will notice a significant increase in speed, with large-scaleproblems experiencing the most dramatic speedups. Parsing time is now always less than 10% of time spent in the SDP solver. Finally, SOSTOOLS now provides support for the MOSEK solver interface as well as the SeDuMi, SDPT3, CSDP, SDPNAL, SDPNAL+, and SDPA solvers. △ Less

Submitted 27 December, 2021; v1 submitted 17 October, 2013; originally announced October 2013.

Comments: 64 pages, 3 figures, "software available from http://sysos.eng.ox.ac.uk/sostools/ "

Showing 1–15 of 15 results for author: Seiler, P