-
Learning Temporal Logic Predicates from Data with Statistical Guarantees
Authors:
Emi Soroka,
Rohan Sinha,
Sanjay Lall
Abstract:
Temporal logic rules are often used in control and robotics to provide structured, human-interpretable descriptions of high-dimensional trajectory data. These rules have numerous applications including safety validation using formal methods, constraining motion planning among autonomous agents, and classifying data. However, existing methods for learning temporal logic predicates from data provide…
▽ More
Temporal logic rules are often used in control and robotics to provide structured, human-interpretable descriptions of high-dimensional trajectory data. These rules have numerous applications including safety validation using formal methods, constraining motion planning among autonomous agents, and classifying data. However, existing methods for learning temporal logic predicates from data provide no assurances about the correctness of the resulting predicate. We present a novel method to learn temporal logic predicates from data with finite-sample correctness guarantees. Our approach leverages expression optimization and conformal prediction to learn predicates that correctly describe future trajectories under mild assumptions with a user-defined confidence level. We provide experimental results showing the performance of our approach on a simulated trajectory dataset and perform ablation studies to understand how each component of our algorithm contributes to its performance.
△ Less
Submitted 14 June, 2024;
originally announced June 2024.
-
Adversarial Training of Two-Layer Polynomial and ReLU Activation Networks via Convex Optimization
Authors:
Daniel Kuelbs,
Sanjay Lall,
Mert Pilanci
Abstract:
Training neural networks which are robust to adversarial attacks remains an important problem in deep learning, especially as heavily overparameterized models are adopted in safety-critical settings. Drawing from recent work which reformulates the training problems for two-layer ReLU and polynomial activation networks as convex programs, we devise a convex semidefinite program (SDP) for adversaria…
▽ More
Training neural networks which are robust to adversarial attacks remains an important problem in deep learning, especially as heavily overparameterized models are adopted in safety-critical settings. Drawing from recent work which reformulates the training problems for two-layer ReLU and polynomial activation networks as convex programs, we devise a convex semidefinite program (SDP) for adversarial training of polynomial activation networks via the S-procedure. We also derive a convex SDP to compute the minimum distance from a correctly classified example to the decision boundary of a polynomial activation network. Adversarial training for two-layer ReLU activation networks has been explored in the literature, but, in contrast to prior work, we present a scalable approach which is compatible with standard machine libraries and GPU acceleration. The adversarial training SDP for polynomial activation networks leads to large increases in robust test accuracy against $\ell^\infty$ attacks on the Breast Cancer Wisconsin dataset from the UCI Machine Learning Repository. For two-layer ReLU networks, we leverage our scalable implementation to retrain the final two fully connected layers of a Pre-Activation ResNet-18 model on the CIFAR-10 dataset. Our 'robustified' model achieves higher clean and robust test accuracies than the same architecture trained with sharpness-aware minimization.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context
Authors:
Gemini Team,
Petko Georgiev,
Ving Ian Lei,
Ryan Burnell,
Libin Bai,
Anmol Gulati,
Garrett Tanzer,
Damien Vincent,
Zhufeng Pan,
Shibo Wang,
Soroosh Mariooryad,
Yifan Ding,
Xinyang Geng,
Fred Alcober,
Roy Frostig,
Mark Omernick,
Lexi Walker,
Cosmin Paduraru,
Christina Sorokin,
Andrea Tacchetti,
Colin Gaffney,
Samira Daruki,
Olcan Sercinoglu,
Zach Gleicher,
Juliette Love
, et al. (1110 additional authors not shown)
Abstract:
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February…
▽ More
In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content.
△ Less
Submitted 8 August, 2024; v1 submitted 8 March, 2024;
originally announced March 2024.
-
Logical Synchrony Networks: A formal model for deterministic distribution
Authors:
Logan Kenwright,
Partha Roop,
Nathan Allen,
Sanjay Lall,
Calin Cascaval,
Tammo Spalink,
Martin Izzard
Abstract:
Kahn Process Networks (KPNs) are a deterministic Model of Computation (MoC) for distributed systems. KPNs supports non-blocking writes and blocking reads, with the consequent assumption of unbounded buffers between processes. Variants such as Finite FIFO Platforms (FFP) have been developed, which enforce boundedness. One issue with existing models is that they mix process synchronisation with proc…
▽ More
Kahn Process Networks (KPNs) are a deterministic Model of Computation (MoC) for distributed systems. KPNs supports non-blocking writes and blocking reads, with the consequent assumption of unbounded buffers between processes. Variants such as Finite FIFO Platforms (FFP) have been developed, which enforce boundedness. One issue with existing models is that they mix process synchronisation with process execution. In this paper we address how these two facets may be decoupled.
This paper explores a recent alternative called bittide, which decouples the execution of a process from the control needed for process synchronisation, and thus preserves determinism and boundedness while ensuring pipelined execution for better throughput. Our intuition is that such an approach could leverage not only determinism and buffer boundedness but may potentially offer better overall throughput.
To understand the behavior of these systems we define a formal model -- a deterministic MoC called Logical Synchrony Networks (LSNs). LSNs describes a network of processes modelled as a graph, with edges representing invariant logical delays between a producer process and the corresponding consumer process. We show that this abstraction is satisfied by KPNs. Subsequently, we show that both FFPs and bittide faithfully implement this abstraction. Thus, we show for the first time that FFPs and bittide offer two alternative ways of implementing deterministic distributed systems with the latter being more performant.
△ Less
Submitted 5 June, 2024; v1 submitted 12 February, 2024;
originally announced February 2024.
-
Gemini: A Family of Highly Capable Multimodal Models
Authors:
Gemini Team,
Rohan Anil,
Sebastian Borgeaud,
Jean-Baptiste Alayrac,
Jiahui Yu,
Radu Soricut,
Johan Schalkwyk,
Andrew M. Dai,
Anja Hauth,
Katie Millican,
David Silver,
Melvin Johnson,
Ioannis Antonoglou,
Julian Schrittwieser,
Amelia Glaese,
Jilin Chen,
Emily Pitler,
Timothy Lillicrap,
Angeliki Lazaridou,
Orhan Firat,
James Molloy,
Michael Isard,
Paul R. Barham,
Tom Hennigan,
Benjamin Lee
, et al. (1325 additional authors not shown)
Abstract:
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr…
▽ More
This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI.
△ Less
Submitted 17 June, 2024; v1 submitted 18 December, 2023;
originally announced December 2023.
-
Markov Decision Processes with Noisy State Observation
Authors:
Amirhossein Afsharrad,
Sanjay Lall
Abstract:
This paper addresses the challenge of a particular class of noisy state observations in Markov Decision Processes (MDPs), a common issue in various real-world applications. We focus on modeling this uncertainty through a confusion matrix that captures the probabilities of misidentifying the true state. Our primary goal is to estimate the inherent measurement noise, and to this end, we propose two…
▽ More
This paper addresses the challenge of a particular class of noisy state observations in Markov Decision Processes (MDPs), a common issue in various real-world applications. We focus on modeling this uncertainty through a confusion matrix that captures the probabilities of misidentifying the true state. Our primary goal is to estimate the inherent measurement noise, and to this end, we propose two novel algorithmic approaches. The first, the method of second-order repetitive actions, is designed for efficient noise estimation within a finite time window, providing identifiable conditions for system analysis. The second approach comprises a family of Bayesian algorithms, which we thoroughly analyze and compare in terms of performance and limitations. We substantiate our theoretical findings with simulations, demonstrating the effectiveness of our methods in different scenarios, particularly highlighting their behavior in environments with varying stationary distributions. Our work advances the understanding of reinforcement learning in noisy environments, offering robust techniques for more accurate state estimation in MDPs.
△ Less
Submitted 13 December, 2023;
originally announced December 2023.
-
Convex Methods for Constrained Linear Bandits
Authors:
Amirhossein Afsharrad,
Ahmadreza Moradipari,
Sanjay Lall
Abstract:
Recently, bandit optimization has received significant attention in real-world safety-critical systems that involve repeated interactions with humans. While there exist various algorithms with performance guarantees in the literature, practical implementation of the algorithms has not received as much attention. This work presents a comprehensive study on the computational aspects of safe bandit a…
▽ More
Recently, bandit optimization has received significant attention in real-world safety-critical systems that involve repeated interactions with humans. While there exist various algorithms with performance guarantees in the literature, practical implementation of the algorithms has not received as much attention. This work presents a comprehensive study on the computational aspects of safe bandit algorithms, specifically safe linear bandits, by introducing a framework that leverages convex programming tools to create computationally efficient policies. In particular, we first characterize the properties of the optimal policy for safe linear bandit problem and then propose an end-to-end pipeline of safe linear bandit algorithms that only involves solving convex problems. We also numerically evaluate the performance of our proposed methods.
△ Less
Submitted 9 November, 2023; v1 submitted 7 November, 2023;
originally announced November 2023.
-
Satisfiability.jl: Satisfiability Modulo Theories in Julia
Authors:
Emiko Soroka,
Mykel J. Kochenderfer,
Sanjay Lall
Abstract:
Satisfiability modulo theories (SMT) is a core tool in formal verification. While the SMT-LIB specification language can be used to interact with theorem proving software, a high-level interface allows for faster and easier specifications of complex SMT formulae. In this paper we present a novel open-source package for interacting with SMT-LIB compliant solvers in the Julia programming language.
Satisfiability modulo theories (SMT) is a core tool in formal verification. While the SMT-LIB specification language can be used to interact with theorem proving software, a high-level interface allows for faster and easier specifications of complex SMT formulae. In this paper we present a novel open-source package for interacting with SMT-LIB compliant solvers in the Julia programming language.
△ Less
Submitted 15 December, 2023; v1 submitted 15 September, 2023;
originally announced September 2023.
-
Logical Synchrony and the bittide Mechanism
Authors:
Sanjay Lall,
Calin Cascaval,
Martin Izzard,
Tammo Spalink
Abstract:
We introduce logical synchrony, a framework that allows distributed computing to be coordinated as tightly as in synchronous systems without the distribution of a global clock or any reference to universal time. We develop a model of events called a logical synchrony network, in which nodes correspond to processors and every node has an associated local clock which generates the events. We constru…
▽ More
We introduce logical synchrony, a framework that allows distributed computing to be coordinated as tightly as in synchronous systems without the distribution of a global clock or any reference to universal time. We develop a model of events called a logical synchrony network, in which nodes correspond to processors and every node has an associated local clock which generates the events. We construct a measure of logical latency and develop its properties. A further model, called a multiclock network, is then analyzed and shown to be a refinement of the logical synchrony network. We present the bittide mechanism as an instantiation of multiclock networks, and discuss the clock control mechanism that ensures that buffers do not overflow or underflow. Finally we give conditions under which a logical synchrony network has an equivalent synchronous realization.
△ Less
Submitted 3 July, 2024; v1 submitted 31 July, 2023;
originally announced August 2023.
-
Resistance Distance and Control Performance for bittide Synchronization
Authors:
Sanjay Lall,
Calin Cascaval,
Martin Izzard,
Tammo Spalink
Abstract:
We discuss control of bittide distributed systems, which are designed to provide logical synchronization between networked machines by observing data flow rates between adjacent systems at the physical network layer and controlling local reference clock frequencies. We analyze the performance of approximate proportional-integral control of the synchronization mechanism and develop a simple continu…
▽ More
We discuss control of bittide distributed systems, which are designed to provide logical synchronization between networked machines by observing data flow rates between adjacent systems at the physical network layer and controlling local reference clock frequencies. We analyze the performance of approximate proportional-integral control of the synchronization mechanism and develop a simple continuous-time model to show the resulting dynamics are stable for any positive choice of gains. We then construct explicit formulae to show that closed-loop performance measured using the L2 norm is a product of two terms, one depending only on resistance distances in the graph, and the other depending only on controller gains.
△ Less
Submitted 31 March, 2022; v1 submitted 9 November, 2021;
originally announced November 2021.
-
Modeling and Control of bittide Synchronization
Authors:
Sanjay Lall,
Calin Cascaval,
Martin Izzard,
Tammo Spalink
Abstract:
Distributed system applications rely on a fine-grain common sense of time. Existing systems maintain the common sense of time by keeping each independent machine as close as possible to wall-clock time through a combination of software protocols like NTP and GPS signals and/or precision references like atomic clocks. This approach is expensive and has tolerance limitations that require protocols t…
▽ More
Distributed system applications rely on a fine-grain common sense of time. Existing systems maintain the common sense of time by keeping each independent machine as close as possible to wall-clock time through a combination of software protocols like NTP and GPS signals and/or precision references like atomic clocks. This approach is expensive and has tolerance limitations that require protocols to deal with asynchrony and its performance consequences. Moreover, at data-center scale it is impractical to distribute a physical clock as is done on a chip or printed circuit board. In this paper we introduce a distributed system design that removes the need for physical clock distribution or mechanisms for maintaining close alignment to wall-clock time, and instead provides applications with a perfectly synchronized logical clock. We discuss the abstract frame model (AFM), a mathematical model that underpins the system synchronization. The model is based on the rate of communication between nodes in a topology without requiring a global clock. We show that there are families of controllers that satisfy the properties required for existence and uniqueness of solutions to the AFM, and give examples.
△ Less
Submitted 31 March, 2022; v1 submitted 28 September, 2021;
originally announced September 2021.
-
A Delay Analysis of Maximal Matching Switching with Speedup
Authors:
Randy Cogill,
Sanjay Lall
Abstract:
In this paper we analyze the average queue backlog in a combined input-output queued switch using a maximal size matching scheduling algorithm. We compare this average backlog to the average backlog achieved by an optimal switch. We model the cell arrival process as independent and identically distributed between time slots and uniformly distributed among input and output ports. For switches wit…
▽ More
In this paper we analyze the average queue backlog in a combined input-output queued switch using a maximal size matching scheduling algorithm. We compare this average backlog to the average backlog achieved by an optimal switch. We model the cell arrival process as independent and identically distributed between time slots and uniformly distributed among input and output ports. For switches with many input and output ports, the backlog associated with maximal size matching with speedup 3 is no more than 10/3 times the backlog associated with an optimal switch. Moreover, this performance ratio rapidly approaches 2 as speedup increases.
△ Less
Submitted 8 May, 2006;
originally announced May 2006.