From Counting Stations to City-Wide Estimates: Data-Driven Bicycle Volume Extrapolation

Authors: Silke K. Kaiser, Nadja Klein, Lynn H. Kaack

Abstract: Shifting to cycling in urban areas reduces greenhouse gas emissions and improves public health. Street-level bicycle volume information would aid cities in planning targeted infrastructure improvements to encourage cycling and provide civil society with evidence to advocate for cyclists' needs. Yet, the data currently available to cities and citizens often only comes from sparsely located counting stations. This paper extrapolates bicycle volume beyond these few locations to estimate bicycle volume for the entire city of Berlin. We predict daily and average annual daily street-level bicycle volumes using machine-learning techniques and various public data sources. These include app-based crowdsourced data, infrastructure, bike-sharing, motorized traffic, socioeconomic indicators, weather, and holiday data. Our analysis reveals that the best-performing model is XGBoost, and crowdsourced cycling and infrastructure data are most important for the prediction. We further simulate how collecting short-term counts at predicted locations improves performance. By providing ten days of such sample counts for each predicted location to the model, we are able to halve the error and greatly reduce the variability in performance among predicted locations.

Submitted 26 June, 2024; originally announced June 2024.

arXiv:2405.20876 [pdf, other]

Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study

Authors: Pallavi Mitra, Gesina Schwalbe, Nadja Klein

Abstract: Convolutional Neural Networks (CNNs) have achieved state-of-the-art performance in many computer vision tasks. However, high computational and storage demands hinder their deployment into resource-constrained environments, such as embedded devices. Model pruning helps to meet these restrictions by reducing the model size, while maintaining superior performance. Meanwhile, safety-critical applications pose more than just resource and performance constraints. In particular, predictions must not be overly confident, i.e., provide properly calibrated uncertainty estimations (proper uncertainty calibration), and CNNs must be robust against corruptions like naturally occurring input perturbations (natural corruption robustness). This work investigates the important trade-off between uncertainty calibration, natural corruption robustness, and performance for current state-of-research post-hoc CNN pruning techniques in the context of image classification tasks. Our study reveals that post-hoc pruning substantially improves the model's uncertainty calibration, performance, and natural corruption robustness, sparking hope for safe and robust embedded CNNs. Furthermore, uncertainty calibration and natural corruption robustness are not mutually exclusive targets under pruning, as evidenced by the improved safety aspects obtained by post-hoc unstructured pruning with increasing compression.

Submitted 31 May, 2024; originally announced May 2024.

Comments: 11 pages, 3 figures

arXiv:2404.17427 [pdf, other]

Cost-Sensitive Uncertainty-Based Failure Recognition for Object Detection

Authors: Moussa Kassem Sbeyti, Michelle Karg, Christian Wirth, Nadja Klein, Sahin Albayrak

Abstract: Object detectors in real-world applications often fail to detect objects due to varying factors such as weather conditions and noisy input. Therefore, a process that mitigates false detections is crucial for both safety and accuracy. While uncertainty-based thresholding shows promise, previous works demonstrate an imperfect correlation between uncertainty and detection errors. This hinders ideal thresholding, prompting us to further investigate the correlation and associated cost with different types of uncertainty. We therefore propose a cost-sensitive framework for object detection tailored to user-defined budgets on the two types of errors, missing and false detections. We derive minimum thresholding requirements to prevent performance degradation and define metrics to assess the applicability of uncertainty for failure recognition. Furthermore, we automate and optimize the thresholding process to maximize the failure recognition rate w.r.t. the specified budget. Evaluation on three autonomous driving datasets demonstrates that our approach significantly enhances safety, particularly in challenging scenarios. Leveraging localization aleatoric uncertainty and softmax-based entropy only, our method boosts the failure recognition rate by 36-60% compared to conventional approaches. Code is available at https://mos-ks.github.io/publications.

Submitted 26 April, 2024; originally announced April 2024.

Comments: Accepted with an oral presentation at UAI 2024

Journal ref: The 40th Conference on Uncertainty in Artificial Intelligence, 2024, https://openreview.net/forum?id=HuibNFkaoi

arXiv:2404.14271 [pdf, other]

Sparse Explanations of Neural Networks Using Pruned Layer-Wise Relevance Propagation

Authors: Paulo Yanez Sarmiento, Simon Witzke, Nadja Klein, Bernhard Y. Renard

Abstract: Explainability is a key component in many applications involving deep neural networks (DNNs). However, current explanation methods for DNNs commonly leave it to the human observer to distinguish relevant explanations from spurious noise. This is no longer feasible when going from easily human-accessible data such as images to more complex data such as genome sequences. To facilitate the accessibility of DNN outputs from such complex data and to increase explainability, we present a modification of the widely used explanation method layer-wise relevance propagation. Our approach enforces sparsity directly by pruning the relevance propagation for the different layers. Thereby, we achieve sparser relevance attributions for the input features as well as for the intermediate layers. As the relevance propagation is input-specific, we aim to prune the relevance propagation rather than the underlying model architecture. This allows pruning different neurons for different inputs and hence might be more appropriate to the local nature of explanation methods. To demonstrate the efficacy of our method, we evaluate it on two types of data, images and genomic sequences. We show that our modification indeed leads to noise reduction and concentrates relevance on the most important features compared to the baseline.

Submitted 22 April, 2024; originally announced April 2024.

Comments: 15 pages, 5 figures

arXiv:2403.11966 [pdf, other]

Informed Spectral Normalized Gaussian Processes for Trajectory Prediction

Authors: Christian Schlauch, Christian Wirth, Nadja Klein

Abstract: Prior parameter distributions provide an elegant way to represent prior expert and world knowledge for informed learning. Previous work has shown that using such informative priors to regularize probabilistic deep learning (DL) models increases their performance and data-efficiency. However, commonly used sampling-based approximations for probabilistic DL models can be computationally expensive, requiring multiple inference passes and longer training times. Promising alternatives are compute-efficient last layer kernel approximations like spectral normalized Gaussian processes (SNGPs). We propose a novel regularization-based continual learning method for SNGPs, which enables the use of informative priors that represent prior knowledge learned from previous tasks. Our proposal builds upon well-established methods and requires no rehearsal memory or parameter expansion. We apply our informed SNGP model to the trajectory prediction problem in autonomous driving by integrating prior drivability knowledge. On two public datasets, we investigate its performance under diminishing training data and across locations, and thereby demonstrate an increase in data-efficiency and robustness to location-transfers over non-informed and informed baselines.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2401.06523 [pdf, other]

Boosting Causal Additive Models

Authors: Maximilian Kertel, Nadja Klein

Abstract: We present a boosting-based method to learn additive Structural Equation Models (SEMs) from observational data, with a focus on the theoretical aspects of determining the causal order among variables. We introduce a family of score functions based on arbitrary regression techniques, for which we establish necessary conditions to consistently favor the true causal ordering. Our analysis reveals that boosting with early stopping meets these criteria and thus offers a consistent score function for causal orderings. To address the challenges posed by high-dimensional data sets, we adapt our approach through a component-wise gradient descent in the space of additive SEMs. Our simulation study underlines our theoretical results for lower dimensions and demonstrates that our high-dimensional adaptation is competitive with state-of-the-art methods. In addition, it exhibits robustness with respect to the choice of the hyperparameters, making the procedure easy to tune.

Submitted 12 January, 2024; originally announced January 2024.

arXiv:2311.09941 [pdf, other]

Ghost Value Augmentation for $k$-Edge-Connectivity

Authors: D Ellis Hershkowitz, Nathan Klein, Rico Zenklusen

Abstract: We give a poly-time algorithm for the $k$-edge-connected spanning subgraph ($k$-ECSS) problem that returns a solution of cost no greater than the cheapest $(k+10)$-ECSS on the same graph. Our approach enhances the iterative relaxation framework with a new ingredient, which we call ghost values, that allows for high sparsity in intermediate problems. Our guarantees improve upon the best-known approximation factor of $2$ for $k$-ECSS whenever the optimal value of $(k+10)$-ECSS is close to that of $k$-ECSS. This is a property that holds for the closely related problem $k$-edge-connected spanning multi-subgraph ($k$-ECSM), which is identical to $k$-ECSS except edges can be selected multiple times at the same cost. As a consequence, we obtain a $\left(1+O\left(\frac{1}{k}\right)\right)$-approximation algorithm for $k$-ECSM, which resolves a conjecture of Pritchard and improves upon a recent $\left(1+O\left(\frac{1}{\sqrt{k}}\right)\right)$-approximation algorithm of Karlin, Klein, Oveis Gharan, and Zhang. Moreover, we present a matching lower bound for $k$-ECSM, showing that our approximation ratio is tight up to the constant factor in $O\left(\frac{1}{k}\right)$, unless $P=NP$.

Submitted 24 April, 2024; v1 submitted 16 November, 2023; originally announced November 2023.

arXiv:2311.09072 [pdf, ps, other]

From Trees to Polynomials and Back Again: New Capacity Bounds with Applications to TSP

Authors: Leonid Gurvits, Nathan Klein, Jonathan Leake

Abstract: We give simply exponential lower bounds on the probabilities of a given strongly Rayleigh distribution, depending only on its expectation. This resolves a weak version of a problem left open by Karlin-Klein-Oveis Gharan in their recent breakthrough work on metric TSP, and this resolution leads to a minor improvement of their approximation factor for metric TSP. Our results also allow for a more streamlined analysis of the algorithm. To achieve these new bounds, we build upon the work of Gurvits-Leake on the use of the productization technique for bounding the capacity of a real stable polynomial. This technique allows one to reduce certain inequalities for real stable polynomials to products of affine linear forms, which have an underlying matrix structure. In this paper, we push this technique further by characterizing the worst-case polynomials via bipartitioned forests. This rigid combinatorial structure yields a clean induction argument, which implies our stronger bounds. In general, we believe the results of this paper will lead to further improvement and simplification of the analysis of various combinatorial and probabilistic bounds and algorithms.

Submitted 9 May, 2024; v1 submitted 15 November, 2023; originally announced November 2023.

Comments: ICALP 2024

arXiv:2311.01950 [pdf, other]

A Lower Bound for the Max Entropy Algorithm for TSP

Authors: Billy Jin, Nathan Klein, David P. Williamson

Abstract: One of the most famous conjectures in combinatorial optimization is the four-thirds conjecture, which states that the integrality gap of the subtour LP relaxation of the TSP is equal to $\frac43$. For 40 years, the best known upper bound was 1.5, due to Wolsey (1980). Recently, Karlin, Klein, and Oveis Gharan (2022) showed that the max entropy algorithm for the TSP gives an improved bound of $1.5 - 10^{-36}$. In this paper, we show that the approximation ratio of the max entropy algorithm is at least 1.375, even for graphic TSP. Thus the max entropy algorithm does not appear to be the algorithm that will ultimately resolve the four-thirds conjecture in the affirmative, should that be possible.

Submitted 3 November, 2023; originally announced November 2023.

arXiv:2308.06254 [pdf, other]

A Better-Than-1.6-Approximation for Prize-Collecting TSP

Authors: Jannis Blauth, Nathan Klein, Martin Nägele

Abstract: Prize-Collecting TSP is a variant of the traveling salesperson problem where one may drop vertices from the tour at the cost of vertex-dependent penalties. The quality of a solution is then measured by adding the length of the tour and the sum of all penalties of vertices that are not visited. We present a polynomial-time approximation algorithm with an approximation guarantee slightly below $1.6$, where the guarantee is with respect to the natural linear programming relaxation of the problem. This improves upon the previous best-known approximation ratio of $1.774$. Our approach is based on a known decomposition for solutions of this linear relaxation into rooted trees. Our algorithm takes a tree from this decomposition and then performs a pruning step before doing parity correction on the remainder. Using a simple analysis, we bound the approximation guarantee of the proposed algorithm by $(1+\sqrt{5})/2 \approx 1.618$, the golden ratio. With some additional technical care we further improve it to $1.599$.

Submitted 14 February, 2024; v1 submitted 11 August, 2023; originally announced August 2023.

arXiv:2305.11575 [pdf, other]

The Deep Promotion Time Cure Model

Authors: Victor Medina-Olivares, Stefan Lessmann, Nadja Klein

Abstract: We propose a novel method for predicting time-to-event in the presence of cure fractions based on flexible survival models integrated into a deep neural network framework. Our approach allows for non-linear relationships and high-dimensional interactions between covariates and survival and is suitable for large-scale applications. Furthermore, we allow the method to incorporate an identified predictor formed of an additive decomposition of interpretable linear and non-linear effects and add an orthogonalization layer to capture potential higher dimensional interactions. We demonstrate the usefulness and computational efficiency of our method via simulations and apply it to a large portfolio of US mortgage loans. Here, we find not only a better predictive performance of our framework but also a more realistic picture of covariate effects.

Submitted 19 May, 2023; originally announced May 2023.

arXiv:2305.06625 [pdf, other]

Dropout Regularization in Extended Generalized Linear Models based on Double Exponential Families

Authors: Benedikt Lütke Schwienhorst, Lucas Kock, David J. Nott, Nadja Klein

Abstract: Even though dropout is a popular regularization technique, its theoretical properties are not fully understood. In this paper we study dropout regularization in extended generalized linear models based on double exponential families, for which the dispersion parameter can vary with the features. A theoretical analysis shows that dropout regularization prefers rare but important features in both the mean and dispersion, generalizing an earlier result for conventional generalized linear models. Training is performed using stochastic gradient descent with adaptive learning rate. To illustrate, we apply dropout to adaptive smoothing with B-splines, where both the mean and dispersion parameters are modelled flexibly. The important B-spline basis functions can be thought of as rare features, and we confirm in experiments that dropout is an effective form of regularization for mean and dispersion parameters that improves on a penalized maximum likelihood approach with an explicit smoothness penalty.

Submitted 11 May, 2023; originally announced May 2023.

arXiv:2304.08673 [pdf, other]

Semi-supervised Learning of Pushforwards For Domain Translation & Adaptation

Authors: Nishant Panda, Natalie Klein, Dominic Yang, Patrick Gasda, Diane Oyen

Abstract: Given two probability densities on related data spaces, we seek a map pushing one density to the other while satisfying application-dependent constraints. For maps to have utility in a broad application space (including domain translation, domain adaptation, and generative modeling), the map must be available to apply on out-of-sample data points and should correspond to a probabilistic model over the two spaces. Unfortunately, existing approaches, which are primarily based on optimal transport, do not address these needs. In this paper, we introduce a novel pushforward map learning algorithm that utilizes normalizing flows to parameterize the map. We first re-formulate the classical optimal transport problem to be map-focused and propose a learning algorithm to select from all possible maps under the constraint that the map minimizes a probability distance and application-specific regularizers; thus, our method can be seen as solving a modified optimal transport problem. Once the map is learned, it can be used to map samples from a source domain to a target domain. In addition, because the map is parameterized as a composition of normalizing flows, it models the empirical distributions over the two data spaces and allows both sampling and likelihood evaluation for both data sets. We compare our method (parOT) to related optimal transport approaches in the context of domain adaptation and domain translation on benchmark data sets. Finally, to illustrate the impact of our work on applied problems, we apply parOT to a real scientific application: spectral calibration for high-dimensional measurements from two vastly different environments.

Submitted 17 April, 2023; originally announced April 2023.

Comments: 19 pages, 7 figures

arXiv:2304.07674 [pdf, ps, other]

Thin trees for laminar families

Authors: Nathan Klein, Neil Olver

Abstract: In the laminar-constrained spanning tree problem, the goal is to find a minimum-cost spanning tree which respects upper bounds on the number of times each cut in a given laminar family is crossed. This generalizes the well-studied degree-bounded spanning tree problem, as well as a previously studied setting where a chain of cuts is given. We give the first constant-factor approximation algorithm; in particular we show how to obtain a multiplicative violation of the crossing bounds of less than 22 while losing less than a factor of 5 in terms of cost. Our result compares to the natural LP relaxation. As a consequence, our results show that given a $k$-edge-connected graph and a laminar family $\mathcal{L} \subseteq 2^V$ of cuts, there exists a spanning tree which contains only an $O(1/k)$ fraction of the edges across every cut in $\mathcal{L}$. This can be viewed as progress towards the Thin Tree Conjecture, which (in a strong form) states that this guarantee can be obtained for all cuts simultaneously.

Submitted 15 April, 2023; originally announced April 2023.

arXiv:2212.07554 [pdf, other]

Generative structured normalizing flow Gaussian processes applied to spectroscopic data

Authors: Natalie Klein, Nishant Panda, Patrick Gasda, Diane Oyen

Abstract: In this work, we propose a novel generative model for mapping inputs to structured, high-dimensional outputs using structured conditional normalizing flows and Gaussian process regression. The model is motivated by the need to characterize uncertainty in the input/output relationship when making inferences on new data. In particular, in the physical sciences, limited training data may not adequately characterize future observed data; it is critical that models adequately indicate uncertainty, particularly when they may be asked to extrapolate. In our proposed model, structured conditional normalizing flows provide parsimonious latent representations that relate to the inputs through a Gaussian process, providing exact likelihood calculations and uncertainty that naturally increases away from the training data inputs. We demonstrate the methodology on laser-induced breakdown spectroscopy data from the ChemCam instrument onboard the Mars rover Curiosity. ChemCam was designed to recover the chemical composition of rock and soil samples by measuring the spectral properties of plasma atomic emissions induced by a laser pulse. We show that our model can generate realistic spectra conditional on a given chemical composition and that we can use the model to perform uncertainty quantification of chemical compositions for new observed spectra. Based on our results, we anticipate that our proposed modeling approach may be useful in other scientific domains with high-dimensional, complex structure where it is important to quantify predictive uncertainty.

Submitted 14 December, 2022; originally announced December 2022.

Comments: Best paper award, 1st Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE), February 2022

arXiv:2212.06296 [pdf, ps, other]

A (Slightly) Improved Deterministic Approximation Algorithm for Metric TSP

Authors: Anna R. Karlin, Nathan Klein, Shayan Oveis Gharan

Abstract: We show that the max entropy algorithm can be derandomized (with respect to a particular objective function) to give a deterministic $3/2-ε$ approximation algorithm for metric TSP for some $ε > 10^{-36}$. To obtain our result, we apply the method of conditional expectation to an objective function constructed in prior work which was used to certify that the expected cost of the algorithm is at most $3/2-ε$ times the cost of an optimal solution to the subtour elimination LP. The proof in this work involves showing that the expected value of this objective function can be computed in polynomial time (at all stages of the algorithm's execution).

Submitted 12 December, 2022; originally announced December 2022.

arXiv:2211.04639 [pdf, other]

A 4/3-Approximation Algorithm for Half-Integral Cycle Cut Instances of the TSP

Authors: Billy Jin, Nathan Klein, David P. Williamson

Abstract: A long-standing conjecture for the traveling salesman problem (TSP) states that the integrality gap of the standard linear programming relaxation of the TSP is at most 4/3. Despite significant efforts, the conjecture remains open. We consider the half-integral case, in which the LP has solution values in $\{0, 1/2, 1\}$. Such instances have been conjectured to be the most difficult instances for the overall four-thirds conjecture. Karlin, Klein, and Oveis Gharan, in a breakthrough result, were able to show that in the half-integral case, the integrality gap is at most 1.49993. This result led to the first significant progress on the overall conjecture in decades; the same authors showed the integrality gap is at most $1.5 - 10^{-36}$ in the non-half-integral case. For the half-integral case, the current best-known ratio is 1.4983, a result by Gupta et al. With the improvements on the 3/2 bound remaining very incremental even in the half-integral case, we turn the question around and look for a large class of half-integral instances for which we can prove that the 4/3 conjecture is correct. The previous works on the half-integral case perform induction on a hierarchy of critical tight sets in the support graph of the LP solution, in which some of the sets correspond to "cycle cuts" and the others to "degree cuts". We show that if all the sets in the hierarchy correspond to cycle cuts, then we can find a distribution of tours whose expected cost is at most 4/3 times the value of the half-integral LP solution; sampling from the distribution gives us a randomized 4/3-approximation algorithm. We note that the known bad cases for the integrality gap have a gap of 4/3 and have a half-integral LP solution in which all the critical tight sets in the hierarchy are cycle cuts; thus our result is tight.

Submitted 8 July, 2023; v1 submitted 8 November, 2022; originally announced November 2022.

Comments: Comments, questions, and suggestions are welcome!

arXiv:2211.00348 [pdf, other]

Informed Priors for Knowledge Integration in Trajectory Prediction

Authors: Christian Schlauch, Nadja Klein, Christian Wirth

Abstract: Informed machine learning methods allow the integration of prior knowledge into learning systems. This can increase accuracy and robustness or reduce data needs. However, existing methods often assume hard constraining knowledge that does not require trading off prior knowledge against observations but can be used to directly reduce the problem space. Other approaches use specific architectural changes as a representation of prior knowledge, limiting applicability. We propose an informed machine learning method based on continual learning. This allows the integration of arbitrary prior knowledge, potentially from multiple sources, and does not require specific architectures. Furthermore, our approach enables probabilistic and multi-modal predictions that can improve predictive accuracy and robustness. We exemplify our approach by applying it to a state-of-the-art trajectory predictor for autonomous driving. This domain is especially dependent on informed learning approaches, as it is subject to an overwhelmingly large variety of possible environments and very rare events, while requiring robust and accurate predictions. We evaluate our model on a commonly used benchmark dataset, only using data already available in a conventional setup. We show that our method outperforms both non-informed and informed learning methods that are often used in the literature. Furthermore, we are able to compete with a conventional baseline, even using half as many observation examples.

Submitted 1 November, 2022; originally announced November 2022.

ACM Class: I.2.6

arXiv:2211.00080 [pdf, other]

Denoising neural networks for magnetic resonance spectroscopy

Authors: Natalie Klein, Amber J. Day, Harris Mason, Michael W. Malone, Sinead A. Williamson

Abstract: In many scientific applications, measured time series are corrupted by noise or distortions. Traditional denoising techniques often fail to recover the signal of interest, particularly when the signal-to-noise ratio is low or when certain assumptions on the signal and noise are violated. In this work, we demonstrate that deep learning-based denoising methods can outperform traditional techniques while exhibiting greater robustness to variation in noise and signal characteristics. Our motivating example is magnetic resonance spectroscopy, in which a primary goal is to detect the presence of short-duration, low-amplitude radio frequency signals that are often obscured by strong interference that can be difficult to separate from the signal using traditional methods. We explore various deep learning architecture choices to capture the inherently complex-valued nature of magnetic resonance signals. On both synthetic and experimental data, we show that our deep learning-based approaches can exceed performance of traditional techniques, providing a powerful new class of methods for analysis of scientific time series data.

Submitted 31 October, 2022; originally announced November 2022.

Comments: 5 pages with appendix

arXiv:2111.12436 [pdf, ps, other]

Matroid Partition Property and the Secretary Problem

Authors: Dorna Abdolazimi, Anna R. Karlin, Nathan Klein, Shayan Oveis Gharan

Abstract: A matroid $\mathcal{M}$ on a set $E$ of elements has the $α$-partition property, for some $α>0$, if it is possible to (randomly) construct a partition matroid $\mathcal{P}$ on (a subset of) elements of $\mathcal{M}$ such that every independent set of $\mathcal{P}$ is independent in $\mathcal{M}$ and for any weight function $w:E\to\mathbb{R}_{\geq 0}$, the expected value of the optimum of the matroid secretary problem on $\mathcal{P}$ is at least an $α$-fraction of the optimum on $\mathcal{M}$. We show that the complete binary matroid, ${\cal B}_d$ on $\mathbb{F}_2^d$ does not satisfy the $α$-partition property for any constant $α>0$ (independent of $d$). Furthermore, we refute a recent conjecture of Bérczi, Schwarcz, and Yamaguchi by showing the same matroid is $2^d/d$-colorable but cannot be reduced to an $α2^d/d$-colorable partition matroid for any $α$ that is sublinear in $d$.

Submitted 24 November, 2021; originally announced November 2021.

arXiv:2110.01050 [pdf, other]

Marginally calibrated response distributions for end-to-end learning in autonomous driving

Authors: Clara Hoffmann, Nadja Klein

Abstract: End-to-end learners for autonomous driving are deep neural networks that predict the instantaneous steering angle directly from images of the street ahead. These learners must provide reliable uncertainty estimates for their predictions in order to meet safety requirements and initiate a switch to manual control in areas of high uncertainty. Yet end-to-end learners typically only deliver point predictions, since distributional predictions are associated with large increases in training time or additional computational resources during prediction. To address this shortcoming, we investigate efficient and scalable approximate inference for the implicit copula neural linear model of Klein, Nott and Smith (2021) in order to quantify uncertainty for the predictions of end-to-end learners. The results are densities for the steering angle that are marginally calibrated, i.e., the average of the estimated densities equals the empirical distribution of steering angles. To ensure scalability to large $n$ regimes, we develop efficient estimation based on variational inference as a fast alternative to computationally intensive, exact inference via Hamiltonian Monte Carlo. We demonstrate the accuracy and speed of the variational approach in comparison to Hamiltonian Monte Carlo on two end-to-end learners trained for highway driving using the comma2k19 data set. The implicit copula neural linear model delivers accurate calibration and high-quality prediction intervals, and makes it possible to identify overconfident learners. Our approach also contributes to the explainability of black-box end-to-end learners, since predictive densities can be used to understand which steering actions the end-to-end learner sees as valid.

Submitted 3 October, 2021; originally announced October 2021.

Comments: 17 pages, 9 figures

arXiv:2108.08709 [pdf, other]

Neural density estimation and uncertainty quantification for laser induced breakdown spectroscopy spectra

Authors: Katiana Kontolati, Natalie Klein, Nishant Panda, Diane Oyen

Abstract: Constructing probability densities for inference in high-dimensional spectral data is often intractable. In this work, we use normalizing flows on structured spectral latent spaces to estimate such densities, enabling downstream inference tasks. In addition, we evaluate a method for uncertainty quantification when predicting unobserved state vectors associated with each spectrum. We demonstrate the capability of this approach on laser-induced breakdown spectroscopy data collected by the ChemCam instrument on the Mars rover Curiosity. Using our approach, we are able to generate realistic spectral samples and to accurately predict state vectors with associated well-calibrated uncertainties. We anticipate that this methodology will enable efficient probabilistic modeling of spectral data, leading to potential advances in several areas, including out-of-distribution detection and sensitivity analysis.

Submitted 16 August, 2021; originally announced August 2021.

Comments: 5 pages, 3 figures

arXiv:2105.15187 [pdf, other]

A Quasipolynomial $(2+\varepsilon)$-Approximation for Planar Sparsest Cut

Authors: Vincent Cohen-Addad, Anupam Gupta, Philip N. Klein, Jason Li

Abstract: The (non-uniform) sparsest cut problem is the following graph-partitioning problem: given a "supply" graph, and demands on pairs of vertices, delete some subset of supply edges to minimize the ratio of the supply edges cut to the total demand of the pairs separated by this deletion. Despite much effort, there are only a handful of nontrivial classes of supply graphs for which constant-factor approximations are known. We consider the problem for planar graphs, and give a $(2+\varepsilon)$-approximation algorithm that runs in quasipolynomial time. Our approach defines a new structural decomposition of an optimal solution using a "patching" primitive. We combine this decomposition with a Sherali-Adams-style linear programming relaxation of the problem, which we then round. This should be compared with the polynomial-time approximation algorithm of Rao (1999), which uses the metric linear programming relaxation and $\ell_1$-embeddings, and achieves an $O(\sqrt{\log n})$-approximation in polynomial time.

Submitted 31 May, 2021; originally announced May 2021.

Comments: To appear at STOC 2021

arXiv:2105.10043 [pdf, ps, other]

A (Slightly) Improved Bound on the Integrality Gap of the Subtour LP for TSP

Authors: Anna Karlin, Nathan Klein, Shayan Oveis Gharan

Abstract: We show that for some $ε> 10^{-36}$ and any metric TSP instance, the max entropy algorithm returns a solution of expected cost at most $\frac{3}{2}-ε$ times the cost of the optimal solution to the subtour elimination LP. This implies that the integrality gap of the subtour LP is at most $\frac{3}{2}-ε$. This analysis also shows that there is a randomized $\frac{3}{2}-ε$ approximation for the 2-edge-connected multi-subgraph problem, improving upon Christofides' algorithm.

Submitted 10 April, 2022; v1 submitted 20 May, 2021; originally announced May 2021.

Comments: arXiv admin note: text overlap with arXiv:2007.01409

arXiv:2104.02705 [pdf, other]

deepregression: a Flexible Neural Network Framework for Semi-Structured Deep Distributional Regression

Authors: David Rügamer, Chris Kolb, Cornelius Fritz, Florian Pfisterer, Philipp Kopper, Bernd Bischl, Ruolin Shen, Christina Bukas, Lisa Barros de Andrade e Sousa, Dominik Thalmeier, Philipp Baumann, Lucas Kook, Nadja Klein, Christian L. Müller

Abstract: In this paper we describe the implementation of semi-structured deep distributional regression, a flexible framework to learn conditional distributions based on the combination of additive regression models and deep networks. Our implementation encompasses (1) a modular neural network building system based on the deep learning library TensorFlow for the fusion of various statistical and deep learning approaches, (2) an orthogonalization cell to allow for an interpretable combination of different subnetworks, as well as (3) pre-processing steps necessary to set up such models. The software package allows users to define models in a user-friendly manner via a formula interface that is inspired by classical statistical model frameworks such as mgcv. The package's modular design and functionality provide a unique resource for both scalable estimation of complex statistical models and the combination of approaches from deep learning and statistics. This allows for state-of-the-art predictive performance while simultaneously retaining the indispensable interpretability of classical statistical models.

Submitted 10 March, 2022; v1 submitted 6 April, 2021; originally announced April 2021.

arXiv:2101.05921 [pdf, ps, other]

An Improved Approximation Algorithm for the Minimum $k$-Edge Connected Multi-Subgraph Problem

Authors: Anna R. Karlin, Nathan Klein, Shayan Oveis Gharan, Xinzhi Zhang

Abstract: We give a randomized $1+\frac{5.06}{\sqrt{k}}$-approximation algorithm for the minimum $k$-edge connected spanning multi-subgraph problem, $k$-ECSM.

Submitted 20 May, 2022; v1 submitted 14 January, 2021; originally announced January 2021.

arXiv:2009.05039 [pdf, other]

On Light Spanners, Low-treewidth Embeddings and Efficient Traversing in Minor-free Graphs

Authors: Vincent Cohen-Addad, Arnold Filtser, Philip N. Klein, Hung Le

Abstract: Understanding the structure of minor-free metrics, namely shortest path metrics obtained over a weighted graph excluding a fixed minor, has been an important research direction since the fundamental work of Robertson and Seymour. A fundamental idea that helps both to understand the structural properties of these metrics and leads to strong algorithmic results is to construct a "small-complexity" graph that approximately preserves distances between pairs of points of the metric. We show the two following structural results for minor-free metrics: 1. Construction of a light subset spanner. Given a subset of vertices called terminals, and $ε$, in polynomial time we construct a subgraph that preserves all pairwise distances between terminals up to a multiplicative $1+ε$ factor, of total weight at most $O_ε(1)$ times the weight of the minimal Steiner tree spanning the terminals. 2. Construction of a stochastic metric embedding into low treewidth graphs with expected additive distortion $εD$. Namely, given a minor-free graph $G=(V,E,w)$ of diameter $D$, and parameter $ε$, we construct a distribution $\mathcal{D}$ over dominating metric embeddings into treewidth-$O_ε(\log n)$ graphs such that the additive distortion is at most $εD$. One of our important technical contributions is a novel framework that allows us to reduce both problems to problems on simpler graphs of bounded diameter. Our results have the following algorithmic consequences: (1) the first efficient approximation scheme for subset TSP in minor-free metrics; (2) the first approximation scheme for vehicle routing with bounded capacity in minor-free metrics; (3) the first efficient approximation scheme for vehicle routing with bounded capacity on bounded genus metrics.

Submitted 10 September, 2020; originally announced September 2020.

Comments: 65 pages, 6 figures. Abstract shortened due to character limit

ACM Class: F.2.2

arXiv:2009.00188 [pdf, other]

On the computational tractability of a geographic clustering problem arising in redistricting

Authors: Vincent Cohen-Addad, Philip N. Klein, Dániel Marx

Abstract: Redistricting is the problem of dividing a state into a number $k$ of regions, called districts. Voters in each district elect a representative. The primary criteria are: each district is connected, district populations are equal (or nearly equal), and districts are "compact". There are multiple competing definitions of compactness, usually minimizing some quantity. One measure that has been recently promoted by Duchin and others is the number of cut edges. In redistricting, one is given atomic regions out of which each district must be built. The populations of the atomic regions are given. Consider the graph with one vertex per atomic region (with weight equal to the region's population) and an edge between atomic regions that share a boundary. A districting plan is a partition of vertices into $k$ parts, each connected, of nearly equal weight. The districts are considered compact to the extent that the plan minimizes the number of edges crossing between different parts. Consider two problems: find the most compact districting plan, and sample districting plans under a compactness constraint uniformly at random. Both problems are NP-hard so we restrict the input graph to have branchwidth at most $w$. (A planar graph's branchwidth is bounded by its diameter.) If both $k$ and $w$ are bounded by constants, the problems are solvable in polynomial time. Assume vertices have weight 1. One would like algorithms whose running times are of the form $O(f(k,w) n^c)$ for some constant $c$ independent of $k$ and $w$, in which case the problems are said to be fixed-parameter tractable with respect to $k$ and $w$. We show that, under a complexity-theoretic assumption, no such algorithms exist. However, we do give algorithms with running time $O(c^wn^{k+1})$. Thus if the diameter of the graph is moderately small and the number of districts is very small, our algorithm is usable.

Submitted 31 August, 2020; originally announced September 2020.

arXiv:2007.02377 [pdf, other]

New Hardness Results for Planar Graph Problems in P and an Algorithm for Sparsest Cut

Authors: Amir Abboud, Vincent Cohen-Addad, Philip N. Klein

Abstract: The Sparsest Cut is a fundamental optimization problem that has been extensively studied. For planar inputs the problem is in $P$ and can be solved in $\tilde{O}(n^3)$ time if all vertex weights are $1$. Despite a significant amount of effort, the best algorithms date back to the early 90's and can only achieve $O(\log n)$-approximation in $\tilde{O}(n)$ time or a constant factor approximation in $\tilde{O}(n^2)$ time [Rao, STOC92]. Our main result is an $Ω(n^{2-ε})$ lower bound for Sparsest Cut even in planar graphs with unit vertex weights, under the $(min,+)$-Convolution conjecture, showing that approximations are inevitable in the near-linear time regime. To complement the lower bound, we provide a constant factor approximation in near-linear time, improving upon the 25-year old result of Rao in both time and accuracy. Our lower bound accomplishes a repeatedly raised challenge by being the first fine-grained lower bound for a natural planar graph problem in P. Moreover, we prove near-quadratic lower bounds under SETH for variants of the closest pair problem in planar graphs, and use them to show that the popular Average-Linkage procedure for Hierarchical Clustering cannot be simulated in truly subquadratic time. We prove an $Ω(n/\log{n})$ lower bound on the number of communication rounds required to compute the weighted diameter of a network in the CONGEST model, even when the underlying graph is planar and all nodes are $D=4$ hops away from each other. This is the first poly($n$) + $ω(D)$ lower bound in the planar-distributed setting, and it complements the recent poly$(D, \log{n})$ upper bounds of Li and Parter [STOC 2019] for (exact) unweighted diameter and for ($1+ε$) approximate weighted diameter.

Submitted 5 July, 2020; originally announced July 2020.

arXiv:2007.01409 [pdf, ps, other]

A (Slightly) Improved Approximation Algorithm for Metric TSP

Authors: Anna R. Karlin, Nathan Klein, Shayan Oveis Gharan

Abstract: For some $ε> 10^{-36}$ we give a randomized $3/2-ε$ approximation algorithm for metric TSP.

Submitted 25 October, 2023; v1 submitted 2 July, 2020; originally announced July 2020.

arXiv:2002.05777 [pdf, other]

Semi-Structured Distributional Regression -- Extending Structured Additive Models by Arbitrary Deep Neural Networks and Data Modalities

Authors: David Rügamer, Chris Kolb, Nadja Klein

Abstract: Combining additive models and neural networks makes it possible to broaden the scope of statistical regression and, at the same time, to extend deep learning-based approaches by interpretable structured additive predictors. Existing attempts uniting the two modeling approaches are, however, limited to very specific combinations and, more importantly, involve an identifiability issue. As a consequence, interpretability and stable estimation are typically lost. We propose a general framework to combine structured regression models and deep neural networks into a unifying network architecture. To overcome the inherent identifiability issues between different model parts, we construct an orthogonalization cell that projects the deep neural network into the orthogonal complement of the statistical model predictor. This enables proper estimation of structured model parts and thereby interpretability. We demonstrate the framework's efficacy in numerical experiments and illustrate its special merits in benchmarks and real-world applications.

Submitted 9 July, 2022; v1 submitted 13 February, 2020; originally announced February 2020.

arXiv:1912.11103 [pdf, ps, other]

A near-linear time minimum Steiner cut algorithm for planar graphs

Authors: Stephen Jue, Philip N. Klein

Abstract: We consider the Minimum Steiner Cut problem on undirected planar graphs with non-negative edge weights. This problem involves finding the minimum cut of the graph that separates a specified subset $X$ of vertices (terminals) into two parts. This problem is of theoretical interest because it generalizes two classical optimization problems, Minimum $s$-$t$ Cut and Minimum Cut, and of practical importance because of its application to computing a lower bound for Steiner (Subset) TSP. Our algorithm has running time $O(n\log{n}\log{k})$ where $k$ is the number of terminals.

Submitted 31 December, 2019; v1 submitted 23 December, 2019; originally announced December 2019.

Comments: 14 pages, 6 figures

arXiv:1910.08953 [pdf, other]

Overcoming Free-Riding in Bandit Games

Authors: Johannes Hörner, Nicolas Klein, Sven Rady

Abstract: This paper considers a class of experimentation games with Lévy bandits encompassing those of Bolton and Harris (1999) and Keller, Rady and Cripps (2005). Its main result is that efficient (perfect Bayesian) equilibria exist whenever players' payoffs have a diffusion component. Hence, the trade-offs emphasized in the literature do not rely on the intrinsic nature of bandit models but on the commonly adopted solution concept (MPE). This is not an artifact of continuous time: we prove that efficient equilibria arise as limits of equilibria in the discrete-time game. Furthermore, it suffices to relax the solution concept to strongly symmetric equilibrium.

Submitted 20 December, 2021; v1 submitted 20 October, 2019; originally announced October 2019.

Comments: 66 pages, 4 figures; further minor corrections

arXiv:1909.11784 [pdf, other]

bamlss: A Lego Toolbox for Flexible Bayesian Regression (and Beyond)

Authors: Nikolaus Umlauf, Nadja Klein, Thorsten Simon, Achim Zeileis

Abstract: Over the last decades, the challenges in applied regression and in predictive modeling have been changing considerably: (1) More flexible model specifications are needed as big(ger) data become available, facilitated by more powerful computing infrastructure. (2) Full probabilistic modeling rather than predicting just means or expectations is crucial in many applications. (3) Interest in Bayesian inference has been increasing both as an appealing framework for regularizing or penalizing model estimation as well as a natural alternative to classical frequentist inference. However, while there has been a lot of research in all three areas, also leading to associated software packages, a modular software implementation that makes it easy to combine all three aspects has not yet been available. To fill this gap, the R package bamlss is introduced for Bayesian additive models for location, scale, and shape (and beyond). At the core of the package are algorithms for highly efficient Bayesian estimation and inference that can be applied to generalized additive models (GAMs) or generalized additive models for location, scale, and shape (GAMLSS), also known as distributional regression. However, its building blocks are designed as "Lego bricks" encompassing various distributions (exponential family, Cox, joint models, ...), regression terms (linear, splines, random effects, tensor products, spatial fields, ...), and estimators (MCMC, backfitting, gradient boosting, lasso, ...). It is demonstrated how these can be easily recombined to make classical models more flexible or create new custom models for specific modeling challenges.

Submitted 25 September, 2019; originally announced September 2019.

Comments: 48 pages, 12 figures

arXiv:1908.00227 [pdf, ps, other]

An Improved Approximation Algorithm for TSP in the Half Integral Case

Authors: Anna Karlin, Nathan Klein, Shayan Oveis Gharan

Abstract: We design a $1.49993$-approximation algorithm for the metric traveling salesperson problem (TSP) for instances in which an optimal solution to the subtour linear programming relaxation is half-integral. These instances received significant attention over the last decade due to a conjecture of Schalekamp, Williamson and van Zuylen stating that half-integral LP solutions have the largest integrality gap over all fractional solutions. So, if the conjecture of Schalekamp et al. holds true, our result shows that the integrality gap of the subtour polytope is bounded away from $3/2$.

Submitted 1 August, 2019; originally announced August 2019.

arXiv:1901.07032 [pdf, other]

A PTAS for Bounded-Capacity Vehicle Routing in Planar Graphs

Authors: Amariah Becker, Philip N. Klein, Aaron Schild

Abstract: The Capacitated Vehicle Routing problem is to find a minimum-cost set of tours that collectively cover clients in a graph, such that each tour starts and ends at a specified depot and is subject to a capacity bound on the number of clients it can serve. In this paper, we present a polynomial-time approximation scheme (PTAS) for instances in which the input graph is planar and the capacity is bounded. Previously, only a quasipolynomial-time approximation scheme was known for these instances. To obtain this result, we show how to embed planar graphs into bounded-treewidth graphs while preserving, in expectation, the client-to-client distances up to a small additive error proportional to client distances to the depot.

Submitted 21 January, 2019; originally announced January 2019.

arXiv:1710.03358 [pdf, other]

Balanced power diagrams for redistricting

Authors: Vincent Cohen-Addad, Philip N. Klein, Neal E. Young

Abstract: We propose a method for redistricting, decomposing a geographical area into subareas, called districts, so that the populations of the districts are as close as possible and the districts are compact and contiguous. Each district is the intersection of a polygon with the geographical area. The polygons are convex and the average number of sides per polygon is less than six. The polygons tend to be quite compact. With each polygon is associated a center. The center is the centroid of the locations of the residents associated with the polygon. The algorithm can be viewed as a heuristic for finding centers and a balanced assignment of residents to centers so as to minimize the sum of squared distances of residents to centers; hence the solution can be said to have low dispersion.

Submitted 7 January, 2018; v1 submitted 9 October, 2017; originally announced October 2017.

arXiv:1707.08270 [pdf, other]

Polynomial-Time Approximation Schemes for k-Center and Bounded-Capacity Vehicle Routing in Graphs with Bounded Highway Dimension

Authors: Amariah Becker, Philip N. Klein, David Saulpic

Abstract: The concept of bounded highway dimension was developed to capture observed properties of the metrics of road networks. We show that a graph with bounded highway dimension, for any vertex, can be embedded into a graph of bounded treewidth in such a way that the distance between $u$ and $v$ is preserved up to an additive error of $ε$ times the distance from $u$ or $v$ to the selected vertex. We show that this theorem yields a PTAS for Bounded-Capacity Vehicle Routing in graphs of bounded highway dimension. In this problem, the input specifies a depot and a set of clients, each with a location and demand; the output is a set of depot-to-depot tours, where each client is visited by some tour and each tour covers at most $Q$ units of client demand. Our PTAS can be extended to handle penalties for unvisited clients. We extend this embedding result to handle a set $S$ of distinguished vertices. The treewidth depends on $|S|$, and the distance between $u$ and $v$ is preserved up to an additive error of $ε$ times the distance from $u$ and $v$ to $S$. This embedding result implies a PTAS for Multiple Depot Bounded-Capacity Vehicle Routing: the tours can go from one depot to another. The embedding result also implies that, for fixed $k$, there is a PTAS for $k$-Center in graphs of bounded highway dimension. In this problem, the goal is to minimize $d$ such that there exist $k$ vertices (the centers) such that every vertex is within distance $d$ of some center. Similarly, for fixed $k$, there is a PTAS for $k$-Median in graphs of bounded highway dimension. In this problem, the goal is to minimize the sum of distances to the $k$ centers.

Submitted 13 November, 2017; v1 submitted 25 July, 2017; originally announced July 2017.

arXiv:1603.09535 [pdf, other]

Local search yields approximation schemes for k-means and k-median in Euclidean and minor-free metrics

Authors: Vincent Cohen-Addad, Philip N. Klein, Claire Mathieu

Abstract: We give the first polynomial-time approximation schemes (PTASs) for the following problems: (1) uniform facility location in edge-weighted planar graphs; (2) $k$-median and $k$-means in edge-weighted planar graphs; (3) $k$-means in Euclidean spaces of bounded dimension. Our first and second results extend to minor-closed families of graphs. All our results extend to cost functions that are the $p$-th power of the shortest-path distance. The algorithm is local search where the local neighborhood of a solution $S$ consists of all solutions obtained from $S$ by removing and adding $1/ε^{O(1)}$ centers.

Submitted 7 April, 2016; v1 submitted 31 March, 2016; originally announced March 2016.

arXiv:1504.08008 [pdf, other]

A Polynomial-time Bicriteria Approximation Scheme for Planar Bisection

Authors: Kyle Fox, Philip N. Klein, Shay Mozes

Abstract: Given an undirected graph with edge costs and node weights, the minimum bisection problem asks for a partition of the nodes into two parts of equal weight such that the sum of edge costs between the parts is minimized. We give a polynomial time bicriteria approximation scheme for bisection on planar graphs. Specifically, let $W$ be the total weight of all nodes in a planar graph $G$. For any constant $\varepsilon > 0$, our algorithm outputs a bipartition of the nodes such that each part weighs at most $W/2 + \varepsilon$ and the total cost of edges crossing the partition is at most $(1+\varepsilon)$ times the total cost of the optimal bisection. The previously best known approximation for planar minimum bisection, even with unit node weights, was $O(\log n)$. Our algorithm actually solves a more general problem where the input may include a target weight for the smaller side of the bipartition.

Submitted 29 April, 2015; originally announced April 2015.

Comments: To appear in STOC 2015

ACM Class: G.2.2

arXiv:1208.2223 [pdf, other]

Structured Recursive Separator Decompositions for Planar Graphs in Linear Time

Authors: Philip N. Klein, Shay Mozes, Christian Sommer

Abstract: Given a planar graph G on n vertices and an integer parameter r<n, an r-division of G with few holes is a decomposition of G into O(n/r) regions of size at most r such that each region contains at most a constant number of faces that are not faces of G (also called holes), and such that, for each region, the total number of vertices on these faces is O(sqrt r). We provide a linear-time algorithm for computing r-divisions with few holes. In fact, our algorithm computes a structure, called decomposition tree, which represents a recursive decomposition of G that includes r-divisions for essentially all values of r. In particular, given an exponentially increasing sequence r = (r_1,r_2,...), our algorithm can produce a recursive r-division with few holes in linear time. r-divisions with few holes have been used in efficient algorithms to compute shortest paths, minimum cuts, and maximum flows. Our linear-time algorithm improves upon the decomposition algorithm used in the state-of-the-art algorithm for minimum st-cut (Italiano, Nussbaum, Sankowski, and Wulff-Nilsen, STOC 2011), removing one of the bottlenecks in the overall running time of their algorithm (analogously for minimum cut in planar and bounded-genus graphs).

Submitted 17 May, 2013; v1 submitted 10 August, 2012; originally announced August 2012.

Comments: 30 pages, 5 figures

Journal ref: STOC 2013

arXiv:1105.2228 [pdf, other]

Multiple-Source Multiple-Sink Maximum Flow in Directed Planar Graphs in Near-Linear Time

Authors: Glencora Borradaile, Philip N. Klein, Shay Mozes, Yahav Nussbaum, Christian Wulff-Nilsen

Abstract: We give an O(n log^3 n) algorithm that, given an n-node directed planar graph with arc capacities, a set of source nodes, and a set of sink nodes, finds a maximum flow from the sources to the sinks. Previously, the fastest algorithms known for this problem were those for general graphs.

Submitted 11 May, 2011; originally announced May 2011.

Comments: 18 pages, 1 figure

arXiv:1104.5214 [pdf, ps, other]

doi 10.1007/978-3-642-22006-7_12

Linear-Space Approximate Distance Oracles for Planar, Bounded-Genus, and Minor-Free Graphs

Authors: Ken-ichi Kawarabayashi, Philip N. Klein, Christian Sommer

Abstract: A (1 + eps)-approximate distance oracle for a graph is a data structure that supports approximate point-to-point shortest-path-distance queries. The most relevant measures for a distance-oracle construction are: space, query time, and preprocessing time. There are strong distance-oracle constructions known for planar graphs (Thorup, JACM'04) and, subsequently, minor-excluded graphs (Abraham and Gavoille, PODC'06). However, these require Omega(eps^{-1} n lg n) space for n-node graphs. We argue that a very low space requirement is essential. Since modern computer architectures involve hierarchical memory (caches, primary memory, secondary memory), a high memory requirement in effect may greatly increase the actual running time. Moreover, we would like data structures that can be deployed on small mobile devices, such as handhelds, which have relatively small primary memory. In this paper, for planar graphs, bounded-genus graphs, and minor-excluded graphs we give distance-oracle constructions that require only O(n) space. The big O hides only a fixed constant, independent of eps and independent of genus or size of an excluded minor. The preprocessing times for our distance oracle are also faster than those for the previously known constructions. For planar graphs, the preprocessing time is O(n lg^2 n). However, our constructions have slower query times. For planar graphs, the query time is O(eps^{-2} lg^2 n). For our linear-space results, we can in fact ensure, for any delta > 0, that the space required is only 1 + delta times the space required just to represent the graph itself.

Submitted 27 April, 2011; originally announced April 2011.

arXiv:1104.4728 [pdf, other]

Multiple-Source Single-Sink Maximum Flow in Directed Planar Graphs in O(diameter*n*log(n)) Time

Authors: Philip N. Klein, Shay Mozes

Abstract: We develop a new technique for computing maximum flow in directed planar graphs with multiple sources and a single sink that significantly deviates from previously known techniques for flow problems. This gives rise to an O(diameter*n*log(n)) algorithm for the problem.

Submitted 10 May, 2011; v1 submitted 25 April, 2011; originally announced April 2011.

Comments: proofs included. preliminary version to appear in WADS 2011

arXiv:1008.5332 [pdf, other]

Multiple-source single-sink maximum flow in directed planar graphs in $O(n^{1.5} \log n)$ time

Authors: Philip N. Klein, Shay Mozes

Abstract: We give an $O(n^{1.5} \log n)$ algorithm that, given a directed planar graph with arc capacities, a set of source nodes and a single sink node, finds a maximum flow from the sources to the sink. This is the first subquadratic-time strongly polynomial algorithm for the problem.

Submitted 14 September, 2010; v1 submitted 31 August, 2010; originally announced August 2010.

Comments: 13 pages, 2 figures. Corrected spelling in one citation

arXiv:cs/0406052 [pdf, ps, other]

doi 10.1109/IAW.2004.1437807

NoSEBrEaK - Attacking Honeynets

Authors: Maximillian Dornseif, Thorsten Holz, Christian N. Klein

Abstract: It is usually assumed that Honeynets are hard to detect and that attempts to detect or disable them can be unconditionally monitored. We scrutinize this assumption and demonstrate a method how a host in a honeynet can be completely controlled by an attacker without any substantial logging taking place.

Submitted 28 June, 2004; originally announced June 2004.

ACM Class: K.6.5; K.5.m

Journal ref: Proceedings from the fifth IEEE Systems, Man and Cybernetics Information Assurance Workshop, Westpoint, 2004; Pages 123-129

arXiv:cs/0208004 [pdf, ps, other]

doi 10.1007/s00453-002-1004-3

Detecting Race Conditions in Parallel Programs that Use Semaphores

Authors: Philip N. Klein, Hsueh-I Lu, Rob H. B. Netzer

Abstract: We address the problem of detecting race conditions in programs that use semaphores for synchronization. Netzer and Miller showed that it is NP-complete to detect race conditions in programs that use many semaphores. We show in this paper that it remains NP-complete even if only two semaphores are used in the parallel programs. For the tractable case, i.e., using only one semaphore, we give two algorithms for detecting race conditions from the trace of executing a parallel program on p processors, where n semaphore operations are executed. The first algorithm determines in O(n) time whether a race condition exists between any two given operations. The second algorithm runs in O(np log n) time and outputs a compact representation from which one can determine in O(1) time whether a race condition exists between any two given operations. The second algorithm is near-optimal in that the running time is only O(log n) times the time required simply to write down the output.

Submitted 3 August, 2002; originally announced August 2002.

Comments: 24 pages, 12 figures, preliminary versions appeared in WADS 93 and ESA 96

ACM Class: F.2.2; G.2.2; D.1.3; D.4.1; E.1

Journal ref: Algorithmica, 35(4):321-345, 2003

Showing 1–47 of 47 results for author: Klein, N