Search | arXiv e-print repository

A universal sequence of tensors for the asymptotic rank conjecture

Authors: Petteri Kaski, Mateusz Michałek

Abstract: The exponent $σ(T)$ of a tensor $T\in\mathbb{F}^d\otimes\mathbb{F}^d\otimes\mathbb{F}^d$ over a field $\mathbb{F}$ captures the base of the exponential growth rate of the tensor rank of $T$ under Kronecker powers. Tensor exponents are fundamental from the standpoint of algorithms and computational complexity theory; for example, the exponent $ω$ of matrix multiplication can be characterized as $ω=2σ(\mathrm{MM}_2)$, where $\mathrm{MM}_2\in\mathbb{F}^4\otimes\mathbb{F}^4\otimes\mathbb{F}^4$ is the tensor that represents $2\times 2$ matrix multiplication. Our main result is an explicit construction of a sequence $\mathcal{U}_d$ of zero-one-valued tensors that is universal for the worst-case tensor exponent; more precisely, we show that $σ(\mathcal{U}_d)=σ(d)$ where $σ(d)=\sup_{T\in\mathbb{F}^d\otimes\mathbb{F}^d\otimes\mathbb{F}^d}σ(T)$. We also supply an explicit universal sequence $\mathcal{U}_Δ$ localised to capture the worst-case exponent $σ(Δ)$ of tensors with support contained in $Δ\subseteq [d]\times[d]\times [d]$; by combining such sequences, we obtain a universal sequence $\mathcal{T}_d$ such that $σ(\mathcal{T}_d)=1$ holds if and only if Strassen's asymptotic rank conjecture [Progr. Math. 120 (1994)] holds for $d$. Finally, we show that the limit $\lim_{d\rightarrow\infty}σ(d)$ exists and can be captured as $\lim_{d\rightarrow\infty} σ(D_d)$ for an explicit sequence $(D_d)_{d=1}^\infty$ of tensors obtained by diagonalisation of the sequences $\mathcal{U}_d$. As our second result we relate the absence of polynomials of fixed degree vanishing on tensors of low rank, or more generally asymptotic rank, with upper bounds on the exponent $σ(d)$. Using this technique, one may bound asymptotic rank for all tensors of a given format, knowing enough specific tensors of low asymptotic rank.
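For orientation, the exponent described in this abstract can be written out as below. This is a sketch under assumptions not stated in the listing: $\mathrm{R}(\cdot)$ denotes tensor rank, $T^{\otimes p}$ denotes the $p$-th Kronecker power, and the normalisation by $\log d$ is inferred from the stated identity $ω=2σ(\mathrm{MM}_2)$ rather than quoted from the paper.

```latex
\[
  \sigma(T) \;=\; \lim_{p\to\infty}
  \frac{\log \mathrm{R}\bigl(T^{\otimes p}\bigr)}{p\,\log d},
  \qquad T\in\mathbb{F}^d\otimes\mathbb{F}^d\otimes\mathbb{F}^d,\ T\neq 0 .
\]
% Consistency check against the abstract: for MM_2 we have d = 4 and
% R(MM_2^{\otimes p}) = 2^{\omega p + o(p)}, hence
% sigma(MM_2) = \omega \log 2 / \log 4 = \omega / 2, i.e. \omega = 2 sigma(MM_2).
```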

Submitted 9 April, 2024; originally announced April 2024.

MSC Class: 14N07; 68W30 ACM Class: I.1.2; F.2.1

arXiv:2404.04987 [pdf, other]

Chromatic number in $1.9999^n$ time? Fast deterministic set partitioning under the asymptotic rank conjecture

Authors: Andreas Björklund, Radu Curticapean, Thore Husfeldt, Petteri Kaski, Kevin Pratt

Abstract: In this paper we further explore the recently discovered connection by Björklund and Kaski [STOC 2024] and Pratt [STOC 2024] between the asymptotic rank conjecture of Strassen [Progr. Math. 1994] and the three-way partitioning problem. We show that under the asymptotic rank conjecture, the chromatic number of an $n$-vertex graph can be computed deterministically in $O(1.99982^n)$ time, thus giving a conditional answer to a question of Zamir [ICALP 2021], and questioning the optimality of the $2^n\operatorname{poly}(n)$ time algorithm for chromatic number by Björklund, Husfeldt, and Koivisto [SICOMP 2009]. Our technique is a combination of earlier algorithms for detecting $k$-colorings for small $k$ and enumerating $k$-colorable subgraphs, with an extension and derandomisation of Pratt's tensor-based algorithm for balanced three-way partitioning to the unbalanced case.

Submitted 7 April, 2024; originally announced April 2024.

arXiv:2310.11926 [pdf, ps, other]

The Asymptotic Rank Conjecture and the Set Cover Conjecture are not Both True

Authors: Andreas Björklund, Petteri Kaski

Abstract: Strassen's asymptotic rank conjecture [Progr. Math. 120 (1994)] claims a strong submultiplicative upper bound on the rank of a three-tensor obtained as an iterated Kronecker product of a constant-size base tensor. The conjecture, if true, most notably would put square matrix multiplication in quadratic time. We note here that some more-or-less unexpected algorithmic results in the area of exponential-time algorithms would also follow. Specifically, we study the so-called set cover conjecture, which states that for any $ε>0$ there exists a positive integer constant $k$ such that no algorithm solves the $k$-Set Cover problem in worst-case time $\mathcal{O}((2-ε)^n|\mathcal F|\operatorname{poly}(n))$. The $k$-Set Cover problem asks, given as input an $n$-element universe $U$, a family $\mathcal F$ of size-at-most-$k$ subsets of $U$, and a positive integer $t$, whether there is a subfamily of at most $t$ sets in $\mathcal F$ whose union is $U$. The conjecture was formulated by Cygan et al. in the monograph Parameterized Algorithms [Springer, 2015] but was implicit as a hypothesis already in Cygan et al. [CCC 2012, ACM Trans. Algorithms 2016], there conjectured to follow from the Strong Exponential Time Hypothesis. We prove that if the asymptotic rank conjecture is true, then the set cover conjecture is false. Using a reduction by Krauthgamer and Trabelsi [STACS 2019], in this scenario we would also get a $\mathcal{O}((2-δ)^n)$-time randomized algorithm for some constant $δ>0$ for another well-studied problem for which no such algorithm is known, namely that of deciding whether a given $n$-vertex directed graph has a Hamiltonian cycle.

Submitted 18 October, 2023; originally announced October 2023.

arXiv:2308.01574 [pdf, ps, other]

Another Hamiltonian Cycle in Bipartite Pfaffian Graphs

Authors: Andreas Björklund, Petteri Kaski, Jesper Nederlof

Abstract: Finding a Hamiltonian cycle in a given graph is computationally challenging, and in general remains so even when one is further given one Hamiltonian cycle in the graph and asked to find another. In fact, no significantly faster algorithms are known for finding another Hamiltonian cycle than for finding a first one even in the setting where another Hamiltonian cycle is structurally guaranteed to exist, such as for odd-degree graphs. We identify a graph class -- the bipartite Pfaffian graphs of minimum degree three -- where it is NP-complete to decide whether a given graph in the class is Hamiltonian, but when presented with a Hamiltonian cycle as part of the input, another Hamiltonian cycle can be found efficiently. We prove that Thomason's lollipop method~[Ann.~Discrete Math.,~1978], a well-known algorithm for finding another Hamiltonian cycle, runs in a linear number of steps in cubic bipartite Pfaffian graphs. This was conjectured for cubic bipartite planar graphs by Haddadan [MSc~thesis,~Waterloo,~2015]; in contrast, examples are known of both cubic bipartite graphs and cubic planar graphs where the lollipop method takes exponential time. Beyond the lollipop method, we address a slightly more general graph class and present two algorithms, one running in linear-time and one operating in logarithmic space, that take as input (i) a bipartite Pfaffian graph $G$ of minimum degree three, (ii) a Hamiltonian cycle $H$ in $G$, and (iii) an edge $e$ in $H$, and output at least three other Hamiltonian cycles through the edge $e$ in $G$. We also present further improved algorithms for finding optimal traveling salesperson tours and counting Hamiltonian cycles in bipartite planar graphs with running times that are not known to hold in general planar graphs.

Submitted 21 February, 2024; v1 submitted 3 August, 2023; originally announced August 2023.

Comments: Adds analysis of Thomason's lollipop method

arXiv:2111.02992 [pdf, other]

The Shortest Even Cycle Problem is Tractable

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski

Abstract: Given a directed graph, we show how to efficiently find a shortest (directed, simple) cycle on an even number of vertices. As far as we know, no polynomial-time algorithm was previously known for this problem. In fact, finding any even cycle in a directed graph in polynomial time was open for more than two decades until Robertson, Seymour, and Thomas (Ann. of Math. (2) 1999) and, independently, McCuaig (Electron. J. Combin. 2004; announced jointly at STOC 1997) gave an efficiently testable structural characterisation of even-cycle-free directed graphs. Methodologically, our algorithm relies on algebraic fingerprinting and randomized polynomial identity testing over a finite field, and uses a generating polynomial implicit in Vazirani and Yannakakis (Discrete Appl. Math. 1989) that enumerates weighted cycle covers as a difference of a permanent and a determinant polynomial. The need to work with the permanent is where our main technical contribution occurs. We design a family of finite commutative rings of characteristic 4 that simultaneously (i) give a nondegenerate representation for the generating polynomial identity via the permanent and the determinant, (ii) support efficient permanent computations, and (iii) enable emulation of finite-field arithmetic in characteristic 2. Here our work is foreshadowed by that of Björklund and Husfeldt (SIAM J. Comput. 2019), who used a considerably less efficient ring design to obtain a polynomial-time algorithm for the shortest two disjoint paths problem. Building on work of Gilbert and Tarjan (Numer. Math. 1978) as well as Alon and Yuster (J. ACM 2013), we also show how ideas from the nested dissection technique for solving linear equation systems lead to faster algorithm designs when we have control on the separator structure of the input graph; for example, this happens when the input has bounded genus.

Submitted 4 November, 2021; originally announced November 2021.

arXiv:2007.14092 [pdf, other]

Counting Short Vector Pairs by Inner Product and Relations to the Permanent

Authors: Andreas Björklund, Petteri Kaski

Abstract: Given as input two $n$-element sets $\mathcal A,\mathcal B\subseteq\{0,1\}^d$ with $d=c\log n\leq(\log n)^2/(\log\log n)^4$ and a target $t\in \{0,1,\ldots,d\}$, we show how to count the number of pairs $(x,y)\in \mathcal A\times \mathcal B$ with integer inner product $\langle x,y \rangle=t$ deterministically, in $n^2/2^{Ω\bigl(\!\sqrt{\log n\log \log n/(c\log^2 c)}\bigr)}$ time. This demonstrates that one can solve this problem in deterministic subquadratic time almost up to $\log^2 n$ dimensions, nearly matching the dimension bound of a subquadratic randomized detection algorithm of Alman and Williams [FOCS 2015]. We also show how to modify their randomized algorithm to count the pairs w.h.p., to obtain a fast randomized algorithm. Our deterministic algorithm builds on a novel technique of reconstructing a function from sum-aggregates by prime residues, which can be seen as an {\em additive} analog of the Chinese Remainder Theorem. As our second contribution, we relate the fine-grained complexity of the task of counting of vector pairs by inner product to the task of computing a zero-one matrix permanent over the integers.
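To make the counting task concrete, the following minimal sketch implements only the trivial quadratic baseline that the abstract's subquadratic bound improves on; it is not the paper's algorithm. The function name and the sample vectors are hypothetical and chosen for illustration.

```python
from typing import Sequence

def count_pairs_by_inner_product(
    A: Sequence[Sequence[int]], B: Sequence[Sequence[int]], t: int
) -> int:
    """Brute-force baseline: count pairs (x, y) in A x B with <x, y> == t.

    Runs in O(|A| * |B| * d) time; the inner product is taken over the integers.
    """
    count = 0
    for x in A:
        for y in B:
            if sum(xi * yi for xi, yi in zip(x, y)) == t:
                count += 1
    return count

# Hypothetical toy instance with d = 3.
A = [(1, 0, 1), (1, 1, 1)]
B = [(1, 1, 0), (0, 1, 1), (1, 1, 1)]
pairs_with_t2 = count_pairs_by_inner_product(A, B, 2)
```

On this toy instance the six inner products are 1, 1, 2, 2, 2, 3, so three pairs hit the target $t=2$.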

Submitted 28 July, 2020; originally announced July 2020.

MSC Class: 68Q25 (Primary) 05A15; 15A15 (Secondary) ACM Class: F.2.2; G.2.1

arXiv:2003.03595 [pdf, other]

The Fine-Grained Complexity of Computing the Tutte Polynomial of a Linear Matroid

Authors: Andreas Björklund, Petteri Kaski

Abstract: We show that computing the Tutte polynomial of a linear matroid of dimension $k$ on $k^{O(1)}$ points over a field of $k^{O(1)}$ elements requires $k^{Ω(k)}$ time unless the \#ETH---a counting extension of the Exponential Time Hypothesis of Impagliazzo and Paturi [CCC 1999] due to Dell {\em et al.} [ACM TALG 2014]---is false. This holds also for linear matroids that admit a representation where every point is associated to a vector with at most two nonzero coordinates. We also show that the same is true for computing the Tutte polynomial of a binary matroid of dimension $k$ on $k^{O(1)}$ points with at most three nonzero coordinates in each point's vector. This is in sharp contrast to computing the Tutte polynomial of a $k$-vertex graph (that is, the Tutte polynomial of a {\em graphic} matroid of dimension $k$---which is representable in dimension $k$ over the binary field so that every vector has two nonzero coordinates), which is known to be computable in $2^k k^{O(1)}$ time [Björklund {\em et al.}, FOCS 2008]. Our lower-bound proofs proceed via (i) a connection due to Crapo and Rota [1970] between the number of tuples of codewords of full support and the Tutte polynomial of the matroid associated with the code; (ii) an earlier-established \#ETH-hardness of counting the solutions to a bipartite $(d,2)$-CSP on $n$ vertices in $d^{o(n)}$ time; and (iii) new embeddings of such CSP instances as questions about codewords of full support in a linear code. We complement these lower bounds with two algorithm designs. The first design computes the Tutte polynomial of a linear matroid of dimension~$k$ on $k^{O(1)}$ points in $k^{O(k)}$ operations. The second design generalizes the Björklund~{\em et al.} algorithm and runs in $q^{k+1}k^{O(1)}$ time for linear matroids of dimension $k$ defined over the $q$-element field by $k^{O(1)}$ points with at most two nonzero coordinates each.

Submitted 28 July, 2020; v1 submitted 7 March, 2020; originally announced March 2020.

Comments: This version adds Theorem 4

MSC Class: 68Q25 (Primary) 05A15; 05B35 (Secondary) ACM Class: F.2.2; G.2.1; E.4

arXiv:1909.01554 [pdf, other]

Engineering Boolean Matrix Multiplication for Multiple-Accelerator Shared-Memory Architectures

Authors: Matti Karppa, Petteri Kaski

Abstract: We study the problem of multiplying two bit matrices with entries either over the Boolean algebra $(0,1,\vee,\wedge)$ or over the binary field $(0,1,+,\cdot)$. We engineer high-performance open-source algorithm implementations for contemporary multiple-accelerator shared-memory architectures, with the objective of time-and-energy-efficient scaling up to input sizes close to the available shared memory capacity. For example, given two terabinary-bit square matrices as input, our implementations compute the Boolean product in approximately 2100 seconds (1.0 Pbop/s at 3.3 pJ/bop for a total of 2.1 kWh/product) and the binary product in less than 950 seconds (2.4 effective Pbop/s at 1.5 effective pJ/bop for a total of 0.92 kWh/product) on an NVIDIA DGX-1 with power consumption at peak system power (3.5 kW). Our contributions are (a) for the binary product, we use alternative-basis techniques of Karstadt and Schwartz [SPAA '17] to design novel alternative-basis variants of Strassen's recurrence for $2\times 2$ block multiplication [Numer. Math. 13 (1969)] that have been optimized for both the number of additions and low working memory, (b) structuring the parallel block recurrences and the memory layout for coalescent and register-localized execution on accelerator hardware, (c) low-level engineering of the innermost block products for the specific target hardware, and (d) structuring the top-level shared-memory implementation to feed the accelerators with data and integrate the results for input and output sizes beyond the aggregate memory capacity of the available accelerators.
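As background for contribution (a), the sketch below shows one level of the classical Strassen recurrence specialised to the binary field, where addition and subtraction both become XOR and multiplication becomes AND. It illustrates the textbook seven-product scheme only; the alternative-basis variants and the accelerator engineering described in the abstract are not reproduced here, and the function names are hypothetical.

```python
from itertools import product

def strassen_2x2_gf2(A, B):
    """Multiply two 2x2 matrices over GF(2) using Strassen's seven products.

    Entries are 0/1 integers; '^' is addition (and subtraction) in GF(2),
    '&' is multiplication in GF(2).
    """
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B
    m1 = (a11 ^ a22) & (b11 ^ b22)
    m2 = (a21 ^ a22) & b11
    m3 = a11 & (b12 ^ b22)
    m4 = a22 & (b21 ^ b11)
    m5 = (a11 ^ a12) & b22
    m6 = (a21 ^ a11) & (b11 ^ b12)
    m7 = (a12 ^ a22) & (b21 ^ b22)
    return ((m1 ^ m4 ^ m5 ^ m7, m3 ^ m5),
            (m2 ^ m4, m1 ^ m2 ^ m3 ^ m6))

def naive_2x2_gf2(A, B):
    """Reference product over GF(2) using the usual eight multiplications."""
    return tuple(
        tuple((A[i][0] & B[0][j]) ^ (A[i][1] & B[1][j]) for j in range(2))
        for i in range(2)
    )

def all_2x2_bit_matrices():
    """Enumerate all sixteen 2x2 matrices with 0/1 entries."""
    for a, b, c, d in product((0, 1), repeat=4):
        yield ((a, b), (c, d))
```

In a recursive implementation the scalar entries would be replaced by matrix blocks, with XOR applied blockwise and AND replaced by a recursive call.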

Submitted 4 September, 2019; originally announced September 2019.

Comments: 37 pages

arXiv:1712.09630 [pdf, other]

Tensor network complexity of multilinear maps

Authors: Per Austrin, Petteri Kaski, Kaie Kubjas

Abstract: We study tensor networks as a model of arithmetic computation for evaluating multilinear maps. These capture any algorithm based on low border rank tensor decompositions, such as $O(n^{ω+ε})$ time matrix multiplication, and in addition many other algorithms such as $O(n \log n)$ time discrete Fourier transform and $O^*(2^n)$ time for computing the permanent of a matrix. However tensor networks sometimes yield faster algorithms than those that follow from low-rank decompositions. For instance the fastest known $O(n^{(ω+ε)t})$ time algorithms for counting $3t$-cliques can be implemented with tensor networks, even though the underlying tensor has border rank $n^{3t}$ for all $t \ge 2$. For counting homomorphisms of a general pattern graph $P$ into a host graph on $n$ vertices we obtain an upper bound of $O(n^{(ω+ε)\operatorname{bw}(P)/2})$ where $\operatorname{bw}(P)$ is the branchwidth of $P$. This essentially matches the bound for counting cliques, and yields small improvements over previous algorithms for many choices of $P$. While powerful, the model still has limitations, and we are able to show a number of unconditional lower bounds for various multilinear maps, including: (a) an $Ω(n^{\operatorname{bw}(P)})$ time lower bound for counting homomorphisms from $P$ to an $n$-vertex graph, matching the upper bound if $ω= 2$. In particular for $P$ a $v$-clique this yields an $Ω(n^{\lceil 2v/3 \rceil})$ time lower bound for counting $v$-cliques, and for $P$ a $k$-uniform $v$-hyperclique we obtain an $Ω(n^v)$ time lower bound for $k \ge 3$, ruling out tensor networks as an approach to obtaining non-trivial algorithms for hyperclique counting and the Max-$3$-CSP problem. (b) an $Ω(2^{0.918n})$ time lower bound for the permanent of an $n \times n$ matrix.

Submitted 15 November, 2018; v1 submitted 27 December, 2017; originally announced December 2017.

arXiv:1706.08325 [pdf, other]

An adaptive prefix-assignment technique for symmetry reduction

Authors: Tommi Junttila, Matti Karppa, Petteri Kaski, Jukka Kohonen

Abstract: This paper presents a technique for symmetry reduction that adaptively assigns a prefix of variables in a system of constraints so that the generated prefix-assignments are pairwise nonisomorphic under the action of the symmetry group of the system. The technique is based on McKay's canonical extension framework [J.~Algorithms 26 (1998), no.~2, 306--324]. Among key features of the technique are (i) adaptability---the prefix sequence can be user-prescribed and truncated for compatibility with the group of symmetries; (ii) parallelizability---prefix-assignments can be processed in parallel independently of each other; (iii) versatility---the method is applicable whenever the group of symmetries can be concisely represented as the automorphism group of a vertex-colored graph; and (iv) implementability---the method can be implemented relying on a canonical labeling map for vertex-colored graphs as the only nontrivial subroutine. To demonstrate the practical applicability of our technique, we have prepared an experimental open-source implementation of the technique and carry out a set of experiments that demonstrate ability to reduce symmetry on hard instances. Furthermore, we demonstrate that the implementation effectively parallelizes to compute clusters with multiple nodes via a message-passing interface.

Submitted 8 September, 2018; v1 submitted 26 June, 2017; originally announced June 2017.

Comments: Updated manuscript submitted for review

ACM Class: F.4.1; F.2.2; G.2.1

arXiv:1607.04002 [pdf, ps, other]

Directed Hamiltonicity and Out-Branchings via Generalized Laplacians

Authors: Andreas Björklund, Petteri Kaski, Ioannis Koutis

Abstract: We are motivated by a tantalizing open question in exact algorithms: can we detect whether an $n$-vertex directed graph $G$ has a Hamiltonian cycle in time significantly less than $2^n$? We present new randomized algorithms that improve upon several previous works: 1. We show that for any constant $0<λ<1$ and prime $p$ we can count the Hamiltonian cycles modulo $p^{\lfloor (1-λ)\frac{n}{3p}\rfloor}$ in expected time less than $c^n$ for a constant $c<2$ that depends only on $p$ and $λ$. Such an algorithm was previously known only for the case of counting modulo two [Björklund and Husfeldt, FOCS 2013]. 2. We show that we can detect a Hamiltonian cycle in $O^*(3^{n-α(G)})$ time and polynomial space, where $α(G)$ is the size of the maximum independent set in $G$. In particular, this yields an $O^*(3^{n/2})$ time algorithm for bipartite directed graphs, which is faster than the exponential-space algorithm in [Cygan et al., STOC 2013]. Our algorithms are based on the algebraic combinatorics of "incidence assignments" that we can capture through evaluation of determinants of Laplacian-like matrices, inspired by the Matrix--Tree Theorem for directed graphs. In addition to the novel algorithms for directed Hamiltonicity, we use the Matrix--Tree Theorem to derive simple algebraic algorithms for detecting out-branchings. Specifically, we give an $O^*(2^k)$-time randomized algorithm for detecting out-branchings with at least $k$ internal vertices, improving upon the algorithms of [Zehavi, ESA 2015] and [Björklund et al., ICALP 2015]. We also present an algebraic algorithm for the directed $k$-Leaf problem, based on a non-standard monomial detection problem.

Submitted 25 April, 2017; v1 submitted 14 July, 2016; originally announced July 2016.

arXiv:1606.05608 [pdf, ps, other]

Explicit correlation amplifiers for finding outlier correlations in deterministic subquadratic time

Authors: Matti Karppa, Petteri Kaski, Jukka Kohonen, Padraig Ó Catháin

Abstract: We derandomize G. Valiant's [J. ACM 62 (2015) Art. 13] subquadratic-time algorithm for finding outlier correlations in binary data. Our derandomized algorithm gives deterministic subquadratic scaling essentially for the same parameter range as Valiant's randomized algorithm, but the precise constants we save over quadratic scaling are more modest. Our main technical tool for derandomization is an explicit family of correlation amplifiers built via a family of zigzag-product expanders in Reingold, Vadhan, and Wigderson [Ann. of Math. 155 (2002) 157--187]. We say that a function $f:\{-1,1\}^d\rightarrow\{-1,1\}^D$ is a correlation amplifier with threshold $0\leqτ\leq 1$, error $γ\geq 1$, and strength $p$ an even positive integer if for all pairs of vectors $x,y\in\{-1,1\}^d$ it holds that (i) $|\langle x,y\rangle|<τd$ implies $|\langle f(x),f(y)\rangle|\leq(τγ)^pD$; and (ii) $|\langle x,y\rangle|\geqτd$ implies $\bigl(\frac{\langle x,y\rangle}{γd}\bigr)^pD \leq\langle f(x),f(y)\rangle\leq \bigl(\frac{γ\langle x,y\rangle}{d}\bigr)^pD$.
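A quick way to see that the definition is satisfiable is the full $p$-th tensor power $f(x)=x^{\otimes p}$, which has output dimension $D=d^p$ and satisfies $\langle f(x),f(y)\rangle=\langle x,y\rangle^p$, so both conditions hold with error $γ=1$ for every threshold $τ$. The sketch below checks this identity exhaustively for a small case. It illustrates the definition only and is not the explicit expander-based family from the paper; all function names are hypothetical.

```python
from itertools import product

def inner(x, y):
    """Integer inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))

def tensor_power(x, p):
    """Return x^{(tensor) p} as a flat list of length len(x) ** p.

    Each coordinate is the product x[i1] * ... * x[ip] over one index tuple,
    so a {-1, 1}-valued input gives a {-1, 1}-valued output.
    """
    out = []
    for idx in product(range(len(x)), repeat=p):
        v = 1
        for i in idx:
            v *= x[i]
        out.append(v)
    return out

def check_amplification(d, p):
    """Verify <f(x), f(y)> == <x, y> ** p for all x, y in {-1, 1}^d."""
    for x in product((-1, 1), repeat=d):
        fx = tensor_power(x, p)
        for y in product((-1, 1), repeat=d):
            if inner(fx, tensor_power(y, p)) != inner(x, y) ** p:
                return False
    return True

# Small exhaustive check: d = 3, strength p = 2, output dimension D = 9.
amplifier_ok = check_amplification(3, 2)
```

The drawback of this trivial construction is the output dimension $d^p$, which grows quickly with the strength $p$.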

Submitted 8 November, 2016; v1 submitted 17 June, 2016; originally announced June 2016.

ACM Class: F.2.1

arXiv:1605.00462 [pdf, ps, other]

Sharper Upper Bounds for Unbalanced Uniquely Decodable Code Pairs

Authors: Per Austrin, Petteri Kaski, Mikko Koivisto, Jesper Nederlof

Abstract: Two sets $A, B \subseteq \{0, 1\}^n$ form a Uniquely Decodable Code Pair (UDCP) if every pair $a \in A$, $b \in B$ yields a distinct sum $a+b$, where the addition is over $\mathbb{Z}^n$. We show that every UDCP $A, B$, with $|A| = 2^{(1-ε)n}$ and $|B| = 2^{βn}$, satisfies $β\leq 0.4228 +\sqrt{ε}$. For sufficiently small $ε$, this bound significantly improves previous bounds by Urbanke and Li~[Information Theory Workshop '98] and Ordentlich and Shayevitz~[2014, arXiv:1412.8415], which upper bound $β$ by $0.4921$ and $0.4798$, respectively, as $ε$ approaches $0$.
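The UDCP condition can be tested directly from its definition. The following minimal sketch does so by brute force, assuming $A$ and $B$ are given as collections of distinct 0/1 tuples of equal length; the function name and the toy instances are hypothetical.

```python
def is_udcp(A, B):
    """Return True iff all sums a + b (a in A, b in B) are pairwise distinct.

    Addition is componentwise over the integers, so each coordinate of a sum
    lies in {0, 1, 2}. A and B are assumed to contain no repeated vectors.
    """
    seen = set()
    for a in A:
        for b in B:
            s = tuple(ai + bi for ai, bi in zip(a, b))
            if s in seen:
                return False
            seen.add(s)
    return True

# Toy instances with n = 2.
good_pair = is_udcp([(0, 0), (1, 1)], [(0, 0), (0, 1), (1, 0)])
bad_pair = is_udcp([(0, 0), (0, 1)], [(0, 0), (0, 1)])
```

In the first instance the six sums 00, 01, 10, 11, 12, 21 are distinct; in the second, 00+01 and 01+00 collide.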

Submitted 2 May, 2016; originally announced May 2016.

Comments: 11 pages; to appear at ISIT 2016

arXiv:1602.01295 [pdf, ps, other]

How proofs are prepared at Camelot

Authors: Andreas Björklund, Petteri Kaski

Abstract: We study a design framework for robust, independently verifiable, and workload-balanced distributed algorithms working on a common input. An algorithm based on the framework is essentially a distributed encoding procedure for a Reed--Solomon code, which enables (a) robustness against byzantine failures with intrinsic error-correction and identification of failed nodes, and (b) independent randomized verification to check the entire computation for correctness, which takes essentially no more resources than each node individually contributes to the computation. The framework builds on recent Merlin--Arthur proofs of batch evaluation of Williams~[{\em Electron.\ Colloq.\ Comput.\ Complexity}, Report TR16-002, January 2016] with the observation that {\em Merlin's magic is not needed} for batch evaluation---mere Knights can prepare the proof, in parallel, and with intrinsic error-correction. The contribution of this paper is to show that in many cases the verifiable batch evaluation framework admits algorithms that match in total resource consumption the best known sequential algorithm for solving the problem. As our main result, we show that the $k$-cliques in an $n$-vertex graph can be counted {\em and} verified in per-node $O(n^{(ω+ε)k/6})$ time and space on $O(n^{(ω+ε)k/6})$ compute nodes, for any constant $ε>0$ and positive integer $k$ divisible by $6$, where $2\leqω<2.3728639$ is the exponent of matrix multiplication. This matches in total running time the best known sequential algorithm, due to Nešetřil and Poljak [{\em Comment.~Math.~Univ.~Carolin.}~26 (1985) 415--419], and considerably improves its space usage and parallelizability. Further results include novel algorithms for counting triangles in sparse graphs, computing the chromatic polynomial of a graph, and computing the Tutte polynomial of a graph.

Submitted 3 February, 2016; originally announced February 2016.

Comments: 42 pp

ACM Class: F.2.2; G.2.2; G.3; I.1.2

arXiv:1510.03895 [pdf, other]

A faster subquadratic algorithm for finding outlier correlations

Authors: Matti Karppa, Petteri Kaski, Jukka Kohonen

Abstract: We study the problem of detecting outlier pairs of strongly correlated variables among a collection of $n$ variables with otherwise weak pairwise correlations. After normalization, this task amounts to the geometric task where we are given as input a set of $n$ vectors with unit Euclidean norm and dimension $d$, and for some constants $0<τ<ρ<1$, we are asked to find all the outlier pairs of vectors whose inner product is at least $ρ$ in absolute value, subject to the promise that all but at most $q$ pairs of vectors have inner product at most $τ$ in absolute value. Improving on an algorithm of G. Valiant [FOCS 2012; J. ACM 2015], we present a randomized algorithm that for Boolean inputs ($\{-1,1\}$-valued data normalized to unit Euclidean length) runs in time \[ \tilde O\bigl(n^{\max\,\{1-γ+M(Δγ,γ),\,M(1-γ,2Δγ)\}}+qdn^{2γ}\bigr)\,, \] where $0<γ<1$ is a constant tradeoff parameter and $M(μ,ν)$ is the exponent to multiply an $\lfloor n^μ\rfloor\times\lfloor n^ν\rfloor$ matrix with an $\lfloor n^ν\rfloor\times \lfloor n^μ\rfloor$ matrix and $Δ=1/(1-\log_τρ)$. As corollaries we obtain randomized algorithms that run in time \[ \tilde O\bigl(n^{\frac{2ω}{3-\log_τρ}}+qdn^{\frac{2(1-\log_τρ)}{3-\log_τρ}}\bigr) \] and in time \[ \tilde O\bigl(n^{\frac{4}{2+α(1-\log_τρ)}}+qdn^{\frac{2α(1-\log_τρ)}{2+α(1-\log_τρ)}}\bigr)\,, \] where $2\leqω<2.38$ is the exponent for square matrix multiplication and $0.3<α\leq 1$ is the exponent for rectangular matrix multiplication. The notation $\tilde O(\cdot)$ hides polylogarithmic factors in $n$ and $d$ whose degree may depend on $ρ$ and $τ$. We present further corollaries for the light bulb problem and for learning sparse Boolean functions.

Submitted 4 January, 2018; v1 submitted 13 October, 2015; originally announced October 2015.

Comments: ACM TALG, to appear

MSC Class: 65F30; 68W20; 62H20; 68T05; 68Q32 ACM Class: F.2.1; I.1.2; G.3; H.2.8; H.3.3; I.2.6

arXiv:1508.06019 [pdf, ps, other]

Dense Subset Sum may be the hardest

Authors: Per Austrin, Mikko Koivisto, Petteri Kaski, Jesper Nederlof

Abstract: The Subset Sum problem asks whether a given set of $n$ positive integers contains a subset of elements that sum up to a given target $t$. It is an outstanding open question whether the $O^*(2^{n/2})$-time algorithm for Subset Sum by Horowitz and Sahni [J. ACM 1974] can be beaten in the worst-case setting by a "truly faster", $O^*(2^{(0.5-δ)n})$-time algorithm, with some constant $δ> 0$. Continuing an earlier work [STACS 2015], we study Subset Sum parameterized by the maximum bin size $β$, defined as the largest number of subsets of the $n$ input integers that yield the same sum. For every $ε> 0$ we give a truly faster algorithm for instances with $β\leq 2^{(0.5-ε)n}$, as well as instances with $β\geq 2^{0.661n}$. Consequently, we also obtain a characterization in terms of the popular density parameter $n/\log_2 t$: if all instances of density at least $1.003$ admit a truly faster algorithm, then so does every instance. This goes against the current intuition that instances of density 1 are the hardest, and therefore is a step toward answering the open question in the affirmative. Our results stem from novel combinations of earlier algorithms for Subset Sum and a study of an extremal question in additive combinatorics connected to the problem of Uniquely Decodable Code Pairs in information theory.

Submitted 24 August, 2015; originally announced August 2015.

Comments: 14 pages
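
Illustration for the Dense Subset Sum entry: the $O^*(2^{n/2})$ baseline that the abstract asks to beat is the classical meet-in-the-middle idea. The following is a minimal Python sketch of that idea, not the algorithm of the paper; the function names `subset_sums` and `subset_sum_mitm` are hypothetical, and a sort plus binary search stands in for the sorted-list merge of the original method.

```python
from bisect import bisect_left


def subset_sums(items):
    """Return the sums of all 2^len(items) subsets of items."""
    sums = [0]
    for x in items:
        # Every existing sum either excludes x (kept) or includes x (appended).
        sums += [s + x for s in sums]
    return sums


def subset_sum_mitm(items, target):
    """Decide Subset Sum by splitting the input into two halves.

    Enumerates about 2^(n/2) sums per half, so time and space are
    roughly 2^(n/2) up to polynomial factors.
    """
    half = len(items) // 2
    left = subset_sums(items[:half])
    right = sorted(subset_sums(items[half:]))
    for s in left:
        need = target - s
        # Binary search for a matching sum in the other half.
        i = bisect_left(right, need)
        if i < len(right) and right[i] == need:
            return True
    return False
```

For example, `subset_sum_mitm([3, 34, 4, 12, 5, 2], 9)` finds the subset {4, 5}, while the target 30 has no solution in the same set.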

arXiv:1508.03572 [pdf, other]

Fast Witness Extraction Using a Decision Oracle

Authors: Andreas Björklund, Petteri Kaski, Łukasz Kowalik

Abstract: The gist of many (NP-)hard combinatorial problems is to decide whether a universe of $n$ elements contains a witness consisting of $k$ elements that match some prescribed pattern. For some of these problems there are known advanced algebra-based FPT algorithms which solve the decision problem but do not return the witness. We investigate techniques for turning such a YES/NO-decision oracle into an algorithm for extracting a single witness, with an objective to obtain practical scalability for large values of $n$. By relying on techniques from combinatorial group testing, we demonstrate that a witness may be extracted with $O(k\log n)$ queries to either a deterministic or a randomized set inclusion oracle with one-sided probability of error. Furthermore, we demonstrate through implementation and experiments that the algebra-based FPT algorithms are practical, in particular in the setting of the $k$-path problem. Also discussed are engineering issues such as optimizing finite field arithmetic.

Submitted 14 August, 2015; originally announced August 2015.

Comments: Journal version, 16 pages. Extended abstract presented at ESA'14

ACM Class: F.2.2; G.2.2
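
Illustration for the Fast Witness Extraction entry: one simple way to reach $O(k\log n)$ queries with a deterministic set-inclusion oracle is repeated binary search over prefixes of the universe. The sketch below is an assumption-laden illustration of that general idea and is not claimed to be the procedure used in the paper; `extract_witness` and `contains_witness` are hypothetical names, and the oracle is assumed to answer exactly whether the queried set contains a witness.

```python
def extract_witness(universe, contains_witness):
    """Extract one witness using a monotone set-inclusion oracle.

    contains_witness(S) is assumed to return True iff the set S
    contains at least one witness. Returns (witness, query_count),
    with witness None when the universe contains no witness.
    """
    candidates = list(universe)
    found = []
    queries = 0

    def query(elements):
        nonlocal queries
        queries += 1
        return contains_witness(frozenset(elements))

    if not query(candidates):
        return None, queries
    # Invariant: candidates together with found contains a witness.
    while not query(found):
        # Binary search for the shortest prefix of candidates that,
        # together with found, still contains a witness.
        lo, hi = 1, len(candidates)
        while lo < hi:
            mid = (lo + hi) // 2
            if query(candidates[:mid] + found):
                hi = mid
            else:
                lo = mid + 1
        # The last element of that prefix lies in every remaining witness.
        found.append(candidates[lo - 1])
        candidates = candidates[:lo - 1]
    return set(found), queries
```

Each of the $k$ rounds costs about $\log_2 n$ oracle calls for the binary search plus one call to test whether the elements found so far already form a witness.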

arXiv:1503.04963 [pdf, other]

doi 10.1007/s00446-016-0270-2

Algebraic Methods in the Congested Clique

Authors: Keren Censor-Hillel, Petteri Kaski, Janne H. Korhonen, Christoph Lenzen, Ami Paz, Jukka Suomela

Abstract: In this work, we use algebraic methods for studying distance computation and subgraph detection tasks in the congested clique model. Specifically, we adapt parallel matrix multiplication implementations to the congested clique, obtaining an $O(n^{1-2/ω})$ round matrix multiplication algorithm, where $ω< 2.3728639$ is the exponent of matrix multiplication. In conjunction with known techniques from centralised algorithmics, this gives significant improvements over previous best upper bounds in the congested clique model. The highlight results include: -- triangle and 4-cycle counting in $O(n^{0.158})$ rounds, improving upon the $O(n^{1/3})$ triangle detection algorithm of Dolev et al. [DISC 2012], -- a $(1 + o(1))$-approximation of all-pairs shortest paths in $O(n^{0.158})$ rounds, improving upon the $\tilde{O} (n^{1/2})$-round $(2 + o(1))$-approximation algorithm of Nanongkai [STOC 2014], and -- computing the girth in $O(n^{0.158})$ rounds, which is the first non-trivial solution in this model. In addition, we present a novel constant-round combinatorial algorithm for detecting 4-cycles.

Submitted 17 March, 2015; originally announced March 2015.

Comments: This work is a merger of arXiv:1412.2109 and arXiv:1412.2667

arXiv:1412.2109

Algebrisation in Distributed Graph Algorithms: Fast Matrix Multiplication in the Congested Clique

Authors: Petteri Kaski, Janne H. Korhonen, Christoph Lenzen, Jukka Suomela

Abstract: While algebrisation constitutes a powerful technique in the design and analysis of centralised algorithms, to date there have been hardly any applications of algebraic techniques in the context of distributed graph algorithms. This work is a case study that demonstrates the potential of algebrisation in the distributed context. We will focus on distributed graph algorithms in the congested clique model; the graph problems that we will consider include, e.g., the triangle detection problem and the all-pairs shortest path problem (APSP). There is plenty of prior work on combinatorial algorithms in the congested clique model: for example, Dolev et al. (DISC 2012) gave an algorithm for triangle detection with a running time of $\tilde O(n^{1/3})$, and Nanongkai (STOC 2014) gave an approximation algorithm for APSP with a running time of $\tilde O(n^{1/2})$. In this work, we will use algebraic techniques -- in particular, algorithms based on fast matrix multiplication -- to solve both triangle detection and the unweighted APSP in time $O(n^{0.15715})$; for weighted APSP, we give a $(1+o(1))$-approximation with this running time, as well as an exact $\tilde O(n^{1/3})$ solution.

Submitted 18 March, 2015; v1 submitted 5 December, 2014; originally announced December 2014.

Comments: This paper has been withdrawn by the authors. This paper has been superseded by arXiv:1503.04963 (merged from arXiv:1412.2109 and arXiv:1412.2667)

arXiv:1306.4111 [pdf, other]

Counting thin subgraphs via packings faster than meet-in-the-middle time

Authors: Andreas Björklund, Petteri Kaski, Łukasz Kowalik

Abstract: Vassilevska and Williams (STOC 2009) showed how to count simple paths on $k$ vertices and matchings on $k/2$ edges in an $n$-vertex graph in time $n^{k/2+O(1)}$. In the same year, two different algorithms with the same runtime were given by Koutis and Williams (ICALP 2009), and Björklund \emph{et al.} (ESA 2009), via $n^{st/2+O(1)}$-time algorithms for counting $t$-tuples of pairwise disjoint sets drawn from a given family of $s$-sized subsets of an $n$-element universe. Shortly afterwards, Alon and Gutner (TALG 2010) showed that these problems have $Ω(n^{\lfloor st/2\rfloor})$ and $Ω(n^{\lfloor k/2\rfloor})$ lower bounds when counting by color coding. Here we show that one can do better, namely, we show that the "meet-in-the-middle" exponent $st/2$ can be beaten and give an algorithm that counts in time $n^{0.45470382 st + O(1)}$ for $t$ a multiple of three. This implies algorithms for counting occurrences of a fixed subgraph on $k$ vertices and pathwidth $p\ll k$ in an $n$-vertex graph in $n^{0.45470382k+2p+O(1)}$ time, improving on the three mentioned algorithms for paths and matchings, and circumventing the color-coding lower bound. We also give improved bounds for counting $t$-tuples of disjoint $s$-sets for $s=2,3,4$. Our algorithms use fast matrix multiplication. We show an argument that this is necessary to go below the meet-in-the-middle barrier.

Submitted 14 August, 2015; v1 submitted 18 June, 2013; originally announced June 2013.

Comments: Journal version, 26 pages. Compared to the SODA'14 version, it contains some new results: a) improved algorithms for counting t-tuples of disjoint s-sets for the special cases of s = 2, 3, 4 and b) new hardness arguments

ACM Class: G.2.2; F.2.2

arXiv:1304.0513 [pdf, other]

Separating OR, SUM, and XOR Circuits

Authors: Magnus Find, Mika Göös, Matti Järvisalo, Petteri Kaski, Mikko Koivisto, Janne H. Korhonen

Abstract: Given a boolean n by n matrix A we consider arithmetic circuits for computing the transformation x->Ax over different semirings. Namely, we study three circuit models: monotone OR-circuits, monotone SUM-circuits (addition of non-negative integers), and non-monotone XOR-circuits (addition modulo 2). Our focus is on \emph{separating} these models in terms of their circuit complexities. We give three results towards this goal: (1) We prove a direct sum type theorem on the monotone complexity of tensor product matrices. As a corollary, we obtain matrices that admit OR-circuits of size O(n), but require SUM-circuits of size Ω(n^{3/2}/\log^2n). (2) We construct so-called \emph{k-uniform} matrices that admit XOR-circuits of size O(n), but require OR-circuits of size Ω(n^2/\log^2n). (3) We consider the task of \emph{rewriting} a given OR-circuit as an XOR-circuit and prove that any subquadratic-time algorithm for this task violates the strong exponential time hypothesis.

Submitted 22 April, 2013; v1 submitted 1 April, 2013; originally announced April 2013.

Comments: 1 + 16 pages, 2 figures. In this version we have improved the presentation following comments made by Stasys Jukna and Igor Sergeev

arXiv:1303.0609 [pdf, other]

Space--Time Tradeoffs for Subset Sum: An Improved Worst Case Algorithm

Authors: Per Austrin, Petteri Kaski, Mikko Koivisto, Jussi Määttä

Abstract: The technique of Schroeppel and Shamir (SICOMP, 1981) has long been the most efficient way to trade space against time for the SUBSET SUM problem. In the random-instance setting, however, improved tradeoffs exist. In particular, the recently discovered dissection method of Dinur et al. (CRYPTO 2012) yields a significantly improved space--time tradeoff curve for instances with strong randomness properties. Our main result is that these strong randomness assumptions can be removed, obtaining the same space--time tradeoffs in the worst case. We also show that for small space usage the dissection algorithm can be almost fully parallelized. Our strategy for dealing with arbitrary instances is to instead inject the randomness into the dissection process itself by working over a carefully selected but random composite modulus, and to introduce explicit space--time controls into the algorithm by means of a "bailout mechanism".

Submitted 4 March, 2013; originally announced March 2013.

ACM Class: F.2.1; F.2.3; G.2.1; G.3

arXiv:1209.1082 [pdf, other]

Constrained Multilinear Detection and Generalized Graph Motifs

Authors: Andreas Björklund, Petteri Kaski, Łukasz Kowalik

Abstract: We introduce a new algebraic sieving technique to detect constrained multilinear monomials in multivariate polynomial generating functions given by an evaluation oracle. As applications of the technique, we show an $O^*(2^k)$-time polynomial space algorithm for the $k$-sized Graph Motif problem. We also introduce a new optimization variant of the problem, called Closest Graph Motif and solve it within the same time bound. The Closest Graph Motif problem encompasses several previously studied optimization variants, like Maximum Graph Motif, Min-Substitute Graph Motif, and Min-Add Graph Motif. Finally, we provide a piece of evidence that our result might be essentially tight: the existence of an $O^*((2-ε)^k)$-time algorithm for the Graph Motif problem implies an $O((2-ε')^n)$-time algorithm for Set Cover.

Submitted 14 May, 2013; v1 submitted 5 September, 2012; originally announced September 2012.

Comments: Journal submission, 18 pages. The conference version (see http://drops.dagstuhl.de/opus/volltexte/2013/3919/pdf/7.pdf) was presented at STACS 2013

arXiv:1208.0554 [pdf, other]

Fast Monotone Summation over Disjoint Sets

Authors: Petteri Kaski, Mikko Koivisto, Janne H. Korhonen

Abstract: We study the problem of computing an ensemble of multiple sums where the summands in each sum are indexed by subsets of size $p$ of an $n$-element ground set. More precisely, the task is to compute, for each subset of size $q$ of the ground set, the sum over the values of all subsets of size $p$ that are disjoint from the subset of size $q$. We present an arithmetic circuit that, without subtraction, solves the problem using $O((n^p+n^q)\log n)$ arithmetic gates, all monotone; for constant $p$, $q$ this is within the factor $\log n$ of the optimal. The circuit design is based on viewing the summation as a "set nucleation" task and using a tree-projection approach to implement the nucleation. Applications include improved algorithms for counting heaviest $k$-paths in a weighted graph, computing permanents of rectangular matrices, and dynamic feature selection in machine learning.

Submitted 2 August, 2012; originally announced August 2012.

arXiv:1203.4063 [pdf, ps, other]

Homomorphic Hashing for Sparse Coefficient Extraction

Authors: Petteri Kaski, Mikko Koivisto, Jesper Nederlof

Abstract: We study classes of Dynamic Programming (DP) algorithms which, due to their algebraic definitions, are closely related to coefficient extraction methods. DP algorithms can easily be modified to exploit sparseness in the DP table through memoization. Coefficient extraction techniques on the other hand are both space-efficient and parallelisable, but no tools have been available to exploit sparseness. We investigate the systematic use of homomorphic hash functions to combine the best of these methods and obtain improved space-efficient algorithms for problems including LINEAR SAT, SET PARTITION, and SUBSET SUM. Our algorithms run in time proportional to the number of nonzero entries of the last segment of the DP table, which presents a strict improvement over sparse DP. The last property also gives an improved algorithm for CNF SAT with sparse projections.

Submitted 19 March, 2012; originally announced March 2012.

arXiv:1007.1161 [pdf, ps, other]

Narrow sieves for parameterized paths and packings

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: We present randomized algorithms for some well-studied, hard combinatorial problems: the k-path problem, the p-packing of q-sets problem, and the q-dimensional p-matching problem. Our algorithms solve these problems with high probability in time exponential only in the parameter (k, p, q) and using polynomial space; the constant bases of the exponentials are significantly smaller than in previous works. For example, for the k-path problem the improvement is from 2 to 1.66. We also show how to detect if a d-regular graph admits an edge coloring with $d$ colors in time within a polynomial factor of O(2^{(d-1)n/2}). Our techniques build upon and generalize some recently published ideas by I. Koutis (ICALP 2009), R. Williams (IPL 2009), and A. Björklund (STACS 2010, FOCS 2010).

Submitted 7 July, 2010; originally announced July 2010.

arXiv:0904.3251 [pdf, other]

On evaluation of permanents

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: We study the time and space complexity of matrix permanents over rings and semirings.

Submitted 21 April, 2009; originally announced April 2009.

arXiv:0904.3093 [pdf, other]

Counting Paths and Packings in Halves

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: It is shown that one can count $k$-edge paths in an $n$-vertex graph and $m$-set $k$-packings on an $n$-element universe, respectively, in time ${n \choose k/2}$ and ${n \choose mk/2}$, up to a factor polynomial in $n$, $k$, and $m$; in polynomial space, the bounds hold if multiplied by $3^{k/2}$ or $5^{mk/2}$, respectively. These are implications of a more general result: given two set families on an $n$-element universe, one can count the disjoint pairs of sets in the Cartesian product of the two families with $O(n \ell)$ basic operations, where $\ell$ is the number of members in the two families and their subsets.

Submitted 20 April, 2009; originally announced April 2009.

arXiv:0812.4893 [pdf, other]

doi 10.1007/s00453-009-9353-9

Almost stable matchings in constant time

Authors: Patrik Floréen, Petteri Kaski, Valentin Polishchuk, Jukka Suomela

Abstract: We show that the ratio of matched individuals to blocking pairs grows linearly with the number of propose--accept rounds executed by the Gale--Shapley algorithm for the stable marriage problem. Consequently, the participants can arrive at an almost stable matching even without full information about the problem instance; for each participant, knowing only its local neighbourhood is enough. In distributed-systems parlance, this means that if each person has only a constant number of acceptable partners, an almost stable matching emerges after a constant number of synchronous communication rounds. This holds even if ties are present in the preference lists. We apply our results to give a distributed $(2+ε)$-approximation algorithm for maximum-weight matching in bicoloured graphs and a centralised randomised constant-time approximation scheme for estimating the size of a stable matching.

Submitted 29 December, 2008; originally announced December 2008.

Comments: 20 pages

Journal ref: Algorithmica 58 (2010) 102-118

arXiv:0809.2489 [pdf, other]

The fast intersection transform with applications to counting paths

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: We present an algorithm for evaluating a linear "intersection transform" of a function defined on the lattice of subsets of an $n$-element set. In particular, the algorithm constructs an arithmetic circuit for evaluating the transform in "down-closure time" relative to the support of the function and the evaluation domain. As an application, we develop an algorithm that, given as input a digraph with $n$ vertices and bounded integer weights at the edges, counts paths by weight and given length $0\leq\ell\leq n-1$ in time $O^*(\exp(n\cdot H(\ell/(2n))))$, where $H(p)=-p\log p-(1-p)\log(1-p)$, and the notation $O^*(\cdot)$ suppresses a factor polynomial in $n$.

Submitted 15 September, 2008; originally announced September 2008.

Comments: 11 pages

arXiv:0809.1489 [pdf, ps, other]

doi 10.1145/1583991.1584058

An optimal local approximation algorithm for max-min linear programs

Authors: Patrik Floréen, Joel Kaasinen, Petteri Kaski, Jukka Suomela

Abstract: We present a local algorithm (constant-time distributed algorithm) for approximating max-min LPs. The objective is to maximise $ω$ subject to $Ax \le 1$, $Cx \ge ω1$, and $x \ge 0$ for nonnegative matrices $A$ and $C$. The approximation ratio of our algorithm is the best possible for any local algorithm; there is a matching unconditional lower bound.

Submitted 9 September, 2008; originally announced September 2008.

Comments: 16 pages, 3 figures

arXiv:0806.0282 [pdf, ps, other]

Local approximation algorithms for a class of 0/1 max-min linear programs

Authors: Patrik Floréen, Marja Hassinen, Petteri Kaski, Jukka Suomela

Abstract: We study the applicability of distributed, local algorithms to 0/1 max-min LPs where the objective is to maximise ${\min_k \sum_v c_{kv} x_v}$ subject to ${\sum_v a_{iv} x_v \le 1}$ for each $i$ and ${x_v \ge 0}$ for each $v$. Here $c_{kv} \in \{0,1\}$, $a_{iv} \in \{0,1\}$, and the support sets ${V_i = \{v : a_{iv} > 0 \}}$ and ${V_k = \{v : c_{kv}>0 \}}$ have bounded size; in particular, we study the case $|V_k| \le 2$. Each agent $v$ is responsible for choosing the value of $x_v$ based on information within its constant-size neighbourhood; the communication network is the hypergraph where the sets $V_k$ and $V_i$ constitute the hyperedges. We present a local approximation algorithm which achieves an approximation ratio arbitrarily close to the theoretical lower bound presented in prior work.

Submitted 2 June, 2008; originally announced June 2008.

Comments: 7 pages, 3 figures

arXiv:0804.4815 [pdf, ps, other]

doi 10.1007/978-3-540-92862-1_2

Tight local approximation results for max-min linear programs

Authors: Patrik Floréen, Marja Hassinen, Petteri Kaski, Jukka Suomela

Abstract: In a bipartite max-min LP, we are given a bipartite graph $\mathcal{G} = (V \cup I \cup K, E)$, where each agent $v \in V$ is adjacent to exactly one constraint $i \in I$ and exactly one objective $k \in K$. Each agent $v$ controls a variable $x_v$. For each $i \in I$ we have a nonnegative linear constraint on the variables of adjacent agents. For each $k \in K$ we have a nonnegative linear objective function of the variables of adjacent agents. The task is to maximise the minimum of the objective functions. We study local algorithms where each agent $v$ must choose $x_v$ based on input within its constant-radius neighbourhood in $\mathcal{G}$. We show that for every $ε>0$ there exists a local algorithm achieving the approximation ratio ${Δ_I (1 - 1/Δ_K)} + ε$. We also show that this result is the best possible -- no local algorithm can achieve the approximation ratio ${Δ_I (1 - 1/Δ_K)}$. Here $Δ_I$ is the maximum degree of a vertex $i \in I$, and $Δ_K$ is the maximum degree of a vertex $k \in K$. As a methodological contribution, we introduce the technique of graph unfolding for the design of local approximation algorithms.

Submitted 30 April, 2008; originally announced April 2008.

Comments: 16 pages

arXiv:0802.2834 [pdf, ps, other]

Trimmed Moebius Inversion and Graphs of Bounded Degree

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: We study ways to expedite Yates's algorithm for computing the zeta and Moebius transforms of a function defined on the subset lattice. We develop a trimmed variant of Moebius inversion that proceeds point by point, finishing the calculation at a subset before considering its supersets. For an $n$-element universe $U$ and a family $\mathscr{F}$ of its subsets, trimmed Moebius inversion allows us to compute the number of packings, coverings, and partitions of $U$ with $k$ sets from $\mathscr{F}$ in time within a polynomial factor (in $n$) of the number of supersets of the members of $\mathscr{F}$. Relying on an intersection theorem of Chung et al. (1986) to bound the sizes of set families, we apply these ideas to well-studied combinatorial optimisation problems on graphs of maximum degree $Δ$. In particular, we show how to compute the Domatic Number in time within a polynomial factor of $(2^{Δ+1}-2)^{n/(Δ+1)}$ and the Chromatic Number in time within a polynomial factor of $(2^{Δ+1}-Δ-1)^{n/(Δ+1)}$. For any constant $Δ$, these bounds are $O\bigl((2-ε)^n\bigr)$ for $ε>0$ independent of the number of vertices $n$.

Submitted 20 February, 2008; originally announced February 2008.

Journal ref: In Proceedings of the 25th Annual Symposium on the Theoretical Aspects of Computer Science - STACS 2008, Bordeaux, France (2008)
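
Illustration for the Trimmed Moebius Inversion entry: the untrimmed starting point named in the abstract, Yates's algorithm for the zeta and Moebius transforms on the subset lattice, can be sketched in Python as follows. This is only the standard $O(2^n n)$ baseline, not the trimmed variant of the paper; subsets are assumed to be encoded as bitmasks, and the function names are hypothetical.

```python
def zeta_transform(f, n):
    """Return g with g[S] = sum of f[T] over all subsets T of S.

    f is a list of length 2**n indexed by subset bitmasks.
    """
    g = list(f)
    for j in range(n):
        bit = 1 << j
        for s in range(1 << n):
            if s & bit:
                # Fold in the contribution of the subset without element j.
                g[s] += g[s ^ bit]
    return g


def moebius_transform(g, n):
    """Invert zeta_transform: recover f from g, one element at a time."""
    f = list(g)
    for j in range(n):
        bit = 1 << j
        for s in range(1 << n):
            if s & bit:
                f[s] -= f[s ^ bit]
    return f
```

Both routines process one element of the universe per outer iteration, so each performs $n \cdot 2^{n-1}$ additions or subtractions; applying `moebius_transform` to the output of `zeta_transform` returns the original function.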

arXiv:0711.4902 [pdf, ps, other]

doi 10.1073/pnas.0712263105

Circumspect descent prevails in solving random constraint satisfaction problems

Authors: Mikko Alava, John Ardelius, Erik Aurell, Petteri Kaski, Supriya Krishnamurthy, Pekka Orponen, Sakari Seitz

Abstract: We study the performance of stochastic local search algorithms for random instances of the $K$-satisfiability ($K$-SAT) problem. We introduce a new stochastic local search algorithm, ChainSAT, which moves in the energy landscape of a problem instance by {\em never going upwards} in energy. ChainSAT is a \emph{focused} algorithm in the sense that it considers only variables occurring in unsatisfied clauses. We show by extensive numerical investigations that ChainSAT and other focused algorithms solve large $K$-SAT instances almost surely in linear time, up to high clause-to-variable ratios $α$; for example, for K=4 we observe linear-time performance well beyond the recently postulated clustering and condensation transitions in the solution space. The performance of ChainSAT is a surprise given that by design the algorithm gets trapped into the first local energy minimum it encounters, yet no such minima are encountered. We also study the geometry of the solution space as accessed by stochastic local search algorithms.

Submitted 30 November, 2007; originally announced November 2007.

Comments: 6 figures, about 17 pages

arXiv:0711.2585 [pdf, other]

Computing the Tutte polynomial in vertex-exponential time

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: The deletion--contraction algorithm is perhaps the most popular method for computing a host of fundamental graph invariants such as the chromatic, flow, and reliability polynomials in graph theory, the Jones polynomial of an alternating link in knot theory, and the partition functions of the models of Ising, Potts, and Fortuin--Kasteleyn in statistical physics. Prior to this work, deletion--contraction was also the fastest known general-purpose algorithm for these invariants, running in time roughly proportional to the number of spanning trees in the input graph. Here, we give a substantially faster algorithm that computes the Tutte polynomial--and hence, all the aforementioned invariants and more--of an arbitrary graph in time within a polynomial factor of the number of connected vertex sets. The algorithm actually evaluates a multivariate generalization of the Tutte polynomial by making use of an identity due to Fortuin and Kasteleyn. We also provide a polynomial-space variant of the algorithm and give an analogous result for Chung and Graham's cover polynomial. An implementation of the algorithm outperforms deletion--contraction also in practice.

Submitted 14 April, 2008; v1 submitted 16 November, 2007; originally announced November 2007.

ACM Class: F.2.2; G.2.1; G.2.2
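
As a point of reference for the baseline mentioned in the abstract, classical deletion--contraction can be sketched for one of the listed invariants, the chromatic polynomial, using the recurrence P(G, q) = P(G - e, q) - P(G / e, q). This is a minimal, unoptimised illustration of the baseline only, not the vertex-exponential algorithm of the paper; the function name `chromatic_count` and the edge-list input format are assumptions made for this example.

```python
def chromatic_count(num_vertices, edges, q):
    """Number of proper q-colourings of a graph, by deletion-contraction.

    `edges` is an iterable of 2-element vertex pairs (hypothetical input
    format); loops are ignored and parallel edges are merged, which does
    not change the count when the input is a simple graph.
    """
    edge_set = {frozenset(e) for e in edges if len(set(e)) == 2}
    if not edge_set:
        # No edges left: every vertex can take any of the q colours.
        return q ** num_vertices

    e = next(iter(edge_set))
    u, v = tuple(e)

    # Deletion: remove the edge e.
    deleted = edge_set - {e}

    # Contraction: merge v into u, dropping loops and duplicate edges.
    contracted = set()
    for other in deleted:
        merged = frozenset(u if w == v else w for w in other)
        if len(merged) == 2:
            contracted.add(merged)

    return (chromatic_count(num_vertices, deleted, q)
            - chromatic_count(num_vertices - 1, contracted, q))
```

For example, `chromatic_count(3, [(0, 1), (1, 2), (0, 2)], 3)` counts the proper 3-colourings of a triangle. The recursion branches twice per edge, which is why its running time grows so quickly on dense graphs.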

arXiv:0710.1499 [pdf, other]

doi 10.1109/IPDPS.2008.4536235

Approximating max-min linear programs with local algorithms

Authors: Patrik Floréen, Petteri Kaski, Topi Musto, Jukka Suomela

Abstract: A local algorithm is a distributed algorithm where each node must operate solely based on the information that was available at system startup within a constant-size neighbourhood of the node. We study the applicability of local algorithms to max-min LPs where the objective is to maximise $\min_k \sum_v c_{kv} x_v$ subject to $\sum_v a_{iv} x_v \le 1$ for each $i$ and $x_v \ge 0$ for each $v$. Here $c_{kv} \ge 0$, $a_{iv} \ge 0$, and the support sets $V_i = \{v : a_{iv} > 0 \}$, $V_k = \{v : c_{kv}>0 \}$, $I_v = \{i : a_{iv} > 0 \}$ and $K_v = \{k : c_{kv} > 0 \}$ have bounded size. In the distributed setting, each agent $v$ is responsible for choosing the value of $x_v$, and the communication network is a hypergraph $\mathcal{H}$ where the sets $V_k$ and $V_i$ constitute the hyperedges. We present inapproximability results for a wide range of structural assumptions; for example, even if $|V_i|$ and $|V_k|$ are bounded by some constants larger than 2, there is no local approximation scheme. To contrast the negative results, we present a local approximation algorithm which achieves good approximation ratios if we can bound the relative growth of the vertex neighbourhoods in $\mathcal{H}$.

Submitted 8 October, 2007; originally announced October 2007.

Comments: 16 pages, 2 figures

arXiv:cs/0611103 [pdf, ps, other]

Barriers and local minima in energy landscapes of stochastic local search

Authors: Petteri Kaski

Abstract: A local search algorithm operating on an instance of a Boolean constraint satisfaction problem (in particular, k-SAT) can be viewed as a stochastic process traversing successive adjacent states in an ``energy landscape'' defined by the problem instance on the n-dimensional Boolean hypercube. We investigate analytically the worst-case topography of such landscapes in the context of satisfiable k-SAT via a random ensemble of satisfiable ``k-regular'' linear equations modulo 2. We show that for each fixed k=3,4,..., the typical k-SAT energy landscape induced by an instance drawn from the ensemble has a set of 2^{Ω(n)} local energy minima, each separated by an unconditional Ω(n) energy barrier from each of the O(1) ground states, that is, solution states with zero energy. The main technical aspect of the analysis is that a random k-regular 0/1 matrix constitutes a strong boundary expander with almost full GF(2)-linear rank, a property which also enables us to prove a 2^{Ω(n)} lower bound for the expected number of steps required by the focused random walk heuristic to solve typical instances drawn from the ensemble. These results paint a grim picture of the worst-case topography of k-SAT for local search, and constitute apparently the first rigorous analysis of the growth of energy barriers in a random ensemble of k-SAT landscapes as the number of variables n is increased.

Submitted 21 November, 2006; originally announced November 2006.

ACM Class: F.2.2; G.2.1; G.3; I.2.8

arXiv:cs/0611101 [pdf, ps, other]

Fourier meets Möbius: fast subset convolution

Authors: Andreas Björklund, Thore Husfeldt, Petteri Kaski, Mikko Koivisto

Abstract: We present a fast algorithm for the subset convolution problem: given functions f and g defined on the lattice of subsets of an n-element set N, compute their subset convolution f*g, defined for all S\subseteq N by (f * g)(S) = \sum_{T \subseteq S}f(T) g(S\setminus T), where addition and multiplication are carried out in an arbitrary ring. Via Möbius transform and inversion, our algorithm evaluates the subset convolution in O(n^2 2^n) additions and multiplications, substantially improving upon the straightforward O(3^n) algorithm. Specifically, if the input functions have an integer range {-M,-M+1,...,M}, their subset convolution over the ordinary sum-product ring can be computed in O^*(2^n log M) time; the notation O^* suppresses polylogarithmic factors. Furthermore, using a standard embedding technique we can compute the subset convolution over the max-sum or min-sum semiring in O^*(2^n M) time. To demonstrate the applicability of fast subset convolution, we present the first O^*(2^k n^2 + n m) algorithm for the minimum Steiner tree problem in graphs with n vertices, k terminals, and m edges with bounded integer weights, improving upon the O^*(3^k n + 2^k n^2 + n m) time bound of the classical Dreyfus-Wagner algorithm. We also discuss extensions to recent O^*(2^n)-time algorithms for covering and partitioning problems (Björklund and Husfeldt, FOCS 2006; Koivisto, FOCS 2006).

Submitted 21 November, 2006; originally announced November 2006.

ACM Class: F.2.1; F.2.2; G.2.1; G.2.2
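
The definition (f * g)(S) = \sum_{T \subseteq S} f(T) g(S\setminus T) and the O(n^2 2^n) bound can be illustrated with the following sketch. It follows the commonly described "ranked" approach (split each function by subset size, zeta-transform every rank, multiply pointwise across ranks, then invert); whether it matches the paper's presentation in every detail is an assumption. The bitmask encoding and the names `subset_convolution` and `subset_convolution_naive` are hypothetical, chosen for this example, and integer arithmetic stands in for the arbitrary ring.

```python
def _zeta_in_place(a, n):
    # a[S] becomes the sum of a[T] over all T subset of S.
    for j in range(n):
        bit = 1 << j
        for S in range(1 << n):
            if S & bit:
                a[S] += a[S ^ bit]


def _moebius_in_place(a, n):
    # Inverse of _zeta_in_place.
    for j in range(n):
        bit = 1 << j
        for S in range(1 << n):
            if S & bit:
                a[S] -= a[S ^ bit]


def subset_convolution(f, g, n):
    """Subset convolution of tables f, g indexed by bitmasks, in O(n^2 2^n) ring operations."""
    size = 1 << n
    popcount = [bin(S).count("1") for S in range(size)]

    # Ranked tables: rank k keeps only the values at subsets of size k.
    fr = [[f[S] if popcount[S] == k else 0 for S in range(size)] for k in range(n + 1)]
    gr = [[g[S] if popcount[S] == k else 0 for S in range(size)] for k in range(n + 1)]
    for k in range(n + 1):
        _zeta_in_place(fr[k], n)
        _zeta_in_place(gr[k], n)

    # Pointwise products, convolved over the rank index.
    hr = [[0] * size for _ in range(n + 1)]
    for k in range(n + 1):
        for i in range(k + 1):
            a, b, row = fr[i], gr[k - i], hr[k]
            for S in range(size):
                row[S] += a[S] * b[S]

    for k in range(n + 1):
        _moebius_in_place(hr[k], n)

    # Only rank |S| at position S corresponds to disjoint pairs (T, S \ T).
    return [hr[popcount[S]][S] for S in range(size)]


def subset_convolution_naive(f, g, n):
    """Direct O(3^n) evaluation of the definition, for cross-checking."""
    out = []
    for S in range(1 << n):
        total, T = 0, S
        while True:
            total += f[T] * g[S ^ T]
            if T == 0:
                break
            T = (T - 1) & S   # next smaller submask of S
        out.append(total)
    return out
```

Comparing the two functions on small inputs is a convenient sanity check; the ranked version trades the 3^n enumeration of (T, S \ T) pairs for n + 1 transforms per input plus a quadratic-in-n combination step.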
