Search | arXiv e-print repository

Dynamic angular synchronization under smoothness constraints

Authors: Ernesto Araya, Mihai Cucuringu, Hemant Tyagi

Abstract: Given an undirected measurement graph $\mathcal{H} = ([n], \mathcal{E})$, the classical angular synchronization problem consists of recovering unknown angles $θ_1^*,\dots,θ_n^*$ from a collection of noisy pairwise measurements of the form $(θ_i^* - θ_j^*) \mod 2π$, for all $\{i,j\} \in \mathcal{E}$. This problem arises in a variety of applications, including computer vision, time synchronization o… ▽ More Given an undirected measurement graph $\mathcal{H} = ([n], \mathcal{E})$, the classical angular synchronization problem consists of recovering unknown angles $θ_1^*,\dots,θ_n^*$ from a collection of noisy pairwise measurements of the form $(θ_i^* - θ_j^*) \mod 2π$, for all $\{i,j\} \in \mathcal{E}$. This problem arises in a variety of applications, including computer vision, time synchronization of distributed networks, and ranking from pairwise comparisons. In this paper, we consider a dynamic version of this problem where the angles, and also the measurement graphs evolve over $T$ time points. Assuming a smoothness condition on the evolution of the latent angles, we derive three algorithms for joint estimation of the angles over all time points. Moreover, for one of the algorithms, we establish non-asymptotic recovery guarantees for the mean-squared error (MSE) under different statistical models. In particular, we show that the MSE converges to zero as $T$ increases under milder conditions than in the static setting. This includes the setting where the measurement graphs are highly sparse and disconnected, and also when the measurement noise is large and can potentially increase with $T$. We complement our theoretical results with experiments on synthetic data. △ Less

Submitted 6 June, 2024; originally announced June 2024.

Comments: 40 pages, 9 figures

arXiv:2403.13230 [pdf, other]

BFT-PoLoc: A Byzantine Fortified Trigonometric Proof of Location Protocol using Internet Delays

Authors: Peiyao Sheng, Vishal Sevani, Ranvir Rana, Himanshu Tyagi, Pramod Viswanath

Abstract: Internet platforms depend on accurately determining the geographical locations of online users to deliver targeted services (e.g., advertising). The advent of decentralized platforms (blockchains) emphasizes the importance of geographically distributed nodes, making the validation of locations more crucial. In these decentralized settings, mutually non-trusting participants need to {\em prove} the… ▽ More Internet platforms depend on accurately determining the geographical locations of online users to deliver targeted services (e.g., advertising). The advent of decentralized platforms (blockchains) emphasizes the importance of geographically distributed nodes, making the validation of locations more crucial. In these decentralized settings, mutually non-trusting participants need to {\em prove} their locations to each other. The incentives for claiming desired location include decentralization properties (validators of a blockchain), explicit rewards for improving coverage (physical infrastructure blockchains) and regulatory compliance -- and entice participants towards prevaricating their true location malicious via VPNs, tampering with internet delays, or compromising other parties (challengers) to misrepresent their location. Traditional delay-based geolocation methods focus on reducing the noise in measurements and are very vulnerable to wilful divergences from prescribed protocol. In this paper we use Internet delay measurements to securely prove the location of IP addresses while being immune to a large fraction of Byzantine actions. Our core methods are to endow Internet telemetry tools (e.g., ping) with cryptographic primitives (signatures and hash functions) together with Byzantine resistant data inferences subject to Euclidean geometric constraints. We introduce two new networking protocols, robust against Byzantine actions: Proof of Internet Geometry (PoIG) converts delay measurements into precise distance estimates across the Internet; Proof of Location (PoLoc) enables accurate and efficient multilateration of a specific IP address. The key algorithmic innovations are in conducting ``Byzantine fortified trigonometry" (BFT) inferences of data, endowing low rank matrix completion methods with Byzantine resistance. △ Less

Submitted 28 March, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2402.07241 [pdf, other]

Proof of Diligence: Cryptoeconomic Security for Rollups

Authors: Peiyao Sheng, Ranvir Rana, Senthil Bala, Himanshu Tyagi, Pramod Viswanath

Abstract: Layer 1 (L1) blockchains such as Ethereum are secured under an "honest supermajority of stake" assumption for a large pool of validators who verify each and every transaction on it. This high security comes at a scalability cost which not only effects the throughput of the blockchain but also results in high gas fees for executing transactions on chain. The most successful solution for this proble… ▽ More Layer 1 (L1) blockchains such as Ethereum are secured under an "honest supermajority of stake" assumption for a large pool of validators who verify each and every transaction on it. This high security comes at a scalability cost which not only effects the throughput of the blockchain but also results in high gas fees for executing transactions on chain. The most successful solution for this problem is provided by optimistic rollups, Layer 2 (L2) blockchains that execute transactions outside L1 but post the transaction data on L1. The security for such L2 chains is argued, informally, under the assumption that a set of nodes will check the transaction data posted on L1 and raise an alarm (a fraud proof) if faulty transactions are detected. However, all current deployments lack a proper incentive mechanism for ensuring that these nodes will do their job ``diligently'', and simply rely on a cursory incentive alignment argument for security. We solve this problem by introducing an incentivized watchtower network designed to serve as the first line of defense for rollups. Our main contribution is a ``Proof of Diligence'' protocol that requires watchtowers to continuously provide a proof that they have verified L2 assertions and get rewarded for the same. Proof of Diligence protocol includes a carefully-designed incentive mechanism that is provably secure when watchtowers are rational actors, under a mild rational independence assumption. △ Less

Submitted 23 July, 2024; v1 submitted 11 February, 2024; originally announced February 2024.

arXiv:2311.13475 [pdf, other]

Machine Translation to Control Formality Features in the Target Language

Authors: Harshita Tyagi, Prashasta Jung, Hyowon Lee

Abstract: Formality plays a significant role in language communication, especially in low-resource languages such as Hindi, Japanese and Korean. These languages utilise formal and informal expressions to convey messages based on social contexts and relationships. When a language translation technique is used to translate from a source language that does not pertain the formality (e.g. English) to a target l… ▽ More Formality plays a significant role in language communication, especially in low-resource languages such as Hindi, Japanese and Korean. These languages utilise formal and informal expressions to convey messages based on social contexts and relationships. When a language translation technique is used to translate from a source language that does not pertain the formality (e.g. English) to a target language that does, there is a missing information on formality that could be a challenge in producing an accurate outcome. This research explores how this issue should be resolved when machine learning methods are used to translate from English to languages with formality, using Hindi as the example data. This was done by training a bilingual model in a formality-controlled setting and comparing its performance with a pre-trained multilingual model in a similar setting. Since there are not a lot of training data with ground truth, automated annotation techniques were employed to increase the data size. The primary modeling approach involved leveraging transformer models, which have demonstrated effectiveness in various natural language processing tasks. We evaluate the official formality accuracy(ACC) by comparing the predicted masked tokens with the ground truth. This metric provides a quantitative measure of how well the translations align with the desired outputs. Our study showcases a versatile translation strategy that considers the nuances of formality in the target language, catering to diverse language communication needs and scenarios. △ Less

Submitted 22 November, 2023; originally announced November 2023.

Comments: 9 pages, based on DCU MCM Practicum 2022/2023

arXiv:2310.20609 [pdf, other]

doi 10.3934/fods.2024034

Graph Matching via convex relaxation to the simplex

Authors: Ernesto Araya Valdivia, Hemant Tyagi

Abstract: This paper addresses the Graph Matching problem, which consists of finding the best possible alignment between two input graphs, and has many applications in computer vision, network deanonymization and protein alignment. A common approach to tackle this problem is through convex relaxations of the NP-hard \emph{Quadratic Assignment Problem} (QAP). Here, we introduce a new convex relaxation onto… ▽ More This paper addresses the Graph Matching problem, which consists of finding the best possible alignment between two input graphs, and has many applications in computer vision, network deanonymization and protein alignment. A common approach to tackle this problem is through convex relaxations of the NP-hard \emph{Quadratic Assignment Problem} (QAP). Here, we introduce a new convex relaxation onto the unit simplex and develop an efficient mirror descent scheme with closed-form iterations for solving this problem. Under the correlated Gaussian Wigner model, we show that the simplex relaxation admits a unique solution with high probability. In the noiseless case, this is shown to imply exact recovery of the ground truth permutation. Additionally, we establish a novel sufficiency condition for the input matrix in standard greedy rounding methods, which is less restrictive than the commonly used `diagonal dominance' condition. We use this condition to show exact one-step recovery of the ground truth (holding almost surely) via the mirror descent scheme, in the noiseless setting. We also use this condition to obtain significantly improved conditions for the GRAMPA algorithm [Fan et al. 2019] in the noiseless setting. △ Less

Submitted 9 August, 2024; v1 submitted 31 October, 2023; originally announced October 2023.

Comments: We fixed some typos and added Lemma 4. Reference to the published version below. Added link to the code

Journal ref: Ernesto Araya, Hemant Tyagi. Graph Matching via convex relaxation to the simplex. Foundations of Data Science.2024

arXiv:2307.16562 [pdf, other]

SAKSHI: Decentralized AI Platforms

Authors: Suma Bhat, Canhui Chen, Zerui Cheng, Zhixuan Fang, Ashwin Hebbar, Sreeram Kannan, Ranvir Rana, Peiyao Sheng, Himanshu Tyagi, Pramod Viswanath, Xuechao Wang

Abstract: Large AI models (e.g., Dall-E, GPT4) have electrified the scientific, technological and societal landscape through their superhuman capabilities. These services are offered largely in a traditional web2.0 format (e.g., OpenAI's GPT4 service). As more large AI models proliferate (personalizing and specializing to a variety of domains), there is a tremendous need to have a neutral trust-free platfor… ▽ More Large AI models (e.g., Dall-E, GPT4) have electrified the scientific, technological and societal landscape through their superhuman capabilities. These services are offered largely in a traditional web2.0 format (e.g., OpenAI's GPT4 service). As more large AI models proliferate (personalizing and specializing to a variety of domains), there is a tremendous need to have a neutral trust-free platform that allows the hosting of AI models, clients receiving AI services efficiently, yet in a trust-free, incentive compatible, Byzantine behavior resistant manner. In this paper we propose SAKSHI, a trust-free decentralized platform specifically suited for AI services. The key design principles of SAKSHI are the separation of the data path (where AI query and service is managed) and the control path (where routers and compute and storage hosts are managed) from the transaction path (where the metering and billing of services are managed over a blockchain). This separation is enabled by a "proof of inference" layer which provides cryptographic resistance against a variety of misbehaviors, including poor AI service, nonpayment for service, copying of AI models. This is joint work between multiple universities (Princeton University, University of Illinois at Urbana-Champaign, Tsinghua University, HKUST) and two startup companies (Witness Chain and Eigen Layer). △ Less

Submitted 31 July, 2023; originally announced July 2023.

Comments: 23 pages, 9 figures

arXiv:2307.02641 [pdf, other]

Active Class Selection for Few-Shot Class-Incremental Learning

Authors: Christopher McClurg, Ali Ayub, Harsh Tyagi, Sarah M. Rajtmajer, Alan R. Wagner

Abstract: For real-world applications, robots will need to continually learn in their environments through limited interactions with their users. Toward this, previous works in few-shot class incremental learning (FSCIL) and active class selection (ACS) have achieved promising results but were tested in constrained setups. Therefore, in this paper, we combine ideas from FSCIL and ACS to develop a novel fram… ▽ More For real-world applications, robots will need to continually learn in their environments through limited interactions with their users. Toward this, previous works in few-shot class incremental learning (FSCIL) and active class selection (ACS) have achieved promising results but were tested in constrained setups. Therefore, in this paper, we combine ideas from FSCIL and ACS to develop a novel framework that can allow an autonomous agent to continually learn new objects by asking its users to label only a few of the most informative objects in the environment. To this end, we build on a state-of-the-art (SOTA) FSCIL model and extend it with techniques from ACS literature. We term this model Few-shot Incremental Active class SeleCtiOn (FIASco). We further integrate a potential field-based navigation technique with our model to develop a complete framework that can allow an agent to process and reason on its sensory data through the FIASco model, navigate towards the most informative object in the environment, gather data about the object through its sensors and incrementally update the FIASco model. Experimental results on a simulated agent and a real robot show the significance of our approach for long-term real-world robotics applications. △ Less

Submitted 5 July, 2023; originally announced July 2023.

Comments: Accepted at the Conference on Lifelong Learning Agents (CoLLAs), 2023

arXiv:2212.09980 [pdf, ps, other]

Continual Mean Estimation Under User-Level Privacy

Authors: Anand Jerry George, Lekshmi Ramesh, Aditya Vikram Singh, Himanshu Tyagi

Abstract: We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both these requirements come… ▽ More We consider the problem of continually releasing an estimate of the population mean of a stream of samples that is user-level differentially private (DP). At each time instant, a user contributes a sample, and the users can arrive in arbitrary order. Until now these requirements of continual release and user-level privacy were considered in isolation. But, in practice, both these requirements come together as the users often contribute data repeatedly and multiple queries are made. We provide an algorithm that outputs a mean estimate at every time instant $t$ such that the overall release is user-level $\varepsilon$-DP and has the following error guarantee: Denoting by $M_t$ the maximum number of samples contributed by a user, as long as $\tildeΩ(1/\varepsilon)$ users have $M_t/2$ samples each, the error at time $t$ is $\tilde{O}(1/\sqrt{t}+\sqrt{M}_t/t\varepsilon)$. This is a universal error guarantee which is valid for all arrival patterns of the users. Furthermore, it (almost) matches the existing lower bounds for the single-release setting at all time instants when users have contributed equal number of samples. △ Less

Submitted 19 December, 2022; originally announced December 2022.

arXiv:2210.11546 [pdf, other]

Proof of Backhaul: Trustfree Measurement of Broadband Bandwidth

Authors: Peiyao Sheng, Nikita Yadav, Vishal Sevani, Arun Babu, SVR Anand, Himanshu Tyagi, Pramod Viswanath

Abstract: Recent years have seen the emergence of decentralized wireless networks consisting of nodes hosted by many individuals and small enterprises, reawakening the decades-old dream of open networking. These networks have been deployed in an organic, distributed manner and are driven by new economic models resting on tokenized incentives. A critical requirement for the incentives to scale is the ability… ▽ More Recent years have seen the emergence of decentralized wireless networks consisting of nodes hosted by many individuals and small enterprises, reawakening the decades-old dream of open networking. These networks have been deployed in an organic, distributed manner and are driven by new economic models resting on tokenized incentives. A critical requirement for the incentives to scale is the ability to prove network performance in a decentralized trustfree manner, i.e., a Byzantine fault tolerant network telemetry system. In this paper, we present a Proof of Backhaul (PoB) protocol which measures the bandwidth of the (broadband) backhaul link of a wireless access point, termed prover, in a decentralized and trustfree manner. In particular, our proposed protocol is the first one to satisfy the following two properties: (1) Trustfree. Bandwidth measurement is secure against Byzantine attacks by collaborations of challenge servers and the prover. (2) Open. The barrier-to-entry for being a challenge server is low; there is no requirement of having a low latency and high throughput path to the measured link. At a high-level, our protocol aggregates the challenge traffic from multiple challenge servers and uses cryptographic primitives to ensure that a subset of challengers or, even challengers and provers, cannot maliciously modify results in their favor. A formal security model allows us to establish guarantees of accurate bandwidth measurement as a function of the fraction of malicious actors. Our evaluation shows that our PoB protocol can verify backhaul bandwidth of up to 1000 Mbps with less than 8% error using measurements lasting only 100 ms. The measurement accuracy is not affected in the presence of corrupted challengers. Importantly, the basic verification protocol lends itself to a minor modification that can measure available bandwidth even in the presence of cross-traffic. △ Less

Submitted 20 October, 2022; originally announced October 2022.

arXiv:2203.06870 [pdf, ps, other]

The Role of Interactivity in Structured Estimation

Authors: Jayadev Acharya, Clément L. Canonne, Ziteng Sun, Himanshu Tyagi

Abstract: We study high-dimensional sparse estimation under three natural constraints: communication constraints, local privacy constraints, and linear measurements (compressive sensing). Without sparsity assumptions, it has been established that interactivity cannot improve the minimax rates of estimation under these information constraints. The question of whether interactivity helps with natural inferenc… ▽ More We study high-dimensional sparse estimation under three natural constraints: communication constraints, local privacy constraints, and linear measurements (compressive sensing). Without sparsity assumptions, it has been established that interactivity cannot improve the minimax rates of estimation under these information constraints. The question of whether interactivity helps with natural inference tasks has been a topic of active research. We settle this question in the affirmative for the prototypical problems of high-dimensional sparse mean estimation and compressive sensing, by demonstrating a gap between interactive and noninteractive protocols. We further establish that the gap increases when we have more structured sparsity: for block sparsity this gap can be as large as polynomial in the dimensionality. Thus, the more structured the sparsity is, the greater is the advantage of interaction. Proving the lower bounds requires a careful breaking of a sum of correlated random variables into independent components using Baranyai's theorem on decomposition of hypergraphs, which might be of independent interest. △ Less

Submitted 14 March, 2022; originally announced March 2022.

arXiv:2112.10467 [pdf, other]

An iterative clustering algorithm for the Contextual Stochastic Block Model with optimality guarantees

Authors: Guillaume Braun, Hemant Tyagi, Christophe Biernacki

Abstract: Real-world networks often come with side information that can help to improve the performance of network analysis tasks such as clustering. Despite a large number of empirical and theoretical studies conducted on network clustering methods during the past decade, the added value of side information and the methods used to incorporate it optimally in clustering algorithms are relatively less unders… ▽ More Real-world networks often come with side information that can help to improve the performance of network analysis tasks such as clustering. Despite a large number of empirical and theoretical studies conducted on network clustering methods during the past decade, the added value of side information and the methods used to incorporate it optimally in clustering algorithms are relatively less understood. We propose a new iterative algorithm to cluster networks with side information for nodes (in the form of covariates) and show that our algorithm is optimal under the Contextual Symmetric Stochastic Block Model. Our algorithm can be applied to general Contextual Stochastic Block Models and avoids hyperparameter tuning in contrast to previously proposed methods. We confirm our theoretical results on synthetic data experiments where our algorithm significantly outperforms other methods, and show that it can also be applied to signed graphs. Finally we demonstrate the practical interest of our method on real data. △ Less

Submitted 28 July, 2022; v1 submitted 20 December, 2021; originally announced December 2021.

Journal ref: Proceedings of the 39th International Conference on Machine Learning, PMLR 162:2257-2291, 2022

arXiv:2109.05222 [pdf, ps, other]

Fundamental limits of over-the-air optimization: Are analog schemes optimal?

Authors: Shubham K Jha, Prathamesh Mayekar, Himanshu Tyagi

Abstract: We consider over-the-air convex optimization on a $d-$dimensional space where coded gradients are sent over an additive Gaussian noise channel with variance $σ^2$. The codewords satisfy an average power constraint $P$, resulting in the signal-to-noise ratio (SNR) of $P/σ^2$. We derive bounds for the convergence rates for over-the-air optimization. Our first result is a lower bound for the converge… ▽ More We consider over-the-air convex optimization on a $d-$dimensional space where coded gradients are sent over an additive Gaussian noise channel with variance $σ^2$. The codewords satisfy an average power constraint $P$, resulting in the signal-to-noise ratio (SNR) of $P/σ^2$. We derive bounds for the convergence rates for over-the-air optimization. Our first result is a lower bound for the convergence rate showing that any code must slowdown the convergence rate by a factor of roughly $\sqrt{d/\log(1+\mathtt{SNR})}$. Next, we consider a popular class of schemes called $analog$ $coding$, where a linear function of the gradient is sent. We show that a simple scaled transmission analog coding scheme results in a slowdown in convergence rate by a factor of $\sqrt{d(1+1/\mathtt{SNR})}$. This matches the previous lower bound up to constant factors for low SNR, making the scaled transmission scheme optimal at low SNR. However, we show that this slowdown is necessary for any analog coding scheme. In particular, a slowdown in convergence by a factor of $\sqrt{d}$ for analog coding remains even when SNR tends to infinity. Remarkably, we present a simple quantize-and-modulate scheme that uses $Amplitude$ $Shift$ $Keying$ and almost attains the optimal convergence rate at all SNRs. △ Less

Submitted 15 September, 2021; v1 submitted 11 September, 2021; originally announced September 2021.

Comments: Few typos fixed and one reference added. An abridged version of this paper will appear in the proceedings of IEEE Global Communications Conference (GLOBECOM), Spain, 2021

arXiv:2107.10078 [pdf, other]

Optimal Rates for Nonparametric Density Estimation under Communication Constraints

Authors: Jayadev Acharya, Clément L. Canonne, Aditya Vikram Singh, Himanshu Tyagi

Abstract: We consider density estimation for Besov spaces when each sample is quantized to only a limited number of bits. We provide a noninteractive adaptive estimator that exploits the sparsity of wavelet bases, along with a simulate-and-infer technique from parametric estimation under communication constraints. We show that our estimator is nearly rate-optimal by deriving minimax lower bounds that hold e… ▽ More We consider density estimation for Besov spaces when each sample is quantized to only a limited number of bits. We provide a noninteractive adaptive estimator that exploits the sparsity of wavelet bases, along with a simulate-and-infer technique from parametric estimation under communication constraints. We show that our estimator is nearly rate-optimal by deriving minimax lower bounds that hold even when interactive protocols are allowed. Interestingly, while our wavelet-based estimator is almost rate-optimal for Sobolev spaces as well, it is unclear whether the standard Fourier basis, which arise naturally for those spaces, can be used to achieve the same performance. △ Less

Submitted 21 July, 2021; originally announced July 2021.

arXiv:2105.09855 [pdf, ps, other]

doi 10.1109/TSP.2022.3169957

Multiple Support Recovery Using Very Few Measurements Per Sample

Authors: Lekshmi Ramesh, Chandra R. Murthy, Himanshu Tyagi

Abstract: In the problem of multiple support recovery, we are given access to linear measurements of multiple sparse samples in $\mathbb{R}^{d}$. These samples can be partitioned into $\ell$ groups, with samples having the same support belonging to the same group. For a given budget of $m$ measurements per sample, the goal is to recover the $\ell$ underlying supports, in the absence of the knowledge of grou… ▽ More In the problem of multiple support recovery, we are given access to linear measurements of multiple sparse samples in $\mathbb{R}^{d}$. These samples can be partitioned into $\ell$ groups, with samples having the same support belonging to the same group. For a given budget of $m$ measurements per sample, the goal is to recover the $\ell$ underlying supports, in the absence of the knowledge of group labels. We study this problem with a focus on the measurement-constrained regime where $m$ is smaller than the support size $k$ of each sample. We design a two-step procedure that estimates the union of the underlying supports first, and then uses a spectral algorithm to estimate the individual supports. Our proposed estimator can recover the supports with $m<k$ measurements per sample, from $\tilde{O}(k^{4}\ell^{4}/m^{4})$ samples. Our guarantees hold for a general, generative model assumption on the samples and measurement matrices. We also provide results from experiments conducted on synthetic data and on the MNIST dataset. △ Less

Submitted 20 May, 2021; originally announced May 2021.

arXiv:2104.00979 [pdf, ps, other]

Information-constrained optimization: can adaptive processing of gradients help?

Authors: Jayadev Acharya, Clément L. Canonne, Prathamesh Mayekar, Himanshu Tyagi

Abstract: We revisit first-order optimization under local information constraints such as local privacy, gradient quantization, and computational constraints limiting access to a few coordinates of the gradient. In this setting, the optimization algorithm is not allowed to directly access the complete output of the gradient oracle, but only gets limited information about it subject to the local information… ▽ More We revisit first-order optimization under local information constraints such as local privacy, gradient quantization, and computational constraints limiting access to a few coordinates of the gradient. In this setting, the optimization algorithm is not allowed to directly access the complete output of the gradient oracle, but only gets limited information about it subject to the local information constraints. We study the role of adaptivity in processing the gradient output to obtain this limited information from it.We consider optimization for both convex and strongly convex functions and obtain tight or nearly tight lower bounds for the convergence rate, when adaptive gradient processing is allowed. Prior work was restricted to convex functions and allowed only nonadaptive processing of gradients. For both of these function classes and for the three information constraints mentioned above, our lower bound implies that adaptive processing of gradients cannot outperform nonadaptive processing in most regimes of interest. We complement these results by exhibiting a natural optimization problem under information constraints for which adaptive processing of gradient strictly outperforms nonadaptive processing. △ Less

Submitted 2 April, 2021; originally announced April 2021.

arXiv:2103.03235 [pdf, other]

Clustering multilayer graphs with missing nodes

Authors: Guillaume Braun, Hemant Tyagi, Christophe Biernacki

Abstract: Relationship between agents can be conveniently represented by graphs. When these relationships have different modalities, they are better modelled by multilayer graphs where each layer is associated with one modality. Such graphs arise naturally in many contexts including biological and social networks. Clustering is a fundamental problem in network analysis where the goal is to regroup nodes wit… ▽ More Relationship between agents can be conveniently represented by graphs. When these relationships have different modalities, they are better modelled by multilayer graphs where each layer is associated with one modality. Such graphs arise naturally in many contexts including biological and social networks. Clustering is a fundamental problem in network analysis where the goal is to regroup nodes with similar connectivity profiles. In the past decade, various clustering methods have been extended from the unilayer setting to multilayer graphs in order to incorporate the information provided by each layer. While most existing works assume - rather restrictively - that all layers share the same set of nodes, we propose a new framework that allows for layers to be defined on different sets of nodes. In particular, the nodes not recorded in a layer are treated as missing. Within this paradigm, we investigate several generalizations of well-known clustering methods in the complete setting to the incomplete one and prove some consistency results under the Multi-Layer Stochastic Block Model assumption. Our theoretical results are complemented by thorough numerical comparisons between our proposed algorithms on synthetic data, and also on real datasets, thus highlighting the promising behaviour of our methods in various settings. △ Less

Submitted 4 March, 2021; originally announced March 2021.

Comments: 27 pages, 7 figures, accepted to AISTATS 2021

arXiv:2102.00235 [pdf, other]

Phase Transitions for Support Recovery from Gaussian Linear Measurements

Authors: Lekshmi Ramesh, Chandra R. Murthy, Himanshu Tyagi

Abstract: We study the problem of recovering the common $k$-sized support of a set of $n$ samples of dimension $d$, using $m$ noisy linear measurements per sample. Most prior work has focused on the case when $m$ exceeds $k$, in which case $n$ of the order $(k/m)\log(d/k)$ is both necessary and sufficient. Thus, in this regime, only the total number of measurements across the samples matter, and there is no… ▽ More We study the problem of recovering the common $k$-sized support of a set of $n$ samples of dimension $d$, using $m$ noisy linear measurements per sample. Most prior work has focused on the case when $m$ exceeds $k$, in which case $n$ of the order $(k/m)\log(d/k)$ is both necessary and sufficient. Thus, in this regime, only the total number of measurements across the samples matter, and there is not much benefit in getting more than $k$ measurements per sample. In the measurement-constrained regime where we have access to fewer than $k$ measurements per sample, we show an upper bound of $O((k^{2}/m^{2})\log d)$ on the sample complexity for successful support recovery when $m\ge 2\log d$. Along with the lower bound from our previous work, this shows a phase transition for the sample complexity of this problem around $k/m=1$. In fact, our proposed algorithm is sample-optimal in both the regimes. It follows that, in the $m\ll k$ regime, multiple measurements from the same sample are more valuable than measurements from different samples. △ Less

Submitted 12 May, 2021; v1 submitted 30 January, 2021; originally announced February 2021.

arXiv:2101.07981 [pdf, ps, other]

Inference under Information Constraints III: Local Privacy Constraints

Authors: Jayadev Acharya, Clément L. Canonne, Cody Freitag, Ziteng Sun, Himanshu Tyagi

Abstract: We study goodness-of-fit and independence testing of discrete distributions in a setting where samples are distributed across multiple users. The users wish to preserve the privacy of their data while enabling a central server to perform the tests. Under the notion of local differential privacy, we propose simple, sample-optimal, and communication-efficient protocols for these two questions in the… ▽ More We study goodness-of-fit and independence testing of discrete distributions in a setting where samples are distributed across multiple users. The users wish to preserve the privacy of their data while enabling a central server to perform the tests. Under the notion of local differential privacy, we propose simple, sample-optimal, and communication-efficient protocols for these two questions in the noninteractive setting, where in addition users may or may not share a common random seed. In particular, we show that the availability of shared (public) randomness greatly reduces the sample complexity. Underlying our public-coin protocols are privacy-preserving mappings which, when applied to the samples, minimally contract the distance between their respective probability distributions. △ Less

Submitted 20 January, 2021; originally announced January 2021.

Comments: To appear in the Special Issue on Privacy and Security of Information Systems of the IEEE Journal on Selected Areas in Information Theory (JSAIT), 2021. Journal version of the AISTATS'19 paper "Test without Trust: Optimal Locally Private Distribution Testing" (arXiv:1808.02174), which it extends and supersedes

arXiv:2012.14932 [pdf, other]

An extension of the angular synchronization problem to the heterogeneous setting

Authors: Mihai Cucuringu, Hemant Tyagi

Abstract: Given an undirected measurement graph $G = ([n], E)$, the classical angular synchronization problem consists of recovering unknown angles $θ_1,\dots,θ_n$ from a collection of noisy pairwise measurements of the form $(θ_i - θ_j) \mod 2π$, for each $\{i,j\} \in E$. This problem arises in a variety of applications, including computer vision, time synchronization of distributed networks, and ranking f… ▽ More Given an undirected measurement graph $G = ([n], E)$, the classical angular synchronization problem consists of recovering unknown angles $θ_1,\dots,θ_n$ from a collection of noisy pairwise measurements of the form $(θ_i - θ_j) \mod 2π$, for each $\{i,j\} \in E$. This problem arises in a variety of applications, including computer vision, time synchronization of distributed networks, and ranking from preference relationships. In this paper, we consider a generalization to the setting where there exist $k$ unknown groups of angles $θ_{l,1}, \dots,θ_{l,n}$, for $l=1,\dots,k$. For each $ \{i,j\} \in E$, we are given noisy pairwise measurements of the form $θ_{\ell,i} - θ_{\ell,j}$ for an unknown $\ell \in \{1,2,\ldots,k\}$. This can be thought of as a natural extension of the angular synchronization problem to the heterogeneous setting of multiple groups of angles, where the measurement graph has an unknown edge-disjoint decomposition $G = G_1 \cup G_2 \ldots \cup G_k$, where the $G_i$'s denote the subgraphs of edges corresponding to each group. We propose a probabilistic generative model for this problem, along with a spectral algorithm for which we provide a detailed theoretical analysis in terms of robustness against both sampling sparsity and noise. The theoretical findings are complemented by a comprehensive set of numerical experiments, showcasing the efficacy of our algorithm under various parameter regimes. Finally, we consider an application of bi-synchronization to the graph realization problem, and provide along the way an iterative graph disentangling procedure that uncovers the subgraphs $G_i$, $i=1,\ldots,k$ which is of independent interest, as it is shown to improve the final recovery accuracy across all the experiments considered. △ Less

Submitted 4 January, 2021; v1 submitted 29 December, 2020; originally announced December 2020.

Comments: 45 pages, 17 figures

arXiv:2011.12160 [pdf, ps, other]

Wyner-Ziv Estimators for Distributed Mean Estimation with Side Information and Optimization

Authors: Prathamesh Mayekar, Shubham Jha, Ananda Theertha Suresh, Himanshu Tyagi

Abstract: Communication efficient distributed mean estimation is an important primitive that arises in many distributed learning and optimization scenarios such as federated learning. Without any probabilistic assumptions on the underlying data, we study the problem of distributed mean estimation where the server has access to side information. We propose \emph{Wyner-Ziv estimators}, which are communication… ▽ More Communication efficient distributed mean estimation is an important primitive that arises in many distributed learning and optimization scenarios such as federated learning. Without any probabilistic assumptions on the underlying data, we study the problem of distributed mean estimation where the server has access to side information. We propose \emph{Wyner-Ziv estimators}, which are communication and computationally efficient and near-optimal when an upper bound for the distance between the side information and the data is known. As a corollary, we also show that our algorithms provide efficient schemes for the classic Wyner-Ziv problem in information theory. In a different direction, when there is no knowledge assumed about the distance between side information and the data, we present an alternative Wyner-Ziv estimator that uses correlated sampling. This latter setting offers {\em universal recovery guarantees}, and perhaps will be of interest in practice when the number of users is large and keeping track of the distances between the data and the side information may not be possible. With this mean estimator at our disposal, we revisit basic problems in decentralized optimization and compression where our Wyner-Ziv estimator yields algorithms with almost optimal performance. First, we consider the problem of communication constrained distributed optimization and provide an algorithm which attains the optimal convergence rate by exploiting the fact that the gradient estimates are close to each other. Specifically, the gradient compression scheme in our algorithm first uses half of the parties to form side information and then uses our Wyner-Ziv estimator to compress the remaining half of the gradient estimates. △ Less

Submitted 14 November, 2022; v1 submitted 24 November, 2020; originally announced November 2020.

arXiv:2011.01737 [pdf, other]

Regularized spectral methods for clustering signed networks

Authors: Mihai Cucuringu, Apoorv Vikram Singh, Déborah Sulem, Hemant Tyagi

Abstract: We study the problem of $k$-way clustering in signed graphs. Considerable attention in recent years has been devoted to analyzing and modeling signed graphs, where the affinity measure between nodes takes either positive or negative values. Recently, Cucuringu et al. [CDGT 2019] proposed a spectral method, namely SPONGE (Signed Positive over Negative Generalized Eigenproblem), which casts the clus… ▽ More We study the problem of $k$-way clustering in signed graphs. Considerable attention in recent years has been devoted to analyzing and modeling signed graphs, where the affinity measure between nodes takes either positive or negative values. Recently, Cucuringu et al. [CDGT 2019] proposed a spectral method, namely SPONGE (Signed Positive over Negative Generalized Eigenproblem), which casts the clustering task as a generalized eigenvalue problem optimizing a suitably defined objective function. This approach is motivated by social balance theory, where the clustering task aims to decompose a given network into disjoint groups, such that individuals within the same group are connected by as many positive edges as possible, while individuals from different groups are mainly connected by negative edges. Through extensive numerical simulations, SPONGE was shown to achieve state-of-the-art empirical performance. On the theoretical front, [CDGT 2019] analyzed SPONGE and the popular Signed Laplacian method under the setting of a Signed Stochastic Block Model (SSBM), for $k=2$ equal-sized clusters, in the regime where the graph is moderately dense. In this work, we build on the results in [CDGT 2019] on two fronts for the normalized versions of SPONGE and the Signed Laplacian. Firstly, for both algorithms, we extend the theoretical analysis in [CDGT 2019] to the general setting of $k \geq 2$ unequal-sized clusters in the moderately dense regime. Secondly, we introduce regularized versions of both methods to handle sparse graphs -- a regime where standard spectral methods underperform -- and provide theoretical guarantees under the same SSBM model. To the best of our knowledge, regularized spectral methods have so far not been considered in the setting of clustering signed graphs. We complement our theoretical results with an extensive set of numerical experiments on synthetic data. △ Less

Submitted 3 November, 2020; originally announced November 2020.

Comments: 55 pages, 5 figures

arXiv:2010.06562 [pdf, ps, other]

Unified lower bounds for interactive high-dimensional estimation under information constraints

Authors: Jayadev Acharya, Clément L. Canonne, Ziteng Sun, Himanshu Tyagi

Abstract: We consider distributed parameter estimation using interactive protocols subject to local information constraints such as bandwidth limitations, local differential privacy, and restricted measurements. We provide a unified framework enabling us to derive a variety of (tight) minimax lower bounds for different parametric families of distributions, both continuous and discrete, under any $\ell_p$ lo… ▽ More We consider distributed parameter estimation using interactive protocols subject to local information constraints such as bandwidth limitations, local differential privacy, and restricted measurements. We provide a unified framework enabling us to derive a variety of (tight) minimax lower bounds for different parametric families of distributions, both continuous and discrete, under any $\ell_p$ loss. Our lower bound framework is versatile and yields "plug-and-play" bounds that are widely applicable to a large range of estimation problems, and, for the prototypical case of the Gaussian family, circumvents limitations of previous techniques. In particular, our approach recovers bounds obtained using data processing inequalities and Cramér--Rao bounds, two other alternative approaches for proving lower bounds in our setting of interest. Further, for the families considered, we complement our lower bounds with matching upper bounds. △ Less

Submitted 15 November, 2022; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: Streamline some statements; add the low-privacy corollary (high value of the privacy parameter) for Corollary 1, along with the implications for the applications considered; add the upper bound for the low-privacy regimes for Bernoulli (Theorem 3) and Gaussian (Theorem 4); slightly improve Lemma 5 by relaxing the independence assumption

arXiv:2010.04955 [pdf, other]

A Distributed Hierarchy Framework for Enhancing Cyber Security of Control Center Applications

Authors: Chetan Kumar Kuraganti, Bryan Paul Robert, Gurunath Gurrala, Ashish Joglekar, Arun Babu Puthuparambil, Rajesh Sundaresan, Himanshu Tyagi

Abstract: Recent cyber-attacks on power grids highlight the necessity to protect the critical functionalities of a control center vital for the safe operation of a grid. Even in a distributed framework one central control center acts as a coordinator in majority of the control center architectures. Such a control center can become a prime target for cyber as well as physical attacks, and, hence, a single po… ▽ More Recent cyber-attacks on power grids highlight the necessity to protect the critical functionalities of a control center vital for the safe operation of a grid. Even in a distributed framework one central control center acts as a coordinator in majority of the control center architectures. Such a control center can become a prime target for cyber as well as physical attacks, and, hence, a single point failure can lead to complete loss of visibility of the power grid. If the control center which runs the critical functions in a distributed computing environment can be randomly chosen between the available control centers in a secure framework, the ability of the attacker in causing a single point failure can be reduced to a great extent. To achieve this, a novel distributed hierarchy based framework to secure critical functions is proposed in this paper. The proposed framework ensures that the data aggregation and the critical functions are carried out at a random location, and incorporates security features such as attestation and trust management to detect compromised agents. A theoretical result is proved on the evolution and convergence of the trust values in the proposed trust management protocol. It is also shown that the system is nominally robust so long as the number of compromised nodes is strictly less than one-half of the nodes minus 1. For demonstration, a Kalman filter-based state estimation using phasor measurements is used as the critical function to be secured. The proposed framework's implementation feasibility is tested on a physical hardware cluster of Parallella boards. The framework is also validated using simulations on the IEEE 118 bus system. △ Less

Submitted 10 October, 2020; originally announced October 2020.

arXiv:2007.10976 [pdf, ps, other]

Interactive Inference under Information Constraints

Authors: Jayadev Acharya, Clément L. Canonne, Yuhan Liu, Ziteng Sun, Himanshu Tyagi

Abstract: We study the role of interactivity in distributed statistical inference under information constraints, e.g., communication constraints and local differential privacy. We focus on the tasks of goodness-of-fit testing and estimation of discrete distributions. From prior work, these tasks are well understood under noninteractive protocols. Extending these approaches directly for interactive protocols… ▽ More We study the role of interactivity in distributed statistical inference under information constraints, e.g., communication constraints and local differential privacy. We focus on the tasks of goodness-of-fit testing and estimation of discrete distributions. From prior work, these tasks are well understood under noninteractive protocols. Extending these approaches directly for interactive protocols is difficult due to correlations that can build due to interactivity; in fact, gaps can be found in prior claims of tight bounds of distribution estimation using interactive protocols. We propose a new approach to handle this correlation and establish a unified method to establish lower bounds for both tasks. As an application, we obtain optimal bounds for both estimation and testing under local differential privacy and communication constraints. We also provide an example of a natural testing problem where interactivity helps. △ Less

Submitted 23 October, 2021; v1 submitted 21 July, 2020; originally announced July 2020.

Comments: Modifying the proof and statement of Proposition 24 to address an issue in the variance analysis

arXiv:2005.10571 [pdf, other]

Communication Complexity of Distributed High Dimensional Correlation Testing

Authors: K. R. Sahasranand, Himanshu Tyagi

Abstract: Two parties observe independent copies of a $d$-dimensional vector and a scalar. They seek to test if their data is correlated or not, namely they seek to test if the norm $\|ρ\|_2$ of the correlation vector $ρ$ between their observations exceeds $τ$ or is it $0$. To that end, they communicate interactively and declare the output of the test. We show that roughly order $d/τ^2$ bits of communicatio… ▽ More Two parties observe independent copies of a $d$-dimensional vector and a scalar. They seek to test if their data is correlated or not, namely they seek to test if the norm $\|ρ\|_2$ of the correlation vector $ρ$ between their observations exceeds $τ$ or is it $0$. To that end, they communicate interactively and declare the output of the test. We show that roughly order $d/τ^2$ bits of communication are sufficient and necessary for resolving the distributed correlation testing problem above. Furthermore, we establish a lower bound of roughly $d^2/τ^2$ bits for communication needed for distributed correlation estimation, rendering the estimate-and-test approach suboptimal in communication required for distributed correlation testing. For the one-dimensional case with one-way communication, our bounds are tight even in the constant and provide a precise dependence of communication complexity on the probabilities of error of two types. △ Less

Submitted 21 May, 2020; originally announced May 2020.

Comments: 30 pages, 1 figure

arXiv:2004.12782 [pdf, other]

How Reliable are Test Numbers for Revealing the COVID-19 Ground Truth and Applying Interventions?

Authors: Aditya Gopalan, Himanshu Tyagi

Abstract: The number of confirmed cases of COVID-19 is often used as a proxy for the actual number of ground truth COVID-19 infected cases in both public discourse and policy making. However, the number of confirmed cases depends on the testing policy, and it is important to understand how the number of positive cases obtained using different testing policies reveals the unknown ground truth. We develop an… ▽ More The number of confirmed cases of COVID-19 is often used as a proxy for the actual number of ground truth COVID-19 infected cases in both public discourse and policy making. However, the number of confirmed cases depends on the testing policy, and it is important to understand how the number of positive cases obtained using different testing policies reveals the unknown ground truth. We develop an agent-based simulation framework in Python that can simulate various testing policies as well as interventions such as lockdown based on them. The interaction between the agents can take into account various communities and mobility patterns. A distinguishing feature of our framework is the presence of another `flu'-like illness with symptoms similar to COVID-19, that allows us to model the noise in selecting the pool of patients to be tested. We instantiate our model for the city of Bengaluru in India, using census data to distribute agents geographically, and traffic flow mobility data to model long-distance interactions and mixing. We use the simulation framework to compare the performance of three testing policies: Random Symptomatic Testing (RST), Contact Tracing (CT), and a new Location Based Testing policy (LBT). We observe that if a sufficient fraction of symptomatic patients come out for testing, then RST can capture the ground truth quite closely even with very few daily tests. However, CT consistently captures more positive cases. Interestingly, our new LBT, which is operationally less intensive than CT, gives performance that is comparable with CT. In another direction, we compare the efficacy of these three testing policies in enabling lockdown, and observe that CT flattens the ground truth curve maximally, followed closely by LBT, and significantly better than RST. △ Less

Submitted 24 April, 2020; originally announced April 2020.

arXiv:2003.09808 [pdf, ps, other]

doi 10.3390/e23030347

Tracking an Auto-Regressive Process with Limited Communication per Unit Time

Authors: Rooji Jinan, Parimal Parag, Himanshu Tyagi

Abstract: Samples from a high-dimensional AR[1] process are observed by a sender which can communicate only finitely many bits per unit time to a receiver. The receiver seeks to form an estimate of the process value at every time instant in real-time. We consider a time-slotted communication model in a slow-sampling regime where multiple communication slots occur between two sampling instants. We propose a… ▽ More Samples from a high-dimensional AR[1] process are observed by a sender which can communicate only finitely many bits per unit time to a receiver. The receiver seeks to form an estimate of the process value at every time instant in real-time. We consider a time-slotted communication model in a slow-sampling regime where multiple communication slots occur between two sampling instants. We propose a successive update scheme which uses communication between sampling instants to refine estimates of the latest sample and study the following question: Is it better to collect communication of multiple slots to send better refined estimates, making the receiver wait more for every refinement, or to be fast but loose and send new information in every communication opportunity? We show that the fast but loose successive update scheme with ideal spherical codes is universally optimal asymptotically for a large dimension. However, most practical quantization codes for fixed dimensions do not meet the ideal performance required for this optimality, and they typically will have a bias in the form of a fixed additive error. Interestingly, our analysis shows that the fast but loose scheme is not an optimal choice in the presence of such errors, and a judiciously chosen frequency of updates outperforms it. △ Less

Submitted 22 March, 2020; originally announced March 2020.

Report number: Entropy 2021, 23(3), 347

Journal ref: Entropy 2021, 23(3), 347

arXiv:2001.09032 [pdf, ps, other]

Limits on Gradient Compression for Stochastic Optimization

Authors: Prathamesh Mayekar, Himanshu Tyagi

Abstract: We consider stochastic optimization over $\ell_p$ spaces using access to a first-order oracle. We ask: {What is the minimum precision required for oracle outputs to retain the unrestricted convergence rates?} We characterize this precision for every $p\geq 1$ by deriving information theoretic lower bounds and by providing quantizers that (almost) achieve these lower bounds. Our quantizers are new… ▽ More We consider stochastic optimization over $\ell_p$ spaces using access to a first-order oracle. We ask: {What is the minimum precision required for oracle outputs to retain the unrestricted convergence rates?} We characterize this precision for every $p\geq 1$ by deriving information theoretic lower bounds and by providing quantizers that (almost) achieve these lower bounds. Our quantizers are new and easy to implement. In particular, our results are exact for $p=2$ and $p=\infty$, showing the minimum precision needed in these settings are $Θ(d)$ and $Θ(\log d)$, respectively. The latter result is surprising since recovering the gradient vector will require $Ω(d)$ bits. △ Less

Submitted 24 January, 2020; originally announced January 2020.

arXiv:1912.11247 [pdf, ps, other]

Sample-Measurement Tradeoff in Support Recovery under a Subgaussian Prior

Authors: Lekshmi Ramesh, Chandra R Murthy, Himanshu Tyagi

Abstract: Data samples from $\mathbb{R}^{d}$ with a common support of size $k$ are accessed through $m$ random linear projections (measurements) per sample. It is well-known that roughly $k$ measurements from a single sample are sufficient to recover the support. In the multiple sample setting, do $k$ overall measurements still suffice when only $m$ measurements per sample are allowed, with $m<k$? We answer… ▽ More Data samples from $\mathbb{R}^{d}$ with a common support of size $k$ are accessed through $m$ random linear projections (measurements) per sample. It is well-known that roughly $k$ measurements from a single sample are sufficient to recover the support. In the multiple sample setting, do $k$ overall measurements still suffice when only $m$ measurements per sample are allowed, with $m<k$? We answer this question in the negative by considering a generative model setting with independent samples drawn from a subgaussian prior. We show that $n=Θ((k^2/m^2)\cdot\log k(d-k))$ samples are necessary and sufficient to recover the support exactly. In turn, this shows that when $m<k$, $k$ overall measurements are insufficient for support recovery; instead we need about $m$ measurements each from $k^{2}/m^2$ samples, i.e., $k^{2}/m$ overall measurements are necessary. △ Less

Submitted 19 September, 2020; v1 submitted 24 December, 2019; originally announced December 2019.

Comments: A preliminary version of this paper appeared at IEEE International Symposium on Information Theory 2019

arXiv:1908.08200 [pdf, ps, other]

RATQ: A Universal Fixed-Length Quantizer for Stochastic Optimization

Authors: Prathamesh Mayekar, Himanshu Tyagi

Abstract: We present Rotated Adaptive Tetra-iterated Quantizer (RATQ), a fixed-length quantizer for gradients in first order stochastic optimization. RATQ is easy to implement and involves only a Hadamard transform computation and adaptive uniform quantization with appropriately chosen dynamic ranges. For noisy gradients with almost surely bounded Euclidean norms, we establish an information theoretic lower… ▽ More We present Rotated Adaptive Tetra-iterated Quantizer (RATQ), a fixed-length quantizer for gradients in first order stochastic optimization. RATQ is easy to implement and involves only a Hadamard transform computation and adaptive uniform quantization with appropriately chosen dynamic ranges. For noisy gradients with almost surely bounded Euclidean norms, we establish an information theoretic lower bound for optimization accuracy using finite precision gradients and show that RATQ almost attains this lower bound. For mean square bounded noisy gradients, we use a gain-shape quantizer which separately quantizes the Euclidean norm and uses RATQ to quantize the normalized unit norm vector. We establish lower bounds for performance of any optimization procedure and shape quantizer, when used with a uniform gain quantizer. Finally, we propose an adaptive quantizer for gain which when used with RATQ for shape quantizer outperforms uniform gain quantization and is, in fact, close to optimal. As a by-product, we show that our fixed-length quantizer RATQ has almost the same performance as the optimal variable-length quantizers for distributed mean estimation. Also, we obtain an efficient quantizer for Gaussian vectors which attains a rate very close to the Gaussian rate-distortion function and is, in fact, universal for subgaussian input vectors. △ Less

Submitted 16 December, 2019; v1 submitted 22 August, 2019; originally announced August 2019.

arXiv:1907.08743 [pdf, ps, other]

Domain Compression and its Application to Randomness-Optimal Distributed Goodness-of-Fit

Authors: Jayadev Acharya, Clément L. Canonne, Yanjun Han, Ziteng Sun, Himanshu Tyagi

Abstract: We study goodness-of-fit of discrete distributions in the distributed setting, where samples are divided between multiple users who can only release a limited amount of information about their samples due to various information constraints. Recently, a subset of the authors showed that having access to a common random seed (i.e., shared randomness) leads to a significant reduction in the sample co… ▽ More We study goodness-of-fit of discrete distributions in the distributed setting, where samples are divided between multiple users who can only release a limited amount of information about their samples due to various information constraints. Recently, a subset of the authors showed that having access to a common random seed (i.e., shared randomness) leads to a significant reduction in the sample complexity of this problem. In this work, we provide a complete understanding of the interplay between the amount of shared randomness available, the stringency of information constraints, and the sample complexity of the testing problem by characterizing a tight trade-off between these three parameters. We provide a general distributed goodness-of-fit protocol that as a function of the amount of shared randomness interpolates smoothly between the private- and public-coin sample complexities. We complement our upper bound with a general framework to prove lower bounds on the sample complexity of this testing problems under limited shared randomness. Finally, we instantiate our bounds for the two archetypal information constraints of communication and local privacy, and show that our sample complexity bounds are optimal as a function of all the parameters of the problem, including the amount of shared randomness. A key component of our upper bounds is a new primitive of domain compression, a tool that allows us to map distributions to a much smaller domain size while preserving their pairwise distances, using a limited amount of randomness. △ Less

Submitted 19 July, 2019; originally announced July 2019.

arXiv:1906.02746 [pdf, other]

Ranking and synchronization from pairwise measurements via SVD

Authors: Alexandre d'Aspremont, Mihai Cucuringu, Hemant Tyagi

Abstract: Given a measurement graph $G= (V,E)$ and an unknown signal $r \in \mathbb{R}^n$, we investigate algorithms for recovering $r$ from pairwise measurements of the form $r_i - r_j$; $\{i,j\} \in E$. This problem arises in a variety of applications, such as ranking teams in sports data and time synchronization of distributed networks. Framed in the context of ranking, the task is to recover the ranking… ▽ More Given a measurement graph $G= (V,E)$ and an unknown signal $r \in \mathbb{R}^n$, we investigate algorithms for recovering $r$ from pairwise measurements of the form $r_i - r_j$; $\{i,j\} \in E$. This problem arises in a variety of applications, such as ranking teams in sports data and time synchronization of distributed networks. Framed in the context of ranking, the task is to recover the ranking of $n$ teams (induced by $r$) given a small subset of noisy pairwise rank offsets. We propose a simple SVD-based algorithmic pipeline for both the problem of time synchronization and ranking. We provide a detailed theoretical analysis in terms of robustness against both sampling sparsity and noise perturbations with outliers, using results from matrix perturbation and random matrix theory. Our theoretical findings are complemented by a detailed set of numerical experiments on both synthetic and real data, showcasing the competitiveness of our proposed algorithms with other state-of-the-art methods. △ Less

Submitted 7 August, 2020; v1 submitted 6 June, 2019; originally announced June 2019.

Comments: 49 pages, 10 figures

arXiv:1905.08302 [pdf, ps, other]

Inference under Information Constraints II: Communication Constraints and Shared Randomness

Authors: Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi

Abstract: A central server needs to perform statistical inference based on samples that are distributed over multiple users who can each send a message of limited length to the center. We study problems of distribution learning and identity testing in this distributed inference setting and examine the role of shared randomness as a resource. We propose a general-purpose simulate-and-infer strategy that uses… ▽ More A central server needs to perform statistical inference based on samples that are distributed over multiple users who can each send a message of limited length to the center. We study problems of distribution learning and identity testing in this distributed inference setting and examine the role of shared randomness as a resource. We propose a general-purpose simulate-and-infer strategy that uses only private-coin communication protocols and is sample-optimal for distribution learning. This general strategy turns out to be sample-optimal even for distribution testing among private-coin protocols. Interestingly, we propose a public-coin protocol that outperforms simulate-and-infer for distribution testing and is, in fact, sample-optimal. Underlying our public-coin protocol is a random hash that when applied to the samples minimally contracts the chi-squared distance of their distribution to the uniform distribution. △ Less

Submitted 1 October, 2020; v1 submitted 20 May, 2019; originally announced May 2019.

Comments: To appear in IEEE Transactions on Information Theory. An abridged version of this work appeared in the 2019 International Conference on Machine Learning (ICML)

arXiv:1904.09563 [pdf, other]

Communication for Generating Correlation: A Unifying Survey

Authors: Madhu Sudan, Himanshu Tyagi, Shun Watanabe

Abstract: The task of manipulating correlated random variables in a distributed setting has received attention in the fields of both Information Theory and Computer Science. Often shared correlations can be converted, using a little amount of communication, into perfectly shared uniform random variables. Such perfect shared randomness, in turn, enables the solutions of many tasks. Even the reverse conversio… ▽ More The task of manipulating correlated random variables in a distributed setting has received attention in the fields of both Information Theory and Computer Science. Often shared correlations can be converted, using a little amount of communication, into perfectly shared uniform random variables. Such perfect shared randomness, in turn, enables the solutions of many tasks. Even the reverse conversion of perfectly shared uniform randomness into variables with a desired form of correlation turns out to be insightful and technically useful. In this survey article, we describe progress-to-date on such problems and lay out pertinent measures, achievability results, limits of performance, and point to new directions. △ Less

Submitted 2 October, 2019; v1 submitted 21 April, 2019; originally announced April 2019.

Comments: A review article to appear in IEEE Transactions on Information Theory

arXiv:1904.08575 [pdf, other]

SPONGE: A generalized eigenproblem for clustering signed networks

Authors: Mihai Cucuringu, Peter Davies, Aldo Glielmo, Hemant Tyagi

Abstract: We introduce a principled and theoretically sound spectral method for $k$-way clustering in signed graphs, where the affinity measure between nodes takes either positive or negative values. Our approach is motivated by social balance theory, where the task of clustering aims to decompose the network into disjoint groups, such that individuals within the same group are connected by as many positive… ▽ More We introduce a principled and theoretically sound spectral method for $k$-way clustering in signed graphs, where the affinity measure between nodes takes either positive or negative values. Our approach is motivated by social balance theory, where the task of clustering aims to decompose the network into disjoint groups, such that individuals within the same group are connected by as many positive edges as possible, while individuals from different groups are connected by as many negative edges as possible. Our algorithm relies on a generalized eigenproblem formulation inspired by recent work on constrained clustering. We provide theoretical guarantees for our approach in the setting of a signed stochastic block model, by leveraging tools from matrix perturbation theory and random matrix theory. An extensive set of numerical experiments on both synthetic and real data shows that our approach compares favorably with state-of-the-art methods for signed clustering, especially for large number of clusters and sparse measurement graphs. △ Less

Submitted 19 May, 2019; v1 submitted 17 April, 2019; originally announced April 2019.

Comments: 33 pages, 18 figures

Journal ref: AISTATS 2019

arXiv:1902.08970 [pdf, ps, other]

Secret Key Capacity For Multipleaccess Channel With Public Feedback

Authors: Himanshu Tyagi, Shun Watanabe

Abstract: We consider the generation of a secret key (SK) by the inputs and the output of a secure multipleaccess channel (MAC) that additionally have access to a noiseless public communication channel. Under specific restrictions on the protocols, we derive various upper bounds on the rate of such SKs. Specifically, if the public communication consists of only the feedback from the output terminal, then th… ▽ More We consider the generation of a secret key (SK) by the inputs and the output of a secure multipleaccess channel (MAC) that additionally have access to a noiseless public communication channel. Under specific restrictions on the protocols, we derive various upper bounds on the rate of such SKs. Specifically, if the public communication consists of only the feedback from the output terminal, then the rate of SKs that can be generated is bounded above by the maximum symmetric rate $R_f^\ast$ in the capacity region of the MAC with feedback. On the other hand, if the public communication is allowed only before and after the transmission over the MAC, then the rate of SKs is bounded above by the maximum symmetric rate $R^\ast$ in the capacity region of the MAC without feedback. Furthermore, for a symmetric MAC, we present a scheme that generates an SK of rate $R_f^\ast$, improving the best previously known achievable rate $R^\ast$. An application of our results establishes the SK capacity for adder MAC, without any restriction on the protocols. △ Less

Submitted 24 February, 2019; originally announced February 2019.

Comments: This paper was presented at Allerton 2013. We were hoping to extend the results further, but it is clear that this is not going to happen soon. Now is as good a time as any for sharing this on arXiv. The results will be of interest for anyone working on secret key agreement or common randomness generation

arXiv:1901.11105 [pdf, ps, other]

A New Proof of Nonsignalling Multiprover Parallel Repetition Theorem

Authors: Himanshu Tyagi, Shun Watanabe

Abstract: We present an information theoretic proof of the nonsignalling multiprover parallel repetition theorem, a recent extension of its two-prover variant that underlies many hardness of approximation results. The original proofs used de Finetti type decomposition for strategies. We present a new proof that is based on a technique we introduced recently for proving strong converse results in multiuser i… ▽ More We present an information theoretic proof of the nonsignalling multiprover parallel repetition theorem, a recent extension of its two-prover variant that underlies many hardness of approximation results. The original proofs used de Finetti type decomposition for strategies. We present a new proof that is based on a technique we introduced recently for proving strong converse results in multiuser information theory and entails a change of measure after replacing hard information constraints with soft ones. △ Less

Submitted 30 January, 2019; originally announced January 2019.

arXiv:1812.11476 [pdf, other]

Inference under Information Constraints I: Lower Bounds from Chi-Square Contraction

Authors: Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi

Abstract: Multiple players are each given one independent sample, about which they can only provide limited information to a central referee. Each player is allowed to describe its observed sample to the referee using a channel from a family of channels $\mathcal{W}$, which can be instantiated to capture both the communication- and privacy-constrained settings and beyond. The referee uses the messages from… ▽ More Multiple players are each given one independent sample, about which they can only provide limited information to a central referee. Each player is allowed to describe its observed sample to the referee using a channel from a family of channels $\mathcal{W}$, which can be instantiated to capture both the communication- and privacy-constrained settings and beyond. The referee uses the messages from players to solve an inference problem for the unknown distribution that generated the samples. We derive lower bounds for sample complexity of learning and testing discrete distributions in this information-constrained setting. Underlying our bounds is a characterization of the contraction in chi-square distances between the observed distributions of the samples when information constraints are placed. This contraction is captured in a local neighborhood in terms of chi-square and decoupled chi-square fluctuations of a given channel, two quantities we introduce. The former captures the average distance between distributions of channel output for two product distributions on the input, and the latter for a product distribution and a mixture of product distribution on the input. Our bounds are tight for both public- and private-coin protocols. Interestingly, the sample complexity of testing is order-wise higher when restricted to private-coin protocols. △ Less

Submitted 1 October, 2020; v1 submitted 30 December, 2018; originally announced December 2018.

Comments: To appear in IEEE Transactions on Information Theory

arXiv:1810.05561 [pdf, ps, other]

doi 10.1109/TIT.2020.2983151

Optimal Source Codes for Timely Updates

Authors: Prathamesh Mayekar, Parimal Parag, Himanshu Tyagi

Abstract: A transmitter observing a sequence of independent and identically distributed random variables seeks to keep a receiver updated about its latest observations. The receiver need not be apprised about each symbol seen by the transmitter, but needs to output a symbol at each time instant $t$. If at time $t$ the receiver outputs the symbol seen by the transmitter at time $U(t)\leq t$, the age of infor… ▽ More A transmitter observing a sequence of independent and identically distributed random variables seeks to keep a receiver updated about its latest observations. The receiver need not be apprised about each symbol seen by the transmitter, but needs to output a symbol at each time instant $t$. If at time $t$ the receiver outputs the symbol seen by the transmitter at time $U(t)\leq t$, the age of information at the receiver at time $t$ is $t-U(t)$. We study the design of lossless source codes that enable transmission with minimum average age at the receiver. We show that the asymptotic minimum average age can be attained up to a constant gap by the Shannon codes for a tilted version of the original pmf generating the symbols, which can be computed easily by solving an optimization problem. Furthermore, we exhibit an example with alphabet $\X$ where Shannon codes for the original pmf incur an asymptotic average age of a factor $O(\sqrt{\log |\X|})$ more than that achieved by our codes. Underlying our prescription for optimal codes is a new variational formula for integer moments of random variables, which may be of independent interest. Also, we discuss possible extensions of our formulation to randomized schemes and to the erasure channel, and include a treatment of the related problem of source coding for minimum average queuing delay. △ Less

Submitted 27 March, 2020; v1 submitted 12 October, 2018; originally announced October 2018.

Comments: Added a missing reference, in IEEE Transactions on Information Theory, 2020

Journal ref: IEEE Transactions on Information Theory, vol. 66, no. 6, pp. 3714--3731, June 2020

arXiv:1808.02174 [pdf, ps, other]

Test without Trust: Optimal Locally Private Distribution Testing

Authors: Jayadev Acharya, Clément L. Canonne, Cody Freitag, Himanshu Tyagi

Abstract: We study the problem of distribution testing when the samples can only be accessed using a locally differentially private mechanism and focus on two representative testing questions of identity (goodness-of-fit) and independence testing for discrete distributions. We are concerned with two settings: First, when we insist on using an already deployed, general-purpose locally differentially private… ▽ More We study the problem of distribution testing when the samples can only be accessed using a locally differentially private mechanism and focus on two representative testing questions of identity (goodness-of-fit) and independence testing for discrete distributions. We are concerned with two settings: First, when we insist on using an already deployed, general-purpose locally differentially private mechanism such as the popular RAPPOR or the recently introduced Hadamard Response for collecting data, and must build our tests based on the data collected via this mechanism; and second, when no such restriction is imposed, and we can design a bespoke mechanism specifically for testing. For the latter purpose, we introduce the Randomized Aggregated Private Testing Optimal Response (RAPTOR) mechanism which is remarkably simple and requires only one bit of communication per sample. We propose tests based on these mechanisms and analyze their sample complexities. Each proposed test can be implemented efficiently. In each case (barring one), we complement our performance bounds for algorithms with information-theoretic lower bounds and establish sample optimality of our proposed algorithm. A peculiar feature that emerges is that our sample-optimal algorithm based on RAPTOR uses public-coins, and any test based on RAPPOR or Hadamard Response, which are both private-coin mechanisms, requires significantly more samples. △ Less

Submitted 6 August, 2018; originally announced August 2018.

arXiv:1807.02862 [pdf, other]

Multi-kernel unmixing and super-resolution using the Modified Matrix Pencil method

Authors: Stéphane Chrétien, Hemant Tyagi

Abstract: Consider $L$ groups of point sources or spike trains, with the $l^{\text{th}}$ group represented by $x_l(t)$. For a function $g:\mathbb{R} \rightarrow \mathbb{R}$, let $g_l(t) = g(t/μ_l)$ denote a point spread function with scale $μ_l > 0$, and with $μ_1 < \cdots < μ_L$. With $y(t) = \sum_{l=1}^{L} (g_l \star x_l)(t)$, our goal is to recover the source parameters given samples of $y$, or given the… ▽ More Consider $L$ groups of point sources or spike trains, with the $l^{\text{th}}$ group represented by $x_l(t)$. For a function $g:\mathbb{R} \rightarrow \mathbb{R}$, let $g_l(t) = g(t/μ_l)$ denote a point spread function with scale $μ_l > 0$, and with $μ_1 < \cdots < μ_L$. With $y(t) = \sum_{l=1}^{L} (g_l \star x_l)(t)$, our goal is to recover the source parameters given samples of $y$, or given the Fourier samples of $y$. This problem is a generalization of the usual super-resolution setup wherein $L = 1$; we call this the multi-kernel unmixing super-resolution problem. Assuming access to Fourier samples of $y$, we derive an algorithm for this problem for estimating the source parameters of each group, along with precise non-asymptotic guarantees. Our approach involves estimating the group parameters sequentially in the order of increasing scale parameters, i.e., from group $1$ to $L$. In particular, the estimation process at stage $1 \leq l \leq L$ involves (i) carefully sampling the tail of the Fourier transform of $y$, (ii) a \emph{deflation} step wherein we subtract the contribution of the groups processed thus far from the obtained Fourier samples, and (iii) applying Moitra's modified Matrix Pencil method on a deconvolved version of the samples in (ii). △ Less

Submitted 7 January, 2020; v1 submitted 8 July, 2018; originally announced July 2018.

Comments: 50 pages, 10 figures, made notational changes and corrected typos after reviewer feedback, to appear in Journal of Fourier Analysis and Applications

arXiv:1805.04625 [pdf, ps, other]

Strong Converse using Change of Measure Arguments

Authors: Himanshu Tyagi, Shun Watanabe

Abstract: The strong converse for a coding theorem shows that the optimal asymptotic rate possible with vanishing error cannot be improved by allowing a fixed error. Building on a method introduced by Gu and Effros for centralized coding problems, we develop a general and simple recipe for proving strong converse that is applicable for distributed problems as well. Heuristically, our proof of strong convers… ▽ More The strong converse for a coding theorem shows that the optimal asymptotic rate possible with vanishing error cannot be improved by allowing a fixed error. Building on a method introduced by Gu and Effros for centralized coding problems, we develop a general and simple recipe for proving strong converse that is applicable for distributed problems as well. Heuristically, our proof of strong converse mimics the standard steps for proving a weak converse, except that we apply those steps to a modified distribution obtained by conditioning the original distribution on the event that no error occurs. A key component of our recipe is the replacement of the hard Markov constraints implied by the distributed nature of the problem with a soft information cost using a variational formula introduced by Oohama. We illustrate our method by providing a short proof of the strong converse for the Wyner-Ziv problem and strong converse theorems for interactive function computation, common randomness and secret key agreement, and the wiretap channel; the latter three strong converse problems were open prior to this work. △ Less

Submitted 21 August, 2019; v1 submitted 11 May, 2018; originally announced May 2018.

Comments: 35 pages, no figure; v2 updated references

arXiv:1804.06952 [pdf, ps, other]

Distributed Simulation and Distributed Inference

Authors: Jayadev Acharya, Clément L. Canonne, Himanshu Tyagi

Abstract: Independent samples from an unknown probability distribution $\bf p$ on a domain of size $k$ are distributed across $n$ players, with each player holding one sample. Each player can communicate $\ell$ bits to a central referee in a simultaneous message passing model of communication to help the referee infer a property of the unknown $\bf p$. What is the least number of players for inference requi… ▽ More Independent samples from an unknown probability distribution $\bf p$ on a domain of size $k$ are distributed across $n$ players, with each player holding one sample. Each player can communicate $\ell$ bits to a central referee in a simultaneous message passing model of communication to help the referee infer a property of the unknown $\bf p$. What is the least number of players for inference required in the communication-starved setting of $\ell<\log k$? We begin by exploring a general "simulate-and-infer" strategy for such inference problems where the center simulates the desired number of samples from the unknown distribution and applies standard inference algorithms for the collocated setting. Our first result shows that for $\ell<\log k$ perfect simulation of even a single sample is not possible. Nonetheless, we present a Las Vegas algorithm that simulates a single sample from the unknown distribution using $O(k/2^\ell)$ samples in expectation. As an immediate corollary, we get that simulate-and-infer attains the optimal sample complexity of $Θ(k^2/2^\ellε^2)$ for learning the unknown distribution to total variation distance $ε$. For the prototypical testing problem of identity testing, simulate-and-infer works with $O(k^{3/2}/2^\ellε^2)$ samples, a requirement that seems to be inherent for all communication protocols not using any additional resources. Interestingly, we can break this barrier using public coins. Specifically, we exhibit a public-coin communication protocol that performs identity testing using $O(k/\sqrt{2^\ell}ε^2)$ samples. Furthermore, we show that this is optimal up to constant factors. Our theoretically sample-optimal protocol is easy to implement in practice. Our proof of lower bound entails showing a contraction in $χ^2$ distance of product distributions due to communication constraints and may be of independent interest. △ Less

Submitted 23 May, 2019; v1 submitted 18 April, 2018; originally announced April 2018.

Comments: This work is superseded by the more recent "Inference under Information Constraints II: Communication Constraints and Shared Randomness" (arXiv:1905.08302), by the same authors

arXiv:1804.01490 [pdf, ps, other]

Sparse non-negative super-resolution -- simplified and stabilised

Authors: Armin Eftekhari, Jared Tanner, Andrew Thompson, Bogdan Toader, Hemant Tyagi

Abstract: The convolution of a discrete measure, $x=\sum_{i=1}^ka_iδ_{t_i}$, with a local window function, $φ(s-t)$, is a common model for a measurement device whose resolution is substantially lower than that of the objects being observed. Super-resolution concerns localising the point sources $\{a_i,t_i\}_{i=1}^k$ with an accuracy beyond the essential support of $φ(s-t)$, typically from $m$ samples… ▽ More The convolution of a discrete measure, $x=\sum_{i=1}^ka_iδ_{t_i}$, with a local window function, $φ(s-t)$, is a common model for a measurement device whose resolution is substantially lower than that of the objects being observed. Super-resolution concerns localising the point sources $\{a_i,t_i\}_{i=1}^k$ with an accuracy beyond the essential support of $φ(s-t)$, typically from $m$ samples $y(s_j)=\sum_{i=1}^k a_iφ(s_j-t_i)+η_j$, where $η_j$ indicates an inexactness in the sample value. We consider the setting of $x$ being non-negative and seek to characterise all non-negative measures approximately consistent with the samples. We first show that $x$ is the unique non-negative measure consistent with the samples provided the samples are exact, i.e. $η_j=0$, $m\ge 2k+1$ samples are available, and $φ(s-t)$ generates a Chebyshev system. This is independent of how close the sample locations are and {\em does not rely on any regulariser beyond non-negativity}; as such, it extends and clarifies the work by Schiebinger et al. and De Castro et al., who achieve the same results but require a total variation regulariser, which we show is unnecessary. Moreover, we characterise non-negative solutions $\hat{x}$ consistent with the samples within the bound $\sum_{j=1}^mη_j^2\le δ^2$. Any such non-negative measure is within ${\mathcal O}(δ^{1/7})$ of the discrete measure $x$ generating the samples in the generalised Wasserstein distance, converging to one another as $δ$ approaches zero. We also show how to make these general results, for windows that form a Chebyshev system, precise for the case of $φ(s-t)$ being a Gaussian window. The main innovation of these results is that non-negativity alone is sufficient to localise point sources beyond the essential sensor resolution. △ Less

Submitted 26 November, 2019; v1 submitted 4 April, 2018; originally announced April 2018.

Comments: 59 pages, 7 figures

arXiv:1605.01033 [pdf, other]

Universal Multiparty Data Exchange and Secret Key Agreement

Authors: Himanshu Tyagi, Shun Watanabe

Abstract: Multiple parties observing correlated data seek to recover each other's data and attain omniscience. To that end, they communicate interactively over a noiseless broadcast channel - each bit transmitted over this channel is received by all the parties. We give a universal interactive communication protocol, termed the recursive data exchange protocol (RDE), which attains omniscience for any sequen… ▽ More Multiple parties observing correlated data seek to recover each other's data and attain omniscience. To that end, they communicate interactively over a noiseless broadcast channel - each bit transmitted over this channel is received by all the parties. We give a universal interactive communication protocol, termed the recursive data exchange protocol (RDE), which attains omniscience for any sequence of data observed by the parties and provide an individual sequence guarantee of performance. As a by-product, for observations of length $n$, we show the universal rate optimality of RDE up to an $\mathcal{O}\left(n^{-\frac 12}\sqrt{\log n}\right)$ term in a generative setting where the data sequence is independent and identically distributed (in time). Furthermore, drawing on the duality between omniscience and secret key agreement due to Csiszar and Narayan, we obtain a universal protocol for generating a multiparty secret key of rate at most $\mathcal{O}\left(n^{-\frac 12}\sqrt{\log n}\right)$ less than the maximum rate possible. A key feature of RDE is its recursive structure whereby when a subset $A$ of parties recover each-other's data, the rates appear as if the parties have been executing the protocol in an alternative model where the parties in $A$ are collocated. △ Less

Submitted 23 January, 2017; v1 submitted 3 May, 2016; originally announced May 2016.

arXiv:1605.00609 [pdf, other]

Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions

Authors: Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner, Andreas Krause

Abstract: A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ$'s, $\mathcal{S}$ to be unknown, there exists extensive work for estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs, that also allow… ▽ More A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ$'s, $\mathcal{S}$ to be unknown, there exists extensive work for estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs, that also allows for the presence of a sparse number of second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, with $|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$, the function $f$ is now assumed to be of the form: $\sum_{p \in \mathcal{S}_1}φ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}φ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $\mathcal{S}_1,\mathcal{S}_2$ with finite sample bounds. Our analysis covers the noiseless setting where exact samples of $f$ are obtained, and also extends to the noisy setting where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models namely: i.i.d Gaussian noise and arbitrary but bounded noise. Our main methods for identification of $\mathcal{S}_2$ essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed sensing based schemes. Once $\mathcal{S}_1, \mathcal{S}_2$ are known, we show how the individual components $φ_p$, $φ_{(l,l^{\prime})}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings. △ Less

Submitted 8 May, 2017; v1 submitted 2 May, 2016; originally announced May 2016.

Comments: To appear in Information and Inference: A Journal of the IMA. Made following changes after review process: (a) Corrected typos throughout the text. (b) Corrected choice of sampling distribution in Section 5, see eqs. (5.2), (5.3). (c) More detailed comparison with existing work in Section 8. (d) Added Section B in appendix on roots of cubic equation

arXiv:1604.05307 [pdf, other]

Learning Sparse Additive Models with Interactions in High Dimensions

Authors: Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner, Andreas Krause

Abstract: A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ_l$'s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized… ▽ More A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is referred to as a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}φ_{l}(x_l)$, where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $φ_l$'s and $\mathcal{S}$ to be unknown, the problem of estimating $f$ from its samples has been studied extensively. In this work, we consider a generalized SPAM, allowing for second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, the function $f$ is assumed to be of the form: $$f(\mathbf{x}) = \sum_{p \in \mathcal{S}_1}φ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}φ_{(l,l^{\prime})} (x_{l},x_{l^{\prime}}).$$ Assuming $φ_{p},φ_{(l,l^{\prime})}$, $\mathcal{S}_1$ and, $\mathcal{S}_2$ to be unknown, we provide a randomized algorithm that queries $f$ and exactly recovers $\mathcal{S}_1,\mathcal{S}_2$. Consequently, this also enables us to estimate the underlying $φ_p, φ_{(l,l^{\prime})}$. We derive sample complexity bounds for our scheme and also extend our analysis to include the situation where the queries are corrupted with noise -- either stochastic, or arbitrary but bounded. Lastly, we provide simulation results on synthetic data, that validate our theoretical findings. △ Less

Submitted 18 April, 2016; originally announced April 2016.

Comments: 23 pages, to appear in Proceedings of the 19th International Conference on Artificial Intelligence and Statistics (AISTATS) 2016

arXiv:1601.03617 [pdf, other]

Interactive Communication for Data Exchange

Authors: Himanshu Tyagi, Pramod Viswanath, Shun Watanabe

Abstract: Two parties observing correlated data seek to exchange their data using interactive communication. How many bits must they communicate? We propose a new interactive protocol for data exchange which increases the communication size in steps until the task is done. Next, we derive a lower bound on the minimum number of bits that is based on relating the data exchange problem to the secret key agreem… ▽ More Two parties observing correlated data seek to exchange their data using interactive communication. How many bits must they communicate? We propose a new interactive protocol for data exchange which increases the communication size in steps until the task is done. Next, we derive a lower bound on the minimum number of bits that is based on relating the data exchange problem to the secret key agreement problem. Our single-shot analysis applies to all discrete random variables and yields upper and lower bound of a similar form. In fact, the bounds are asymptotically tight and lead to a characterization of the optimal rate of communication needed for data exchange for a general sequence such as mixture of IID random variables as well as the optimal second-order asymptotic term in the length of communication needed for data exchange for the IID random variables, when the probability of error is fixed. This gives a precise characterization of the asymptotic reduction in the length of optimal communication due to interaction; in particular, two-sided Slepian-Wolf compression is strictly suboptimal. △ Less

Submitted 11 November, 2017; v1 submitted 14 January, 2016; originally announced January 2016.

Comments: 13 pages (two column), 4 figures, a longer version of ISIT 2015

arXiv:1504.05666 [pdf, other]

Information Complexity Density and Simulation of Protocols

Authors: Himanshu Tyagi, Shaileshh Venkatakrishnan, Pramod Viswanath, Shun Watanabe

Abstract: Two parties observing correlated random variables seek to run an interactive communication protocol. How many bits must they exchange to simulate the protocol, namely to produce a view with a joint distribution within a fixed statistical distance of the joint distribution of the input and the transcript of the original protocol? We present an information spectrum approach for this problem whereby… ▽ More Two parties observing correlated random variables seek to run an interactive communication protocol. How many bits must they exchange to simulate the protocol, namely to produce a view with a joint distribution within a fixed statistical distance of the joint distribution of the input and the transcript of the original protocol? We present an information spectrum approach for this problem whereby the information complexity of the protocol is replaced by its information complexity density. Our single-shot bounds relate the communication complexity of simulating a protocol to tail bounds for information complexity density. As a consequence, we obtain a strong converse and characterize the second-order asymptotic term in communication complexity for indepedent and identically distributed observation sequences. Furthermore, we obtain a general formula for the rate of communication complexity which applies to any sequence of observations and protocols. Connections with results from theoretical computer science and implications for the function computation problem are discussed. △ Less

Submitted 29 January, 2016; v1 submitted 22 April, 2015; originally announced April 2015.

Comments: Submitted to the IEEE Transactions on Information Theory

arXiv:1412.4958 [pdf, other]

Universal Hashing for Information Theoretic Security

Authors: Himanshu Tyagi, Alexander Vardy

Abstract: The information theoretic approach to security entails harnessing the correlated randomness available in nature to establish security. It uses tools from information theory and coding and yields provable security, even against an adversary with unbounded computational power. However, the feasibility of this approach in practice depends on the development of efficiently implementable schemes. In th… ▽ More The information theoretic approach to security entails harnessing the correlated randomness available in nature to establish security. It uses tools from information theory and coding and yields provable security, even against an adversary with unbounded computational power. However, the feasibility of this approach in practice depends on the development of efficiently implementable schemes. In this article, we review a special class of practical schemes for information theoretic security that are based on 2-universal hash families. Specific cases of secret key agreement and wiretap coding are considered, and general themes are identified. The scheme presented for wiretap coding is modular and can be implemented easily by including an extra pre-processing layer over the existing transmission codes. △ Less

Submitted 22 March, 2016; v1 submitted 16 December, 2014; originally announced December 2014.

Comments: Corrected an error in the proof of Lemma 6

Showing 1–50 of 62 results for author: Tyagi, H