Privacy-Preserving In-Context Learning with Differentially Private Few-Shot Generation

Authors: Xinyu Tang, Richard Shin, Huseyin A. Inan, Andre Manoel, Fatemehsadat Mireshghallah, Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni, Robert Sim

Abstract: We study the problem of in-context learning (ICL) with large language models (LLMs) on private datasets. This scenario poses privacy risks, as LLMs may leak or regurgitate the private examples demonstrated in the prompt. We propose a novel algorithm that generates synthetic few-shot demonstrations from the private dataset with formal differential privacy (DP) guarantees, and show empirically that it can achieve effective ICL. We conduct extensive experiments on standard benchmarks and compare our algorithm with non-private ICL and zero-shot solutions. Our results demonstrate that our algorithm can achieve competitive performance with strong privacy levels. These results open up new possibilities for ICL with privacy protection for a broad range of applications.

Submitted 27 January, 2024; v1 submitted 20 September, 2023; originally announced September 2023.

arXiv:2307.11899 [pdf, other]

Project Florida: Federated Learning Made Easy

Authors: Daniel Madrigal Diaz, Andre Manoel, Jialei Chen, Nalin Singal, Robert Sim

Abstract: We present Project Florida, a system architecture and software development kit (SDK) enabling deployment of large-scale Federated Learning (FL) solutions across a heterogeneous device ecosystem. Federated learning is an approach to machine learning based on a strong data sovereignty principle, i.e., that privacy and security of data is best enabled by storing it at its origin, whether on end-user devices or in segregated cloud storage silos. Federated learning enables model training across devices and silos while the training data remains within its security boundary, by distributing a model snapshot to a client running inside the boundary, running client code to update the model, and then aggregating updated snapshots across many clients in a central orchestrator. Deploying a FL solution requires implementation of complex privacy and security mechanisms as well as scalable orchestration infrastructure. Scale and performance is a paramount concern, as the model training process benefits from full participation of many client devices, which may have a wide variety of performance characteristics. Project Florida aims to simplify the task of deploying cross-device FL solutions by providing cloud-hosted infrastructure and accompanying task management interfaces, as well as a multi-platform SDK supporting most major programming languages including C++, Java, and Python, enabling FL training across a wide range of operating system (OS) and hardware specifications. The architecture decouples service management from the FL workflow, enabling a cloud service provider to deliver FL-as-a-service (FLaaS) to ML engineers and application developers. We present an overview of Florida, including a description of the architecture, sample code, and illustrative experiments demonstrating system capabilities.

Submitted 21 July, 2023; originally announced July 2023.
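
The training loop summarized in the Project Florida abstract (distribute a model snapshot, update it inside each client's boundary, aggregate the updated snapshots centrally) can be illustrated with a toy sketch. This is not the Florida SDK: the names `client_update`, `aggregate`, and `federated_round`, the two-parameter linear model, and the sample-count weighting are all assumptions chosen for illustration.

```python
from typing import List, Tuple

Model = List[float]                  # [bias, slope] of the toy model y = bias + slope * x
Dataset = List[Tuple[float, float]]  # local (x, y) pairs; they never leave the client


def client_update(snapshot: Model, data: Dataset,
                  lr: float = 0.05, steps: int = 20) -> Model:
    """Run a few local gradient steps on a copy of the global snapshot."""
    bias, slope = snapshot
    for _ in range(steps):
        errors = [bias + slope * x - y for x, y in data]
        grad_bias = sum(errors) / len(data)
        grad_slope = sum(e * x for e, (x, _) in zip(errors, data)) / len(data)
        bias, slope = bias - lr * grad_bias, slope - lr * grad_slope
    return [bias, slope]


def aggregate(updates: List[Model], sizes: List[int]) -> Model:
    """Average client snapshots, weighting each by its local sample count."""
    total = sum(sizes)
    return [sum(u[i] * n for u, n in zip(updates, sizes)) / total
            for i in range(len(updates[0]))]


def federated_round(global_model: Model, clients: List[Dataset]) -> Model:
    """One round: send the snapshot out, collect updates, aggregate them."""
    updates = [client_update(global_model, data) for data in clients]
    return aggregate(updates, [len(data) for data in clients])


# Two hypothetical clients whose data follow y = 1 + 2x on different x ranges.
clients = [
    [(0.0, 1.0), (1.0, 3.0)],
    [(2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
]
model = [0.0, 0.0]
for _ in range(100):
    model = federated_round(model, clients)
```

Only model parameters cross the client boundary in this sketch; the orchestrator sees the returned snapshots and the sample counts, never the `(x, y)` pairs.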

arXiv:2301.02344 [pdf, other]

TrojanPuzzle: Covertly Poisoning Code-Suggestion Models

Authors: Hojjat Aghakhani, Wei Dai, Andre Manoel, Xavier Fernandes, Anant Kharkar, Christopher Kruegel, Giovanni Vigna, David Evans, Ben Zorn, Robert Sim

Abstract: With tools like GitHub Copilot, automatic code suggestion is no longer a dream in software engineering. These tools, based on large language models, are typically trained on massive corpora of code mined from unvetted public sources. As a result, these models are susceptible to data poisoning attacks where an adversary manipulates the model's training by injecting malicious data. Poisoning attacks could be designed to influence the model's suggestions at run time for chosen contexts, such as inducing the model into suggesting insecure code payloads. To achieve this, prior attacks explicitly inject the insecure code payload into the training data, making the poison data detectable by static analysis tools that can remove such malicious data from the training set. In this work, we demonstrate two novel attacks, COVERT and TROJANPUZZLE, that can bypass static analysis by planting malicious poison data in out-of-context regions such as docstrings. Our most novel attack, TROJANPUZZLE, goes one step further in generating less suspicious poison data by never explicitly including certain (suspicious) parts of the payload in the poison data, while still inducing a model that suggests the entire payload when completing code (i.e., outside docstrings). This makes TROJANPUZZLE robust against signature-based dataset-cleansing methods that can filter out suspicious sequences from the training data. Our evaluation against models of two sizes demonstrates that both COVERT and TROJANPUZZLE have significant implications for practitioners when selecting code used to train or tune code-suggestion models.

Submitted 24 January, 2024; v1 submitted 5 January, 2023; originally announced January 2023.

arXiv:2211.09722 [pdf, other]

Federated Multilingual Models for Medical Transcript Analysis

Authors: Andre Manoel, Mirian Hipolito Garcia, Tal Baumel, Shize Su, Jialei Chen, Dan Miller, Danny Karmon, Robert Sim, Dimitrios Dimitriadis

Abstract: Federated Learning (FL) is a novel machine learning approach that allows the model trainer to access more data samples, by training the model across multiple decentralized data sources, while data access constraints are in place. Such trained models can achieve significantly higher performance beyond what can be done when trained on a single data source. As part of FL's promises, none of the training data is ever transmitted to any central location, ensuring that sensitive data remains local and private. These characteristics make FL perfectly suited for large-scale applications in healthcare, where a variety of compliance constraints restrict how data may be handled, processed, and stored. Despite the apparent benefits of federated learning, the heterogeneity in the local data distributions poses significant challenges, and such challenges are even more pronounced in the case of multilingual data providers. In this paper we present a federated learning system for training a large-scale multi-lingual model suitable for fine-tuning on downstream tasks such as medical entity tagging. Our work represents one of the first such production-scale systems, capable of training across multiple highly heterogeneous data providers, and achieving levels of accuracy that could not be otherwise achieved by using central training with public data. Finally, we show that the global model performance can be further improved by a training step performed locally.

Submitted 3 November, 2022; originally announced November 2022.

arXiv:2204.12703 [pdf, other]

Heterogeneous Ensemble Knowledge Transfer for Training Large Models in Federated Learning

Authors: Yae Jee Cho, Andre Manoel, Gauri Joshi, Robert Sim, Dimitrios Dimitriadis

Abstract: Federated learning (FL) enables edge-devices to collaboratively learn a model without disclosing their private data to a central aggregating server. Most existing FL algorithms require models of identical architecture to be deployed across the clients and server, making it infeasible to train large models due to clients' limited system resources. In this work, we propose a novel ensemble knowledge transfer method named Fed-ET in which small models (different in architecture) are trained on clients, and used to train a larger model at the server. Unlike in conventional ensemble learning, in FL the ensemble can be trained on clients' highly heterogeneous data. Cognizant of this property, Fed-ET uses a weighted consensus distillation scheme with diversity regularization that efficiently extracts reliable consensus from the ensemble while improving generalization by exploiting the diversity within the ensemble. We show the generalization bound for the ensemble of weighted models trained on heterogeneous datasets that supports the intuition of Fed-ET. Our experiments on image and language tasks show that Fed-ET significantly outperforms other state-of-the-art FL algorithms with fewer communicated parameters, and is also robust against high data-heterogeneity.

Submitted 27 April, 2022; originally announced April 2022.

Comments: To appear in the proceedings of the 31st International Joint Conference on Artificial Intelligence (IJCAI 2022)

arXiv:2203.13789 [pdf, other]

FLUTE: A Scalable, Extensible Framework for High-Performance Federated Learning Simulations

Authors: Mirian Hipolito Garcia, Andre Manoel, Daniel Madrigal Diaz, Fatemehsadat Mireshghallah, Robert Sim, Dimitrios Dimitriadis

Abstract: In this paper we introduce "Federated Learning Utilities and Tools for Experimentation" (FLUTE), a high-performance open-source platform for federated learning research and offline simulations. The goal of FLUTE is to enable rapid prototyping and simulation of new federated learning algorithms at scale, including novel optimization, privacy, and communications strategies. We describe the architecture of FLUTE, enabling arbitrary federated modeling schemes to be realized. We compare the platform with other state-of-the-art platforms and describe available features of FLUTE for experimentation in core areas of active research, such as optimization, privacy, and scalability. A comparison with other established platforms shows speed-ups of up to 42x and savings in memory footprint of 3x. A sample of the platform capabilities is also presented for a range of tasks, as well as other functionality, such as linear scaling for the number of participating clients, and a variety of federated optimizers, including FedAdam, DGA, etcetera.

Submitted 14 November, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

Comments: 14 Pages, 3 Figures, 11 Tables

arXiv:2110.06500 [pdf, other]

Differentially Private Fine-tuning of Language Models

Authors: Da Yu, Saurabh Naik, Arturs Backurs, Sivakanth Gopi, Huseyin A. Inan, Gautam Kamath, Janardhan Kulkarni, Yin Tat Lee, Andre Manoel, Lukas Wutschitz, Sergey Yekhanin, Huishuai Zhang

Abstract: We give simpler, sparser, and faster algorithms for differentially private fine-tuning of large-scale pre-trained language models, which achieve the state-of-the-art privacy versus utility tradeoffs on many standard NLP tasks. We propose a meta-framework for this problem, inspired by the recent success of highly parameter-efficient methods for fine-tuning. Our experiments show that differentially private adaptations of these approaches outperform previous private algorithms in three important dimensions: utility, privacy, and the computational and memory cost of private training. On many commonly studied datasets, the utility of private models approaches that of non-private models. For example, on the MNLI dataset we achieve an accuracy of $87.8\%$ using RoBERTa-Large and $83.5\%$ using RoBERTa-Base with a privacy budget of $ε = 6.7$. In comparison, absent privacy constraints, RoBERTa-Large achieves an accuracy of $90.2\%$. Our findings are similar for natural language generation tasks. Privately fine-tuning with DART, GPT-2-Small, GPT-2-Medium, GPT-2-Large, and GPT-2-XL achieve BLEU scores of 38.5, 42.0, 43.1, and 43.8 respectively (privacy budget of $ε = 6.8$, $δ = 10^{-5}$) whereas the non-private baseline is $48.1$. All our experiments suggest that larger models are better suited for private fine-tuning: while they are well known to achieve superior accuracy non-privately, we find that they also better maintain their accuracy when privacy is introduced.

Submitted 14 July, 2022; v1 submitted 13 October, 2021; originally announced October 2021.

Comments: ICLR 2022. Code available at https://github.com/huseyinatahaninan/Differentially-Private-Fine-tuning-of-Language-Models

arXiv:2006.08997 [pdf, other]

Federated Survival Analysis with Discrete-Time Cox Models

Authors: Mathieu Andreux, Andre Manoel, Romuald Menuet, Charlie Saillard, Chloé Simpson

Abstract: Building machine learning models from decentralized datasets located in different centers with federated learning (FL) is a promising approach to circumvent local data scarcity while preserving privacy. However, the prominent Cox proportional hazards (PH) model, used for survival analysis, does not fit the FL framework, as its loss function is non-separable with respect to the samples. The naïve method to bypass this non-separability consists in calculating the losses per center, and minimizing their sum as an approximation of the true loss. We show that the resulting model may suffer from important performance loss in some adverse settings. Instead, we leverage the discrete-time extension of the Cox PH model to formulate survival analysis as a classification problem with a separable loss function. Using this approach, we train survival models using standard FL techniques on synthetic data, as well as real-world datasets from The Cancer Genome Atlas (TCGA), showing similar performance to a Cox PH model trained on aggregated data. Compared to previous works, the proposed method is more communication-efficient, more generic, and more amenable to using privacy-preserving techniques.

Submitted 16 June, 2020; originally announced June 2020.

Comments: 21 pages, 6 figures

Journal ref: International Workshop on Federated Learning for User Privacy and Data Confidentiality in Conjunction with ICML 2020 (FL-ICML'20)
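
The abstract above recasts survival analysis as classification with a loss that separates over samples. A textbook way to see why this helps is the person-period expansion sketched below; it is a generic discrete-time hazard construction and not necessarily the exact loss used in the paper. The function names, the shared per-bin hazard (no covariates), and the censoring convention (a censored subject counts as having survived its last observed bin) are assumptions made for illustration.

```python
import math
from typing import List, Tuple

Subject = Tuple[int, bool]  # (last observed time bin, True if the event occurred there)


def person_period_rows(last_bin: int, event: bool) -> List[Tuple[int, int]]:
    """Expand one subject into (bin, label) rows: 0 = survived the bin, 1 = event."""
    rows = [(j, 0) for j in range(last_bin)]
    rows.append((last_bin, 1 if event else 0))
    return rows


def subject_loss(hazard: List[float], last_bin: int, event: bool) -> float:
    """Binary cross-entropy over one subject's rows.

    hazard[j] is the modeled probability of the event in bin j,
    given survival up to bin j.
    """
    loss = 0.0
    for j, label in person_period_rows(last_bin, event):
        h = hazard[j]
        loss -= math.log(h) if label else math.log(1.0 - h)
    return loss


def center_loss(hazard: List[float], subjects: List[Subject]) -> float:
    """A center's loss is a plain sum over its own subjects."""
    return sum(subject_loss(hazard, t, e) for t, e in subjects)


hazard = [0.1, 0.2, 0.3]                       # hypothetical per-bin hazards
center_a = [(2, True), (1, False)]             # hypothetical subjects at center A
center_b = [(0, True), (2, False), (1, True)]  # hypothetical subjects at center B

pooled = center_loss(hazard, center_a + center_b)
summed = center_loss(hazard, center_a) + center_loss(hazard, center_b)
```

Because each subject contributes its own term, the pooled loss equals the sum of the per-center losses (`pooled` and `summed` agree up to floating-point rounding), which is the property standard FL aggregation relies on and which the Cox partial likelihood lacks.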

arXiv:1912.06015 [pdf, other]

Efficient Per-Example Gradient Computations in Convolutional Neural Networks

Authors: Gaspar Rochette, Andre Manoel, Eric W. Tramel

Abstract: Deep learning frameworks leverage GPUs to perform massively-parallel computations over batches of many training examples efficiently. However, for certain tasks, one may be interested in performing per-example computations, for instance using per-example gradients to evaluate a quantity of interest unique to each example. One notable application comes from the field of differential privacy, where per-example gradients must be norm-bounded in order to limit the impact of each example on the aggregated batch gradient. In this work, we discuss how per-example gradients can be efficiently computed in convolutional neural networks (CNNs). We compare existing strategies by performing a few steps of differentially-private training on CNNs of varying sizes. We also introduce a new strategy for per-example gradient calculation, which is shown to be advantageous depending on the model architecture and how the model is trained. This is a first step in making differentially-private training of CNNs practical.

Submitted 12 December, 2019; originally announced December 2019.

Journal ref: Theory and Practice of Differential Privacy (TPDP) workshop at CCS 2020
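
The abstract above notes that differentially private training needs each per-example gradient to be norm-bounded before aggregation. The sketch below shows that clipping step on plain Python lists. It does not reproduce the paper's efficient CNN strategies; the function names are hypothetical, and the Gaussian noise added to the clipped sum follows the usual DP-SGD recipe, which is an assumption here since the abstract only mentions the norm bound.

```python
import math
import random
from typing import List


def clip(grad: List[float], max_norm: float) -> List[float]:
    """Rescale one example's gradient so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_batch_gradient(per_example_grads: List[List[float]],
                           max_norm: float,
                           noise_multiplier: float,
                           rng: random.Random) -> List[float]:
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, max_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise scale is tied to the clipping bound (assumed DP-SGD convention).
    noisy = [s + rng.gauss(0.0, noise_multiplier * max_norm) for s in summed]
    return [v / len(clipped) for v in noisy]


# Hypothetical gradients: the first has norm 5 and is clipped, the second has norm 0.5.
grads = [[3.0, 4.0], [0.3, 0.4]]
noiseless = private_batch_gradient(grads, max_norm=1.0,
                                   noise_multiplier=0.0, rng=random.Random(0))
```

After clipping, no single example can move the summed gradient by more than `max_norm`, which is what lets the noise scale be calibrated to that bound. Computing the per-example gradients themselves efficiently is the part the paper addresses.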

arXiv:1809.06304 [pdf, other]

Approximate message-passing for convex optimization with non-separable penalties

Authors: Andre Manoel, Florent Krzakala, Gaël Varoquaux, Bertrand Thirion, Lenka Zdeborová

Abstract: We introduce an iterative optimization scheme for convex objectives consisting of a linear loss and a non-separable penalty, based on the expectation-consistent approximation and the vector approximate message-passing (VAMP) algorithm. Specifically, the penalties we approach are convex on a linear transformation of the variable to be determined, a notable example being total variation (TV). We describe the connection between message-passing algorithms -- typically used for approximate inference -- and proximal methods for optimization, and show that our scheme is, as VAMP, similar in nature to the Peaceman-Rachford splitting, with the important difference that stepsizes are set adaptively. Finally, we benchmark the performance of our VAMP-like iteration in problems where TV penalties are useful, namely classification in task fMRI and reconstruction in tomography, and show faster convergence than that of state-of-the-art approaches such as FISTA and ADMM in most settings.

Submitted 17 September, 2018; originally announced September 2018.

Comments: 18 pages, 6 figures

arXiv:1805.09785 [pdf, other]

doi 10.1088/1742-5468/ab3430

Entropy and mutual information in models of deep neural networks

Authors: Marylou Gabrié, Andre Manoel, Clément Luneau, Jean Barbier, Nicolas Macris, Florent Krzakala, Lenka Zdeborová

Abstract: We examine a class of deep learning models with a tractable method to compute information-theoretic quantities. Our contributions are three-fold: (i) We show how entropies and mutual informations can be derived from heuristic statistical physics methods, under the assumption that weight matrices are independent and orthogonally-invariant. (ii) We extend particular cases in which this result is known to be rigorously exact by providing a proof for two-layers networks with Gaussian random weights, using the recently introduced adaptive interpolation method. (iii) We propose an experiment framework with generative models of synthetic datasets, on which we train deep neural networks with a weight constraint designed so that the assumption in (i) is verified during learning. We study the behavior of entropies and mutual informations throughout learning and conclude that, in the proposed setting, the relationship between compression and generalization remains elusive.

Submitted 29 October, 2018; v1 submitted 24 May, 2018; originally announced May 2018.

Journal ref: J. Stat. Mech. (2019) 124014. & NeurIPS 2018

arXiv:1706.00705 [pdf, other]

doi 10.1109/ALLERTON.2017.8262853

Streaming Bayesian inference: theoretical limits and mini-batch approximate message-passing

Authors: Andre Manoel, Florent Krzakala, Eric W. Tramel, Lenka Zdeborová

Abstract: In statistical learning for real-world large-scale data problems, one must often resort to "streaming" algorithms which operate sequentially on small batches of data. In this work, we present an analysis of the information-theoretic limits of mini-batch inference in the context of generalized linear models and low-rank matrix factorization. In a controlled Bayes-optimal setting, we characterize the optimal performance and phase transitions as a function of mini-batch size. We base part of our results on a detailed analysis of a mini-batch version of the approximate message-passing algorithm (Mini-AMP), which we introduce. Additionally, we show that this theoretical optimality carries over into real-data problems by illustrating that Mini-AMP is competitive with standard streaming algorithms for clustering.

Submitted 2 June, 2017; originally announced June 2017.

Comments: 19 pages, 4 figures

Journal ref: 2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton), Monticello, IL, USA, 2017, pp. 1048-1055

arXiv:1702.03260 [pdf, other]

doi 10.1103/PhysRevX.8.041006

A Deterministic and Generalized Framework for Unsupervised Learning with Restricted Boltzmann Machines

Authors: Eric W. Tramel, Marylou Gabrié, Andre Manoel, Francesco Caltagirone, Florent Krzakala

Abstract: Restricted Boltzmann machines (RBMs) are energy-based neural-networks which are commonly used as the building blocks for deep neural architectures. In this work, we derive a deterministic framework for the training, evaluation, and use of RBMs based upon the Thouless-Anderson-Palmer (TAP) mean-field approximation of widely-connected systems with weak interactions coming from spin-glass theory. While the TAP approach has been extensively studied for fully-visible binary spin systems, our construction is generalized to latent-variable models, as well as to arbitrarily distributed real-valued spin systems with bounded support. In our numerical experiments, we demonstrate the effective deterministic training of our proposed models and are able to show interesting features of unsupervised learning which could not be directly observed with sampling. Additionally, we demonstrate how to utilize our TAP-based framework for leveraging trained RBMs as joint priors in denoising problems.

Submitted 9 October, 2018; v1 submitted 10 February, 2017; originally announced February 2017.

Journal ref: Phys. Rev. X 8, 041006 (2018)

arXiv:1701.06981 [pdf, other]

doi 10.1109/ISIT.2017.8006899

Multi-Layer Generalized Linear Estimation

Authors: Andre Manoel, Florent Krzakala, Marc Mézard, Lenka Zdeborová

Abstract: We consider the problem of reconstructing a signal from multi-layered (possibly) non-linear measurements. Using non-rigorous but standard methods from statistical physics we present the Multi-Layer Approximate Message Passing (ML-AMP) algorithm for computing marginal probabilities of the corresponding estimation problem and derive the associated state evolution equations to analyze its performance. We also give the expression of the asymptotic free energy and the minimal information-theoretically achievable reconstruction error. Finally, we present some applications of this measurement model for compressed sensing and perceptron learning with structured matrices/patterns, and for a simple model of estimation of latent variables in an auto-encoder.

Submitted 24 January, 2017; originally announced January 2017.

Comments: 5 pages, 1 figure

Journal ref: IEEE International Symposium on Information Theory (ISIT), pages 2098-2102 (2017)

arXiv:1609.04167 [pdf, other]

Proceedings of the third "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'16)

Authors: V. Abrol, O. Absil, P. -A. Absil, S. Anthoine, P. Antoine, T. Arildsen, N. Bertin, F. Bleichrodt, J. Bobin, A. Bol, A. Bonnefoy, F. Caltagirone, V. Cambareri, C. Chenot, V. Crnojević, M. Daňková, K. Degraux, J. Eisert, J. M. Fadili, M. Gabrié, N. Gac, D. Giacobello, A. Gonzalez, C. A. Gomez Gonzalez, A. González, et al. (36 additional authors not shown)

Abstract: The third edition of the "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) took place in Aalborg, the 4th largest city in Denmark situated beautifully in the northern part of the country, from the 24th to 26th of August 2016. The workshop venue was at the Aalborg University campus. One implicit objective of this biennial workshop is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For this third edition, iTWIST'16 gathered about 50 international participants and featured 8 invited talks, 12 oral presentations, and 12 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing (e.g., optics, computer vision, genomics, biomedical, digital communication, channel estimation, astronomy); Application of sparse models in non-convex/non-linear inverse problems (e.g., phase retrieval, blind deconvolution, self calibration); Approximate probabilistic inference for sparse problems; Sparse machine learning and inference; "Blind" inverse problems and dictionary learning; Optimization for sparse modelling; Information theory, geometry and randomness; Sparsity? What's next? (Discrete-valued signals; Union of low-dimensional spaces, Cosparsity, mixed/group norm, model-based, low-complexity models, ...); Matrix/manifold sensing/processing (graph, low-rank approximation, ...); Complexity/accuracy tradeoffs in numerical methods/optimization; Electronic/optical compressive sensors (hardware).

Submitted 14 September, 2016; originally announced September 2016.

Comments: 69 pages, 22 extended abstracts, iTWIST'16 website: http://www.itwist16.es.aau.dk

arXiv:1606.03956 [pdf, other]

doi 10.1109/ITW.2016.7606837

Inferring Sparsity: Compressed Sensing using Generalized Restricted Boltzmann Machines

Authors: Eric W. Tramel, Andre Manoel, Francesco Caltagirone, Marylou Gabrié, Florent Krzakala

Abstract: In this work, we consider compressed sensing reconstruction from $M$ measurements of $K$-sparse structured signals which do not possess a writable correlation model. Assuming that a generative statistical model, such as a Boltzmann machine, can be trained in an unsupervised manner on example signals, we demonstrate how this signal model can be used within a Bayesian framework of signal reconstruction. By deriving a message-passing inference for general distribution restricted Boltzmann machines, we are able to integrate these inferred signal models into approximate message passing for compressed sensing reconstruction. Finally, we show for the MNIST dataset that this approach can be very effective, even for $M < K$.

Submitted 13 June, 2016; originally announced June 2016.

Comments: IEEE Information Theory Workshop, 2016

Journal ref: 2016 IEEE Information Theory Workshop (ITW), Pages: 265 - 269

arXiv:1406.4311 [pdf, other]

Sparse Estimation with the Swept Approximated Message-Passing Algorithm

Authors: Andre Manoel, Florent Krzakala, Eric W. Tramel, Lenka Zdeborová

Abstract: Approximate Message Passing (AMP) has been shown to be a superior method for inference problems, such as the recovery of signals from sets of noisy, lower-dimensionality measurements, both in terms of reconstruction accuracy and in computational efficiency. However, AMP suffers from serious convergence issues in contexts that do not exactly match its assumptions. We propose a new approach to stabilizing AMP in these contexts by applying AMP updates to individual coefficients rather than in parallel. Our results show that this change to the AMP iteration can provide theoretically expected, but hitherto unobtainable, performance for problems on which the standard AMP iteration diverges. Additionally, we find that the computational cost of this swept coefficient update scheme is not unduly burdensome, allowing it to be applied efficiently to signals of large dimensionality.

Submitted 17 June, 2014; originally announced June 2014.

Comments: 11 pages, 3 figures, implementation available at https://github.com/eric-tramel/SwAMP-Demo

Journal ref: Proceedings of the 32nd International Conference on Machine Learning (ICML), 2015, 1123-1132

arXiv:1402.1384 [pdf, ps, other]

doi 10.1109/ISIT.2014.6875083

Variational Free Energies for Compressed Sensing

Authors: Florent Krzakala, Andre Manoel, Eric W. Tramel, Lenka Zdeborová

Abstract: We consider the variational free energy approach for compressed sensing. We first show that the naïve mean field approach performs remarkably well when coupled with a noise learning procedure. We also notice that it leads to the same equations as those used for iterative thresholding. We then discuss the Bethe free energy and how it corresponds to the fixed points of the approximate message passing algorithm. In both cases, we test numerically the direct optimization of the free energies as a converging sparse-estimation algorithm.

Submitted 6 February, 2014; originally announced February 2014.

Comments: 5 pages, 3 figures

Journal ref: Information Theory Proceedings (ISIT), 2014 IEEE International Symposium on, page(s) 1499 - 1503

arXiv:1211.6462 [pdf, ps, other]

doi 10.1088/1742-5468/2013/08/P08002

Statistical mechanics of reputation systems in autonomous networks

Authors: Andre Manoel, Renato Vicente

Abstract: Reputation systems seek to infer which members of a community can be trusted based on ratings they issue about each other. We construct a Bayesian inference model and simulate approximate estimates using belief propagation (BP). The model is then mapped onto computing equilibrium properties of a spin glass in a random field and analyzed by employing the replica symmetric cavity approach. Having the fraction of trustful nodes and environment noise level as control parameters, we evaluate the theoretical performance in terms of estimation error and the robustness of the BP approximation in different scenarios. Regions of degraded performance are then explained by the convergence properties of the BP algorithm and by the emergence of a glassy phase.

Submitted 11 April, 2013; v1 submitted 27 November, 2012; originally announced November 2012.

Comments: 20 pages, 14 figures

Journal ref: Journal of Statistical Mechanics (2013) P08002

Showing 1–19 of 19 results for author: Manoel, A