-
Xpikeformer: Hybrid Analog-Digital Hardware Acceleration for Spiking Transformers
Authors:
Zihang Song,
Prabodh Katti,
Osvaldo Simeone,
Bipin Rajendran
Abstract:
This paper introduces Xpikeformer, a hybrid analog-digital hardware architecture designed to accelerate spiking neural network (SNN)-based transformer models. By combining the energy efficiency and temporal dynamics of SNNs with the powerful sequence modeling capabilities of transformers, Xpikeformer leverages mixed analog-digital computing techniques to enhance performance and energy efficiency. The architecture integrates analog in-memory computing (AIMC) for feedforward and fully connected layers, and a stochastic spiking attention (SSA) engine for efficient attention mechanisms. We detail the design, implementation, and evaluation of Xpikeformer, demonstrating significant improvements in energy consumption and computational efficiency. Through an image classification task and a wireless communication symbol detection task, we show that Xpikeformer can achieve software-comparable inference accuracy. Energy evaluations reveal that Xpikeformer achieves up to a $17.8$--$19.2\times$ reduction in energy consumption compared to state-of-the-art digital ANN transformers and up to a $5.9$--$6.8\times$ reduction compared to fully digital SNN transformers. Xpikeformer also achieves a $12.0\times$ speedup compared to the GPU implementation of spiking transformers.
Submitted 16 August, 2024;
originally announced August 2024.
-
Quantile Learn-Then-Test: Quantile-Based Risk Control for Hyperparameter Optimization
Authors:
Amirmohammad Farzaneh,
Sangwoo Park,
Osvaldo Simeone
Abstract:
The increasing adoption of Artificial Intelligence (AI) in engineering problems calls for the development of calibration methods capable of offering robust statistical reliability guarantees. The calibration of black box AI models is carried out via the optimization of hyperparameters dictating architecture, optimization, and/or inference configuration. Prior work has introduced learn-then-test (LTT), a calibration procedure for hyperparameter optimization (HPO) that provides statistical guarantees on average performance measures. Recognizing the importance of controlling risk-aware objectives in engineering contexts, this work introduces a variant of LTT that is designed to provide statistical guarantees on quantiles of a risk measure. We illustrate the practical advantages of this approach by applying the proposed algorithm to a radio access scheduling problem.
Submitted 24 July, 2024;
originally announced July 2024.
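The learn-then-test procedure mentioned in the abstract above can be illustrated with a minimal sketch. It covers only the baseline LTT guarantee on the average risk, not the quantile variant the paper proposes. The Hoeffding p-value, the Bonferroni correction, and the names `hoeffding_p_value` and `learn_then_test` are assumptions chosen for illustration, and losses are assumed to lie in [0, 1].

```python
import math

def hoeffding_p_value(losses, alpha):
    """P-value for the null hypothesis 'average risk > alpha' (losses in [0, 1])."""
    n = len(losses)
    risk_hat = sum(losses) / n
    gap = max(alpha - risk_hat, 0.0)
    return math.exp(-2.0 * n * gap * gap)

def learn_then_test(losses_per_candidate, alpha, delta):
    """Return the candidates certified as reliable.

    With a Bonferroni correction over the m candidates, every returned
    candidate has average risk at most alpha with probability at least
    1 - delta over the calibration data.
    """
    m = len(losses_per_candidate)
    return [
        name
        for name, losses in losses_per_candidate.items()
        if hoeffding_p_value(losses, alpha) <= delta / m
    ]

# Hypothetical calibration losses for two hyperparameter settings.
calibration = {"a": [0.0] * 200, "b": [1.0] * 200}
selected = learn_then_test(calibration, alpha=0.2, delta=0.1)
```

Any selection rule can then be applied to the returned set, for example picking the certified candidate with the best secondary objective.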
-
Automatic AI Model Selection for Wireless Systems: Online Learning via Digital Twinning
Authors:
Qiushuo Hou,
Matteo Zecchin,
Sangwoo Park,
Yunlong Cai,
Guanding Yu,
Kaushik Chowdhury,
Osvaldo Simeone
Abstract:
In modern wireless network architectures, such as O-RAN, artificial intelligence (AI)-based applications are deployed at intelligent controllers to carry out functionalities like scheduling or power control. The AI "apps" are selected on the basis of contextual information such as network conditions, topology, traffic statistics, and design goals. The mapping between context and AI model parameters is ideally done in a zero-shot fashion via an automatic model selection (AMS) mapping that leverages only contextual information without requiring any current data. This paper introduces a general methodology for the online optimization of AMS mappings. Optimizing an AMS mapping is challenging, as it requires exposure to data collected from many different contexts. Therefore, if carried out online, this initial optimization phase would be extremely time consuming. A possible solution is to leverage a digital twin of the physical system to generate synthetic data from multiple simulated contexts. However, given that the simulator at the digital twin is imperfect, a direct use of simulated data for the optimization of the AMS mapping would yield poor performance when tested in the real system. This paper proposes a novel method for the online optimization of AMS mapping that corrects for the bias of the simulator by means of limited real data collected from the physical system. Experimental results for a graph neural network-based power control app demonstrate the significant advantages of the proposed approach.
Submitted 22 June, 2024;
originally announced June 2024.
-
Pre-Training and Personalized Fine-Tuning via Over-the-Air Federated Meta-Learning: Convergence-Generalization Trade-Offs
Authors:
Haifeng Wen,
Hong Xing,
Osvaldo Simeone
Abstract:
For modern artificial intelligence (AI) applications such as large language models (LLMs), the training paradigm has recently shifted to pre-training followed by fine-tuning. Furthermore, owing to dwindling open repositories of data and thanks to efforts to democratize access to AI models, pre-training is expected to increasingly migrate from the current centralized deployments to federated learning (FL) implementations. Meta-learning provides a general framework in which pre-training and fine-tuning can be formalized. Meta-learning-based personalized FL (meta-pFL) moves beyond basic personalization by targeting generalization to new agents and tasks. This paper studies the generalization performance of meta-pFL for a wireless setting in which the agents participating in the pre-training phase, i.e., meta-learning, are connected via a shared wireless channel to the server. Adopting over-the-air computing, we study the trade-off between generalization to new agents and tasks, on the one hand, and convergence, on the other hand. The trade-off arises from the fact that channel impairments may enhance generalization, while degrading convergence. Extensive numerical results validate the theory.
Submitted 17 June, 2024;
originally announced June 2024.
-
Semi-Supervised Learning via Cross-Prediction-Powered Inference for Wireless Systems
Authors:
Houssem Sifaou,
Osvaldo Simeone
Abstract:
In many wireless application scenarios, acquiring labeled data can be prohibitively costly, requiring complex optimization processes or measurement campaigns. Semi-supervised learning leverages unlabeled samples to augment the available dataset by assigning synthetic labels obtained via machine learning (ML)-based predictions. However, treating the synthetic labels as true labels may yield worse-performing models as compared to models trained using only labeled data. Inspired by the recently developed prediction-powered inference (PPI) framework, this work investigates how to leverage the synthetic labels produced by an ML model, while accounting for the inherent bias with respect to true labels. To this end, we first review PPI and its recent extensions, namely tuned PPI and cross-prediction-powered inference (CPPI). Then, we introduce a novel variant of PPI, referred to as tuned CPPI, that provides CPPI with an additional degree of freedom in adapting to the quality of the ML-based labels. Finally, we showcase two applications of PPI-based techniques in wireless systems, namely beam alignment based on channel knowledge maps in millimeter-wave systems and received signal strength information-based indoor localization. Simulation results show the advantages of PPI-based techniques over conventional approaches that rely solely on labeled data or that apply standard pseudo-labeling strategies from semi-supervised learning. Furthermore, the proposed tuned CPPI method is observed to guarantee the best performance among all benchmark schemes, especially in the regime of limited labeled data.
Submitted 24 May, 2024;
originally announced May 2024.
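The basic idea behind prediction-powered inference, as referenced in the abstract above, can be sketched for the simplest case of estimating a mean. This is the standard PPI mean estimator, not the tuned CPPI method proposed in the paper. The function name `ppi_mean` and the toy numbers are hypothetical.

```python
def ppi_mean(preds_unlabeled, preds_labeled, labels):
    """Prediction-powered estimate of the mean of the true labels.

    The average prediction on the unlabeled data is corrected by the
    average prediction error ("rectifier") measured on the labeled data.
    """
    pred_mean = sum(preds_unlabeled) / len(preds_unlabeled)
    rectifier = sum(y - f for y, f in zip(labels, preds_labeled)) / len(labels)
    return pred_mean + rectifier

# Toy example: the model under-predicts by 0.5 on the labeled samples.
estimate = ppi_mean(
    preds_unlabeled=[1.0, 2.0, 3.0],
    preds_labeled=[1.0, 3.0],
    labels=[1.5, 3.5],
)
```

The rectifier term is what distinguishes this estimator from plain pseudo-labeling, which would use the average prediction alone and inherit the model's bias.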
-
Localized Adaptive Risk Control
Authors:
Matteo Zecchin,
Osvaldo Simeone
Abstract:
Adaptive Risk Control (ARC) is an online calibration strategy based on set prediction that offers worst-case deterministic long-term risk control, as well as statistical marginal coverage guarantees. ARC adjusts the size of the prediction set by varying a single scalar threshold based on feedback from past decisions. In this work, we introduce Localized Adaptive Risk Control (L-ARC), an online calibration scheme that targets statistical localized risk guarantees ranging from conditional risk to marginal risk, while preserving the worst-case performance of ARC. L-ARC updates a threshold function within a reproducing kernel Hilbert space (RKHS), with the kernel determining the level of localization of the statistical risk guarantee. The theoretical results highlight a trade-off between localization of the statistical risk and convergence speed to the long-term risk target. Thanks to localization, L-ARC is demonstrated via experiments to produce prediction sets with risk guarantees across different data subpopulations, significantly improving the fairness of the calibrated model for tasks such as image segmentation and beam selection in wireless networks.
Submitted 9 June, 2024; v1 submitted 13 May, 2024;
originally announced May 2024.
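The single-threshold feedback rule that the abstract above attributes to ARC can be sketched as follows. This is a minimal illustration of a scalar online update, not the kernel-based L-ARC scheme. The miscoverage loss, the step size, the synthetic scores, and the names `arc_update` and `run_arc` are assumptions made for the example.

```python
import random

def arc_update(threshold, loss, target_risk, step_size):
    """Raise the threshold after a loss above target, lower it otherwise."""
    return threshold + step_size * (loss - target_risk)

def run_arc(true_label_scores, target_risk, step_size, threshold=0.0):
    """Run the online rule on a stream of nonconformity scores.

    The prediction set at each step contains every label whose score is at
    most the current threshold, so the loss is 1 when the true label's
    score exceeds the threshold (miscoverage) and 0 otherwise.
    """
    losses = []
    for score in true_label_scores:
        loss = 1.0 if score > threshold else 0.0
        losses.append(loss)
        threshold = arc_update(threshold, loss, target_risk, step_size)
    return threshold, losses

rng = random.Random(0)
scores = [rng.random() for _ in range(5000)]  # synthetic scores in [0, 1)
final_threshold, losses = run_arc(scores, target_risk=0.1, step_size=0.05)
average_loss = sum(losses) / len(losses)
```

Because the updates telescope, the gap between the average loss and the target equals the total threshold drift divided by the step size times the number of rounds. The threshold stays bounded when the scores are bounded, so this gap shrinks as the stream grows, regardless of how the scores are generated.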
-
Calibrating Bayesian Learning via Regularization, Confidence Minimization, and Selective Inference
Authors:
Jiayi Huang,
Sangwoo Park,
Osvaldo Simeone
Abstract:
The application of artificial intelligence (AI) models in fields such as engineering is limited by the known difficulty of quantifying the reliability of an AI's decision. A well-calibrated AI model must correctly report its accuracy on in-distribution (ID) inputs, while also enabling the detection of out-of-distribution (OOD) inputs. A conventional approach to improve calibration is the application of Bayesian ensembling. However, owing to computational limitations and model misspecification, practical ensembling strategies do not necessarily enhance calibration. This paper proposes an extension of variational inference (VI)-based Bayesian learning that integrates calibration regularization for improved ID performance, confidence minimization for OOD detection, and selective calibration to ensure a synergistic use of calibration regularization and confidence minimization. The scheme is constructed successively by first introducing calibration-regularized Bayesian learning (CBNN), then incorporating out-of-distribution confidence minimization (OCM) to yield CBNN-OCM, and finally also integrating selective calibration to produce selective CBNN-OCM (SCBNN-OCM). Selective calibration rejects inputs for which the calibration performance is expected to be insufficient. Numerical results illustrate the trade-offs between ID accuracy, ID calibration, and OOD calibration attained by both frequentist and Bayesian learning methods. Among the main conclusions, SCBNN-OCM is seen to achieve the best ID and OOD performance as compared to existing state-of-the-art approaches at the cost of rejecting a sufficiently large number of inputs.
Submitted 17 April, 2024;
originally announced April 2024.
-
Cell-Free Multi-User MIMO Equalization via In-Context Learning
Authors:
Matteo Zecchin,
Kai Yu,
Osvaldo Simeone
Abstract:
Large pre-trained sequence models, such as transformers, excel as few-shot learners capable of in-context learning (ICL). In ICL, a model is trained to adapt its operation to a new task based on limited contextual information, typically in the form of a few training examples for the given task. Previous work has explored the use of ICL for channel equalization in single-user multiple-input multiple-output (MIMO) systems. In this work, we demonstrate that ICL can also be used to tackle the problem of multi-user equalization in cell-free MIMO systems with limited fronthaul capacity. In this scenario, a task is defined by channel statistics, signal-to-noise ratio, and modulation schemes. The context encompasses the users' pilot sequences, the corresponding quantized received signals, and the current received data signal. Different prompt design strategies are proposed and evaluated that also encompass large-scale fading and modulation information. Experiments demonstrate that ICL-based equalization provides estimates with lower mean squared error as compared to the linear minimum mean squared error equalizer, especially in the presence of limited fronthaul capacity and pilot contamination.
Submitted 11 April, 2024; v1 submitted 8 April, 2024;
originally announced April 2024.
-
Neuromorphic Split Computing with Wake-Up Radios: Architecture and Design via Digital Twinning
Authors:
Jiechen Chen,
Sangwoo Park,
Petar Popovski,
H. Vincent Poor,
Osvaldo Simeone
Abstract:
Neuromorphic computing leverages the sparsity of temporal data to reduce processing energy by activating a small subset of neurons and synapses at each time step. When deployed for split computing in edge-based systems, remote neuromorphic processing units (NPUs) can reduce the communication power budget by communicating asynchronously using sparse impulse radio (IR) waveforms. This way, the input signal sparsity translates directly into energy savings both in terms of computation and communication. However, with IR transmission, the main contributor to the overall energy consumption remains the power required to maintain the main radio on. This work proposes a novel architecture that integrates a wake-up radio mechanism within a split computing system consisting of remote, wirelessly connected, NPUs. A key challenge in the design of a wake-up radio-based neuromorphic split computing system is the selection of thresholds for sensing, wake-up signal detection, and decision making. To address this problem, as a second contribution, this work proposes a novel methodology that leverages the use of a digital twin (DT), i.e., a simulator, of the physical system, coupled with a sequential statistical testing approach known as Learn Then Test (LTT) to provide theoretical reliability guarantees. The proposed DT-LTT methodology is broadly applicable to other design problems, and is showcased here for neuromorphic communications. Experimental results validate the design and the analysis, confirming the theoretical reliability guarantees and illustrating trade-offs among reliability, energy consumption, and informativeness of the decisions.
Submitted 3 April, 2024; v1 submitted 2 April, 2024;
originally announced April 2024.
-
Neuromorphic Wireless Device-Edge Co-Inference via the Directed Information Bottleneck
Authors:
Yuzhen Ke,
Zoran Utkovski,
Mehdi Heshmati,
Osvaldo Simeone,
Johannes Dommel,
Slawomir Stanczak
Abstract:
An important use case of next-generation wireless systems is device-edge co-inference, where a semantic task is partitioned between a device and an edge server. The device carries out data collection and partial processing of the data, while the remote server completes the given task based on information received from the device. It is often required that processing and communication be run as efficiently as possible at the device, while more computing resources are available at the edge. To address such scenarios, we introduce a new system solution, termed neuromorphic wireless device-edge co-inference. In this solution, the device runs sensing, processing, and communication units using neuromorphic hardware, while the server employs conventional radio and computing technologies. The proposed system is designed using a transmitter-centric information-theoretic criterion that targets a reduction of the communication overhead, while retaining the most relevant information for the end-to-end semantic task of interest. Numerical results on standard data sets validate the proposed architecture, and a preliminary testbed realization is reported.
Submitted 2 April, 2024;
originally announced April 2024.
-
Cell-Free MIMO Perceptive Mobile Networks: Cloud vs. Edge Processing
Authors:
Seongah Jeong,
Jinkyu Kang,
Osvaldo Simeone,
Shlomo Shamai
Abstract:
Perceptive mobile networks (PMNs) implement sensing and communication by reusing existing cellular infrastructure. Cell-free multiple-input multiple-output, thanks to the cooperation among distributed access points, supports the deployment of multistatic radar sensing, while providing high spectral efficiency for data communication services. To this end, the distributed access points communicate over fronthaul links with a central processing unit acting as a cloud processor. This work explores four different types of PMN uplink solutions based on cell-free multiple-input multiple-output, in which the sensing and decoding functionalities are carried out at either cloud or edge. Accordingly, we investigate and compare joint cloud-based decoding and sensing (CDCS), hybrid cloud-based decoding and edge-based sensing (CDES), hybrid edge-based decoding and cloud-based sensing (EDCS) and edge-based decoding and sensing (EDES). In all cases, we target a unified design problem formulation whereby the fronthaul quantization of signals received in the training and data phases are jointly designed to maximize the achievable rate under sensing requirements and fronthaul capacity constraints. Via numerical results, the four implementation scenarios are compared as a function of the available fronthaul resources by highlighting the relative merits of edge- and cloud-based sensing and communications. This study provides guidelines on the optimal functional allocation in fronthaul-constrained networks implementing integrated sensing and communications.
Submitted 28 March, 2024;
originally announced March 2024.
-
CSI Transfer From Sub-6G to mmWave: Reduced-Overhead Multi-User Hybrid Beamforming
Authors:
Weicao Deng,
Min Li,
Ming-Min Zhao,
Min-Jian Zhao,
Osvaldo Simeone
Abstract:
Hybrid beamforming is vital in modern wireless systems, especially for massive MIMO and millimeter-wave deployments, offering efficient directional transmission with reduced hardware complexity. However, effective beamforming in multi-user scenarios relies heavily on accurate channel state information, the acquisition of which often incurs excessive pilot overhead, degrading system performance. To address this and inspired by the spatial congruence between sub-6GHz (sub-6G) and mmWave channels, we propose a Sub-6G information Aided Multi-User Hybrid Beamforming (SA-MUHBF) framework, avoiding excessive use of pilots. SA-MUHBF employs a convolutional neural network to predict mmWave beamspace from sub-6G channel estimate, followed by a novel multi-layer graph neural network for analog beam selection and a linear minimum mean-square error algorithm for digital beamforming. Numerical results demonstrate that SA-MUHBF efficiently predicts the mmWave beamspace representation and achieves superior spectrum efficiency over state-of-the-art benchmarks. Moreover, SA-MUHBF demonstrates robust performance across varied sub-6G system configurations and exhibits strong generalization to unseen scenarios.
Submitted 16 March, 2024;
originally announced March 2024.
-
Multi-Fidelity Bayesian Optimization With Across-Task Transferable Max-Value Entropy Search
Authors:
Yunchuan Zhang,
Sangwoo Park,
Osvaldo Simeone
Abstract:
In many applications, ranging from logistics to engineering, a designer is faced with a sequence of optimization tasks for which the objectives are in the form of black-box functions that are costly to evaluate. For example, the designer may need to tune the hyperparameters of neural network models for different learning tasks over time. Rather than evaluating the objective function for each candidate solution, the designer may have access to approximations of the objective functions, for which higher-fidelity evaluations entail a larger cost. Existing multi-fidelity black-box optimization strategies select candidate solutions and fidelity levels with the goal of maximizing the information accrued about the optimal value or solution for the current task. Assuming that successive optimization tasks are related, this paper introduces a novel information-theoretic acquisition function that balances the need to acquire information about the current task with the goal of collecting information transferable to future tasks. The proposed method includes shared inter-task latent variables, which are transferred across tasks by implementing particle-based variational Bayesian updates. Experimental results across synthetic and real-world examples reveal that the proposed provident acquisition strategy that caters to future tasks can significantly improve the optimization efficiency as soon as a sufficient number of tasks is processed.
Submitted 24 April, 2024; v1 submitted 14 March, 2024;
originally announced March 2024.
-
On the Impact of Uncertainty and Calibration on Likelihood-Ratio Membership Inference Attacks
Authors:
Meiyi Zhu,
Caili Guo,
Chunyan Feng,
Osvaldo Simeone
Abstract:
In a membership inference attack (MIA), an attacker exploits the overconfidence exhibited by typical machine learning models to determine whether a specific data point was used to train a target model. In this paper, we analyze the performance of the state-of-the-art likelihood ratio attack (LiRA) within an information-theoretical framework that allows the investigation of the impact of the aleatoric uncertainty in the true data generation process, of the epistemic uncertainty caused by a limited training data set, and of the calibration level of the target model. We compare three different settings, in which the attacker receives decreasingly informative feedback from the target model: confidence vector (CV) disclosure, in which the output probability vector is released; true label confidence (TLC) disclosure, in which only the probability assigned to the true label is made available by the model; and decision set (DS) disclosure, in which an adaptive prediction set is produced as in conformal prediction. We derive bounds on the advantage of an MIA adversary with the aim of offering insights into the impact of uncertainty and calibration on the effectiveness of MIAs. Simulation results demonstrate that the derived analytical bounds predict well the effectiveness of MIAs.
Submitted 15 August, 2024; v1 submitted 16 February, 2024;
originally announced February 2024.
-
Stochastic Spiking Attention: Accelerating Attention with Stochastic Computing in Spiking Networks
Authors:
Zihang Song,
Prabodh Katti,
Osvaldo Simeone,
Bipin Rajendran
Abstract:
Spiking Neural Networks (SNNs) have been recently integrated into Transformer architectures due to their potential to reduce computational demands and to improve power efficiency. Yet, the implementation of the attention mechanism using spiking signals on general-purpose computing platforms remains inefficient. In this paper, we propose a novel framework leveraging stochastic computing (SC) to effectively execute the dot-product attention for SNN-based Transformers. We demonstrate that our approach can achieve high classification accuracy ($83.53\%$) on CIFAR-10 within 10 time steps, which is comparable to the performance of a baseline artificial neural network implementation ($83.66\%$). We estimate that the proposed SC approach can lead to over $6.3\times$ reduction in computing energy and $1.7\times$ reduction in memory access costs for a digital CMOS-based ASIC design. We experimentally validate our stochastic attention block design through an FPGA implementation, which is shown to achieve $48\times$ lower latency as compared to a GPU implementation, while consuming $15\times$ less power.
Submitted 14 February, 2024;
originally announced February 2024.
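The stochastic computing principle that the abstract above builds on can be illustrated with a short sketch. It shows the generic idea of multiplying two probabilities with a bitwise AND of independent Bernoulli bitstreams, and it is not a description of the paper's attention circuit. The function names, stream length, and input values are assumptions for illustration.

```python
import random

def bernoulli_stream(p, length, rng):
    """Encode a value p in [0, 1] as a random bitstream with P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(p, q, length, rng):
    """Estimate p * q as the fraction of ones in the AND of two streams.

    For independent streams, P(a AND b = 1) = p * q, so a single AND gate
    per bit replaces a multiplier, at the cost of sampling noise.
    """
    a = bernoulli_stream(p, length, rng)
    b = bernoulli_stream(q, length, rng)
    return sum(x & y for x, y in zip(a, b)) / length

rng = random.Random(0)
estimate = sc_multiply(0.6, 0.5, 20000, rng)  # noisy estimate of 0.6 * 0.5
```

The estimate's standard deviation shrinks with the square root of the stream length, which is the accuracy-versus-latency trade-off inherent in this kind of encoding.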
-
Conservative and Risk-Aware Offline Multi-Agent Reinforcement Learning for Digital Twins
Authors:
Eslam Eldeeb,
Houssem Sifaou,
Osvaldo Simeone,
Mohammad Shehab,
Hirley Alves
Abstract:
Digital twin (DT) platforms are increasingly regarded as a promising technology for controlling, optimizing, and monitoring complex engineering systems such as next-generation wireless networks. An important challenge in adopting DT solutions is their reliance on data collected offline, lacking direct access to the physical environment. This limitation is particularly severe in multi-agent systems, for which conventional multi-agent reinforcement learning (MARL) requires online interactions with the environment. A direct application of online MARL schemes to an offline setting would generally fail due to the epistemic uncertainty entailed by the limited availability of data. In this work, we propose an offline MARL scheme for DT-based wireless networks that integrates distributional RL and conservative Q-learning to address the environment's inherent aleatoric uncertainty and the epistemic uncertainty arising from limited data. To further exploit the offline data, we adapt the proposed scheme to the centralized training decentralized execution framework, allowing joint training of the agents' policies. The proposed MARL scheme, referred to as multi-agent conservative quantile regression (MA-CQR), addresses general risk-sensitive design criteria and is applied to the trajectory planning problem in drone networks, showcasing its advantages.
Submitted 13 February, 2024;
originally announced February 2024.
-
Adversarial Quantum Machine Learning: An Information-Theoretic Generalization Analysis
Authors:
Petros Georgiou,
Sharu Theresa Jose,
Osvaldo Simeone
Abstract:
In a manner analogous to their classical counterparts, quantum classifiers are vulnerable to adversarial attacks that perturb their inputs. A promising countermeasure is to train the quantum classifier by adopting an attack-aware, or adversarial, loss function. This paper studies the generalization properties of quantum classifiers that are adversarially trained against bounded-norm white-box attacks. Specifically, a quantum adversary maximizes the classifier's loss by transforming an input state $ρ(x)$ into a state $λ$ that is $ε$-close to the original state $ρ(x)$ in $p$-Schatten distance. Under suitable assumptions on the quantum embedding $ρ(x)$, we derive novel information-theoretic upper bounds on the generalization error of adversarially trained quantum classifiers for $p = 1$ and $p = \infty$. The derived upper bounds consist of two terms: the first is an exponential function of the 2-Rényi mutual information between classical data and quantum embedding, while the second term scales linearly with the adversarial perturbation size $ε$. Both terms are shown to decrease as $1/\sqrt{T}$ over the training set size $T$. An extension is also considered in which the adversary assumed during training has different parameters $p$ and $ε$ as compared to the adversary affecting the test inputs. Finally, we validate our theoretical findings with numerical experiments for a synthetic setting.
Submitted 15 February, 2024; v1 submitted 31 January, 2024;
originally announced February 2024.
-
Cross-Validation Conformal Risk Control
Authors:
Kfir M. Cohen,
Sangwoo Park,
Osvaldo Simeone,
Shlomo Shamai
Abstract:
Conformal risk control (CRC) is a recently proposed technique that applies post-hoc to a conventional point predictor to provide calibration guarantees. Generalizing conformal prediction (CP), with CRC, calibration is ensured for a set predictor that is extracted from the point predictor to control a risk function such as the probability of miscoverage or the false negative rate. The original CRC requires the available data set to be split between training and validation data sets. This can be problematic when data availability is limited, resulting in inefficient set predictors. In this paper, a novel CRC method is introduced that is based on cross-validation, rather than on validation as in the original CRC. The proposed cross-validation CRC (CV-CRC) extends a version of the jackknife-minmax from CP to CRC, allowing for the control of a broader range of risk functions. CV-CRC is proved to offer theoretical guarantees on the average risk of the set predictor. Furthermore, numerical experiments show that CV-CRC can reduce the average set size with respect to CRC when the available data are limited.
Submitted 1 May, 2024; v1 submitted 22 January, 2024;
originally announced January 2024.
-
Generalization and Informativeness of Conformal Prediction
Authors:
Matteo Zecchin,
Sangwoo Park,
Osvaldo Simeone,
Fredrik Hellström
Abstract:
The safe integration of machine learning modules in decision-making processes hinges on their ability to quantify uncertainty. A popular technique to achieve this goal is conformal prediction (CP), which transforms an arbitrary base predictor into a set predictor with coverage guarantees. While CP certifies the predicted set to contain the target quantity with a user-defined tolerance, it does not provide control over the average size of the predicted sets, i.e., over the informativeness of the prediction. In this work, a theoretical connection is established between the generalization properties of the base predictor and the informativeness of the resulting CP prediction sets. To this end, an upper bound is derived on the expected size of the CP set predictor that builds on generalization error bounds for the base predictor. The derived upper bound provides insights into the dependence of the average size of the CP set predictor on the amount of calibration data, the target reliability, and the generalization performance of the base predictor. The theoretical insights are validated using simple numerical regression and classification tasks.
Submitted 22 January, 2024;
originally announced January 2024.
-
Low-Rank Gradient Compression with Error Feedback for MIMO Wireless Federated Learning
Authors:
Mingzhao Guo,
Dongzhu Liu,
Osvaldo Simeone,
Dingzhu Wen
Abstract:
This paper presents a novel approach to enhance the communication efficiency of federated learning (FL) in multiple-input multiple-output (MIMO) wireless systems. The proposed method centers on a low-rank matrix factorization strategy for local gradient compression based on alternating least squares, along with over-the-air computation and error feedback. The proposed protocol, termed over-the-air low-rank compression (Ota-LC), is demonstrated to have lower computation cost and lower communication overhead as compared to existing benchmarks while guaranteeing the same inference performance. As an example, when targeting a test accuracy of 80% on the CIFAR-10 dataset, Ota-LC achieves a reduction in total communication costs of at least 30% when contrasted with benchmark schemes, while also reducing the computational complexity order by a factor equal to the sum of the dimension of the gradients.
Submitted 15 January, 2024;
originally announced January 2024.
-
Calibrating Wireless Ray Tracing for Digital Twinning using Local Phase Error Estimates
Authors:
Clement Ruah,
Osvaldo Simeone,
Jakob Hoydis,
Bashir Al-Hashimi
Abstract:
Embodying the principle of simulation intelligence, digital twin (DT) systems construct and maintain a high-fidelity virtual model of a physical system. This paper focuses on ray tracing (RT), which is widely seen as an enabling technology for DTs of the radio access network (RAN) segment of next-generation disaggregated wireless systems. RT makes it possible to simulate channel conditions, enabling data augmentation and prediction-based transmission. However, the effectiveness of RT hinges on the adaptation of the electromagnetic properties assumed by the RT to actual channel conditions, a process known as calibration. The main challenge of RT calibration is the fact that small discrepancies in the geometric model fed to the RT software hinder the accuracy of the predicted phases of the simulated propagation paths. Existing solutions to this problem either rely on the channel power profile, hence disregarding phase information, or they operate on the channel responses by assuming the simulated phases to be sufficiently accurate for calibration. This paper proposes a novel channel response-based scheme that, unlike the state of the art, estimates and compensates for the phase errors in the RT-generated channel responses. The proposed approach builds on the variational expectation maximization algorithm with a flexible choice of the prior phase-error distribution that bridges between a deterministic model with no phase errors and a stochastic model with uniform phase errors. The algorithm is computationally efficient, and is demonstrated, by leveraging the open-source differentiable RT software available within the Sionna library, to outperform existing methods in terms of the accuracy of RT predictions.
Submitted 14 May, 2024; v1 submitted 19 December, 2023;
originally announced December 2023.
-
Optimizing Information Freshness over a Channel that Wears Out
Authors:
George J. Stamatakis,
Osvaldo Simeone,
Nikolaos Pappas
Abstract:
A sensor samples and transmits status updates to a destination through a wireless channel that wears out over time and with every use. At each time slot, the sensor can decide to sample and transmit a fresh status update, restore the initial quality of the channel, or remain silent. The actions impose different costs on the operation of the system, and we study the problem of optimally selecting the actions at the transmitter so as to maximize the freshness of the information at the receiver, while minimizing the communication cost. Freshness is measured by the age of information (AoI). The problem is addressed using dynamic programming, and numerical results are presented to provide insights into the optimal transmission policy.
Submitted 1 December, 2023;
originally announced December 2023.
-
Batch Selection and Communication for Active Learning with Edge Labeling
Authors:
Victor Croisfelt,
Shashi Raj Pandey,
Osvaldo Simeone,
Petar Popovski
Abstract:
Conventional retransmission (ARQ) protocols are designed with the goal of ensuring the correct reception of all the individual transmitter's packets at the receiver. When the transmitter is a learner communicating with a teacher, this goal is at odds with the actual aim of the learner, which is that of eliciting the most relevant label information from the teacher. Taking an active learning perspective, this paper addresses the following key protocol design questions: (i) Active batch selection: Which batch of inputs should be sent to the teacher to acquire the most useful information and thus reduce the number of required communication rounds? (ii) Batch encoding: Can batches of data points be combined to reduce the communication resources required at each communication round? Specifically, this work introduces Communication-Constrained Bayesian Active Knowledge Distillation (CC-BAKD), a novel protocol that integrates Bayesian active learning with compression via a linear mix-up mechanism. Comparisons with existing active learning protocols demonstrate the advantages of the proposed approach.
Submitted 22 May, 2024; v1 submitted 14 November, 2023;
originally announced November 2023.
-
In-Context Learning for MIMO Equalization Using Transformer-Based Sequence Models
Authors:
Matteo Zecchin,
Kai Yu,
Osvaldo Simeone
Abstract:
Large pre-trained sequence models, such as transformer-based architectures, have been recently shown to have the capacity to carry out in-context learning (ICL). In ICL, a decision on a new input is made via a direct mapping of the input and of a few examples from the given task, serving as the task's context, to the output variable. No explicit updates of the model parameters are needed to tailor the decision to a new task. Pre-training, which amounts to a form of meta-learning, is based on the observation of examples from several related tasks. Prior work has shown ICL capabilities for linear regression. In this study, we leverage ICL to address the inverse problem of multiple-input and multiple-output (MIMO) equalization based on a context given by pilot symbols. A task is defined by the unknown fading channel and by the signal-to-noise ratio (SNR) level, which may be known. To highlight the practical potential of the approach, we allow the presence of quantization of the received signals. We demonstrate via numerical results that transformer-based ICL has a threshold behavior, whereby, as the number of pre-training tasks grows, the performance switches from that of a minimum mean squared error (MMSE) equalizer with a prior determined by the pre-trained tasks to that of an MMSE equalizer with the true data-generating prior.
Submitted 22 January, 2024; v1 submitted 10 November, 2023;
originally announced November 2023.
-
Agreeing to Stop: Reliable Latency-Adaptive Decision Making via Ensembles of Spiking Neural Networks
Authors:
Jiechen Chen,
Sangwoo Park,
Osvaldo Simeone
Abstract:
Spiking neural networks (SNNs) are recurrent models that can leverage sparsity in input time series to efficiently carry out tasks such as classification. Additional efficiency gains can be obtained if decisions are taken as early as possible as a function of the complexity of the input time series. The decision on when to stop inference and produce a decision must rely on an estimate of the current accuracy of the decision. Prior work demonstrated the use of conformal prediction (CP) as a principled way to quantify uncertainty and support adaptive-latency decisions in SNNs. In this paper, we propose to enhance the uncertainty quantification capabilities of SNNs by implementing ensemble models for the purpose of improving the reliability of stopping decisions. Intuitively, an ensemble of multiple models can decide when to stop more reliably by selecting times at which most models agree that the current accuracy level is sufficient. The proposed method relies on different forms of information pooling from ensemble models, and offers theoretical reliability guarantees. We specifically show that variational inference-based ensembles with p-variable pooling significantly reduce the average latency of state-of-the-art methods, while maintaining reliability guarantees.
Submitted 16 December, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
AirFL-Mem: Improving Communication-Learning Trade-Off by Long-Term Memory
Authors:
Haifeng Wen,
Hong Xing,
Osvaldo Simeone
Abstract:
Addressing the communication bottleneck inherent in federated learning (FL), over-the-air FL (AirFL) has emerged as a promising solution, which is, however, hampered by deep fading conditions. In this paper, we propose AirFL-Mem, a novel scheme designed to mitigate the impact of deep fading by implementing a \emph{long-term} memory mechanism. Convergence bounds are provided that account for long-term memory, as well as for existing AirFL variants with short-term memory, for general non-convex objectives. The theory demonstrates that AirFL-Mem exhibits the same convergence rate as federated averaging (FedAvg) with ideal communication, while the performance of existing schemes is generally limited by error floors. The theoretical results are also leveraged to propose a novel convex optimization strategy for the truncation threshold used for power control in the presence of Rayleigh fading channels. Experimental results validate the analysis, confirming the advantages of a long-term memory mechanism for the mitigation of deep fading.
Submitted 27 October, 2023; v1 submitted 25 October, 2023;
originally announced October 2023.
-
Forking Uncertainties: Reliable Prediction and Model Predictive Control with Sequence Models via Conformal Risk Control
Authors:
Matteo Zecchin,
Sangwoo Park,
Osvaldo Simeone
Abstract:
In many real-world problems, predictions are leveraged to monitor and control cyber-physical systems, demanding guarantees on the satisfaction of reliability and safety requirements. However, predictions are inherently uncertain, and managing prediction uncertainty presents significant challenges in environments characterized by complex dynamics and forking trajectories. In this work, we assume access to a pre-designed probabilistic implicit or explicit sequence model, which may have been obtained using model-based or model-free methods. We introduce probabilistic time series-conformal risk prediction (PTS-CRC), a novel post-hoc calibration procedure that operates on the predictions produced by any pre-designed probabilistic forecaster to yield reliable error bars. In contrast to existing art, PTS-CRC produces predictive sets based on an ensemble of multiple prototype trajectories sampled from the sequence model, supporting the efficient representation of forking uncertainties. Furthermore, unlike the state of the art, PTS-CRC can satisfy reliability definitions beyond coverage. This property is leveraged to devise a novel model predictive control (MPC) framework that addresses open-loop and closed-loop control problems under general average constraints on the quality or safety of the control policy. We experimentally validate the performance of PTS-CRC prediction and control by studying a number of use cases in the context of wireless networking. Across all the considered tasks, PTS-CRC predictors are shown to provide more informative predictive sets, as well as safe control policies with larger returns.
Submitted 16 October, 2023;
originally announced October 2023.
-
Towards Efficient and Trustworthy AI Through Hardware-Algorithm-Communication Co-Design
Authors:
Bipin Rajendran,
Osvaldo Simeone,
Bashir M. Al-Hashimi
Abstract:
Artificial intelligence (AI) algorithms based on neural networks have been designed for decades with the goal of maximising some measure of accuracy. This has led to two undesired effects. First, model complexity has risen exponentially when measured in terms of computation and memory requirements. Second, state-of-the-art AI models are largely incapable of providing trustworthy measures of their uncertainty, possibly `hallucinating' their answers and discouraging their adoption for decision-making in sensitive applications.
With the goal of realising efficient and trustworthy AI, in this paper we highlight research directions at the intersection of hardware and software design that integrate physical insights into computational substrates, neuroscientific principles concerning efficient information processing, information-theoretic results on optimal uncertainty quantification, and communication-theoretic guidelines for distributed processing. Overall, the paper advocates for novel design methodologies that target not only accuracy but also uncertainty quantification, while leveraging emerging computing hardware architectures that move beyond the traditional von Neumann digital computing paradigm to embrace in-memory, neuromorphic, and quantum computing technologies. An important overarching principle of the proposed approach is to view the stochasticity inherent in the computational substrate and in the communication channels between processors as a resource to be leveraged for the purpose of representing and processing classical and quantum uncertainty.
Submitted 27 September, 2023;
originally announced September 2023.
-
Statistical Complexity of Quantum Learning
Authors:
Leonardo Banchi,
Jason Luke Pereira,
Sharu Theresa Jose,
Osvaldo Simeone
Abstract:
Recent years have seen significant activity on the problem of using data for the purpose of learning properties of quantum systems or of processing classical or quantum data via quantum computing. As in classical learning, quantum learning problems involve settings in which the mechanism generating the data is unknown, and the main goal of a learning algorithm is to ensure satisfactory accuracy levels when only given access to data and, possibly, side information such as expert knowledge. This article reviews the complexity of quantum learning using information-theoretic techniques by focusing on data complexity, copy complexity, and model complexity. Copy complexity arises from the destructive nature of quantum measurements, which irreversibly alter the state to be processed, limiting the information that can be extracted about quantum data. For example, in a quantum system, unlike in classical machine learning, it is generally not possible to evaluate the training loss simultaneously on multiple hypotheses using the same quantum data. To make the paper self-contained and approachable by different research communities, we provide extensive background material on classical results from statistical learning theory, as well as on the distinguishability of quantum states. Throughout, we highlight the differences between quantum and classical learning by addressing both supervised and unsupervised learning, and we provide extensive pointers to the literature.
Submitted 16 April, 2024; v1 submitted 20 September, 2023;
originally announced September 2023.
-
Energy-Efficient On-Board Radio Resource Management for Satellite Communications via Neuromorphic Computing
Authors:
Flor Ortiz,
Nicolas Skatchkovsky,
Eva Lagunas,
Wallace A. Martins,
Geoffrey Eappen,
Saed Daoud,
Osvaldo Simeone,
Bipin Rajendran,
Symeon Chatzinotas
Abstract:
The latest satellite communication (SatCom) missions are characterized by a fully reconfigurable on-board software-defined payload, capable of adapting radio resources to the temporal and spatial variations of the system traffic. As pure optimization-based solutions have been shown to be computationally tedious and to lack flexibility, machine learning (ML)-based methods have emerged as promising alternatives. We investigate the application of energy-efficient brain-inspired ML models for on-board radio resource management. Apart from software simulation, we report extensive experimental results leveraging the recently released Intel Loihi 2 chip. To benchmark the performance of the proposed model, we implement conventional convolutional neural networks (CNNs) on a Xilinx Versal VCK5000, and provide a detailed comparison of accuracy, precision, recall, and energy efficiency for different traffic demands. Most notably, for relevant workloads, spiking neural networks (SNNs) implemented on Loihi 2 yield higher accuracy, while reducing power consumption by more than 100$\times$ as compared to the CNN-based reference platform. Our findings point to the significant potential of neuromorphic computing and SNNs in supporting on-board SatCom operations, paving the way for enhanced efficiency and sustainability in future SatCom systems.
Submitted 21 August, 2023;
originally announced August 2023.
-
Federated Inference with Reliable Uncertainty Quantification over Wireless Channels via Conformal Prediction
Authors:
Meiyi Zhu,
Matteo Zecchin,
Sangwoo Park,
Caili Guo,
Chunyan Feng,
Osvaldo Simeone
Abstract:
In this paper, we consider a wireless federated inference scenario in which devices and a server share a pre-trained machine learning model. The devices communicate statistical information about their local data to the server over a common wireless channel, aiming to enhance the quality of the inference decision at the server. Recent work has introduced federated conformal prediction (CP), which leverages devices-to-server communication to improve the reliability of the server's decision. With federated CP, devices communicate to the server information about the loss accrued by the shared pre-trained model on the local data, and the server leverages this information to calibrate a decision interval, or set, so that it is guaranteed to contain the correct answer with a pre-defined target reliability level. Previous work assumed noise-free communication, whereby devices can communicate a single real number to the server. In this paper, we study for the first time federated CP in a wireless setting. We introduce a novel protocol, termed wireless federated conformal prediction (WFCP), which builds on type-based multiple access (TBMA) and on a novel quantile correction strategy. WFCP is proved to provide formal reliability guarantees in terms of coverage of the predicted set produced by the server. Using numerical results, we demonstrate the significant advantages of WFCP against digital implementations of existing federated CP schemes, especially in regimes with limited communication resources and/or a large number of devices.
Submitted 15 December, 2023; v1 submitted 8 August, 2023;
originally announced August 2023.
-
Bayesian Optimization with Formal Safety Guarantees via Online Conformal Prediction
Authors:
Yunchuan Zhang,
Sangwoo Park,
Osvaldo Simeone
Abstract:
Black-box zeroth-order optimization is a central primitive for applications in fields as diverse as finance, physics, and engineering. In a common formulation of this problem, a designer sequentially attempts candidate solutions, receiving noisy feedback on the value of each attempt from the system. In this paper, we study scenarios in which feedback is also provided on the safety of the attempted solution, and the optimizer is constrained to limit the number of unsafe solutions that are tried throughout the optimization process. Focusing on methods based on Bayesian optimization (BO), prior art has introduced an optimization scheme -- referred to as SAFEOPT -- that is guaranteed not to select any unsafe solution with a controllable probability over feedback noise as long as strict assumptions on the safety constraint function are met. In this paper, a novel BO-based approach is introduced that satisfies safety requirements irrespective of properties of the constraint function. This strong theoretical guarantee is obtained at the cost of allowing for an arbitrary, controllable but non-zero, rate of violation of the safety constraint. The proposed method, referred to as SAFE-BOCP, builds on online conformal prediction (CP) and is specialized to the cases in which feedback on the safety constraint is either noiseless or noisy. Experimental results on synthetic and real-world data validate the advantages and flexibility of the proposed SAFE-BOCP.
Submitted 4 July, 2024; v1 submitted 30 June, 2023;
originally announced June 2023.
-
Knowing When to Stop: Delay-Adaptive Spiking Neural Network Classifiers with Reliability Guarantees
Authors:
Jiechen Chen,
Sangwoo Park,
Osvaldo Simeone
Abstract:
Spiking neural networks (SNNs) process time-series data via internal event-driven neural dynamics. The energy consumption of an SNN depends on the number of spikes exchanged between neurons over the course of the input presentation. Typically, decisions are produced after the entire input sequence has been processed. This results in latency and energy consumption levels that are fairly uniform across inputs. However, as explored in recent work, SNNs can produce an early decision when the SNN model is sufficiently ``confident'', adapting delay and energy consumption to the difficulty of each example. Existing techniques are based on heuristic measures of confidence that do not provide reliability guarantees, potentially exiting too early. In this paper, we introduce a novel delay-adaptive SNN-based inference methodology that, wrapping around any pre-trained SNN classifier, provides guaranteed reliability for the decisions produced at input-dependent stopping times. The approach, dubbed SpikeCP, leverages tools from conformal prediction (CP). It entails minimal complexity increase as compared to the underlying SNN, requiring only additional thresholding and counting operations at run time. SpikeCP is also extended to integrate a CP-aware training phase that targets delay performance. Variants of CP based on alternative confidence correction schemes, from Bonferroni to Simes, are explored, and extensive experiments are described using the MNIST-DVS data set, DVS128 Gesture dataset, and CIFAR-10 dataset.
Submitted 29 June, 2024; v1 submitted 18 May, 2023;
originally announced May 2023.
-
Convergence Analysis of Over-the-Air FL with Compression and Power Control via Clipping
Authors:
Haifeng Wen,
Hong Xing,
Osvaldo Simeone
Abstract:
One of the key challenges towards the deployment of over-the-air federated learning (AirFL) is the design of mechanisms that can comply with the power and bandwidth constraints of the shared channel, while causing minimum deterioration to the learning performance as compared to baseline noiseless implementations. For additive white Gaussian noise (AWGN) channels with instantaneous per-device power constraints, prior work has demonstrated the optimality of a power control mechanism based on norm clipping. This was done through the minimization of an upper bound on the optimality gap for smooth learning objectives satisfying the Polyak-Łojasiewicz (PL) condition. In this paper, we make two contributions to the development of AirFL based on norm clipping, which we refer to as AirFL-Clip. First, we provide a convergence bound for AirFL-Clip that applies to general smooth and non-convex learning objectives. Unlike existing results, the derived bound is free from run-specific parameters, thus supporting an offline evaluation. Second, we extend AirFL-Clip to include Top-k sparsification and linear compression. For this generalized protocol, referred to as AirFL-Clip-Comp, we derive a convergence bound for general smooth and non-convex learning objectives. We argue, and demonstrate via experiments, that the only time-varying quantities present in the bound can be efficiently estimated offline by leveraging the well-studied properties of sparse recovery algorithms.
Submitted 18 May, 2023;
originally announced May 2023.
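As a rough illustration of two of the operations named in the AirFL-Clip-Comp abstract above, the sketch below applies Top-k sparsification followed by norm clipping to a single model update. The function names, the order of the two steps, and the numerical values are assumptions made for illustration; the linear compression stage and the wireless channel are omitted, and this is not the authors' implementation.

```python
import math

def top_k_sparsify(update, k):
    """Keep the k largest-magnitude entries of the update and zero out the rest."""
    ranked = sorted(range(len(update)), key=lambda i: abs(update[i]), reverse=True)
    keep = set(ranked[:k])
    return [x if i in keep else 0.0 for i, x in enumerate(update)]

def clip_norm(update, max_norm):
    """Scale the update down so its Euclidean norm does not exceed max_norm."""
    norm = math.sqrt(sum(x * x for x in update))
    if norm <= max_norm:
        return list(update)
    return [x * (max_norm / norm) for x in update]

# Hypothetical local update and parameters (illustrative values only).
local_update = [3.0, -4.0, 0.5, 0.1]
sparse_update = top_k_sparsify(local_update, k=2)
clipped_update = clip_norm(sparse_update, max_norm=1.0)
print(sparse_update)
print(clipped_update)
```

Clipping bounds the norm of what each device would transmit, which is how a per-device power constraint can be respected in this simplified picture.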
-
Calibration-Aware Bayesian Learning
Authors:
Jiayi Huang,
Sangwoo Park,
Osvaldo Simeone
Abstract:
Deep learning models, including modern systems like large language models, are well known to offer unreliable estimates of the uncertainty of their decisions. In order to improve the quality of the confidence levels, also known as calibration, of a model, common approaches entail the addition of either data-dependent or data-independent regularization terms to the training loss. Data-dependent regularizers have been recently introduced in the context of conventional frequentist learning to penalize deviations between confidence and accuracy. In contrast, data-independent regularizers are at the core of Bayesian learning, enforcing adherence of the variational distribution in the model parameter space to a prior density. The former approach is unable to quantify epistemic uncertainty, while the latter is severely affected by model misspecification. In light of the limitations of both methods, this paper proposes an integrated framework, referred to as calibration-aware Bayesian neural networks (CA-BNNs), that applies both regularizers while optimizing over a variational distribution as in Bayesian learning. Numerical results validate the advantages of the proposed approach in terms of expected calibration error (ECE) and reliability diagrams.
Submitted 12 April, 2024; v1 submitted 12 May, 2023;
originally announced May 2023.
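The expected calibration error (ECE) mentioned in the abstract above is commonly estimated by grouping predictions into confidence bins and averaging the gap between accuracy and mean confidence in each bin, weighted by bin size. A minimal sketch of that standard binned estimator follows; the function name, the equal-width binning, and the toy inputs are assumptions for illustration and are not taken from the paper.

```python
def expected_calibration_error(confidences, correct, num_bins=10):
    """Binned ECE: size-weighted average of |accuracy - mean confidence| over bins."""
    total = len(confidences)
    bins = [[] for _ in range(num_bins)]
    for conf, hit in zip(confidences, correct):
        # Confidence 1.0 is placed in the last bin rather than past the end.
        index = min(int(conf * num_bins), num_bins - 1)
        bins[index].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_conf = sum(conf for conf, _ in members) / len(members)
        accuracy = sum(hit for _, hit in members) / len(members)
        ece += (len(members) / total) * abs(accuracy - mean_conf)
    return ece

# Toy example: an overconfident model that is right only half the time.
toy_confidences = [0.95, 0.95, 0.95, 0.95]
toy_correct = [1, 1, 0, 0]
toy_ece = expected_calibration_error(toy_confidences, toy_correct)
print(toy_ece)
```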
-
Adaptive and Flexible Model-Based AI for Deep Receivers in Dynamic Channels
Authors:
Tomer Raviv,
Sangwoo Park,
Osvaldo Simeone,
Yonina C. Eldar,
Nir Shlezinger
Abstract:
Artificial intelligence (AI) is envisioned to play a key role in future wireless technologies, with deep neural networks (DNNs) enabling digital receivers to learn to operate in challenging communication scenarios. However, wireless receiver design poses unique challenges that fundamentally differ from those encountered in traditional deep learning domains. The main challenges arise from the limited power and computational resources of wireless devices, as well as from the dynamic nature of wireless communications, which causes continual changes to the data distribution. These challenges impair conventional AI based on highly-parameterized DNNs, motivating the development of adaptive, flexible, and light-weight AI for wireless communications, which is the focus of this article. Here, we propose that AI-based design of wireless receivers requires rethinking of the three main pillars of AI: architecture, data, and training algorithms. In terms of architecture, we review how to design compact DNNs via model-based deep learning. Then, we discuss how to acquire training data for deep receivers without compromising spectral efficiency. Finally, we review efficient, reliable, and robust training algorithms via meta-learning and generalized Bayesian learning. Numerical results are presented to demonstrate the complementary effectiveness of each of the surveyed methods. We conclude by presenting opportunities for future research on the development of practical deep receivers.
Submitted 12 May, 2023;
originally announced May 2023.
-
Bayesian Over-the-Air FedAvg via Channel Driven Stochastic Gradient Langevin Dynamics
Authors:
Boning Zhang,
Dongzhu Liu,
Osvaldo Simeone,
Guangxu Zhu
Abstract:
The recent development of scalable Bayesian inference methods has renewed interest in the adoption of Bayesian learning as an alternative to conventional frequentist learning that offers improved model calibration via uncertainty quantification. Recently, federated averaging Langevin dynamics (FALD) was introduced as a variant of federated averaging that can efficiently implement distributed Bayesian learning in the presence of noiseless communications. In this paper, we propose wireless FALD (WFALD), a novel protocol that realizes FALD in wireless systems by integrating over-the-air computation and channel-driven sampling for Monte Carlo updates. Unlike prior work on wireless Bayesian learning, WFALD enables (i) multiple local updates between communication rounds; and (ii) stochastic gradients computed by mini-batch. A convergence analysis is presented in terms of the 2-Wasserstein distance between the samples produced by WFALD and the targeted global posterior distribution. Analysis and experiments show that, when the signal-to-noise ratio is sufficiently large, channel noise can be fully repurposed for Monte Carlo sampling, thus entailing no loss in performance.
Submitted 9 May, 2023; v1 submitted 6 May, 2023;
originally announced May 2023.
-
Quantum Conformal Prediction for Reliable Uncertainty Quantification in Quantum Machine Learning
Authors:
Sangwoo Park,
Osvaldo Simeone
Abstract:
In this work, we aim at augmenting the decisions output by quantum models with "error bars" that provide finite-sample coverage guarantees. Quantum models implement implicit probabilistic predictors that produce multiple random decisions for each input through measurement shots. Randomness arises not only from the inherent stochasticity of quantum measurements, but also from quantum gate noise and quantum measurement noise caused by noisy hardware. Furthermore, quantum noise may be correlated across shots and it may present drifts in time. This paper proposes to leverage such randomness to define prediction sets for both classification and regression that provably capture the uncertainty of the model. The approach builds on probabilistic conformal prediction (PCP), while accounting for the unique features of quantum models. Among the key technical innovations, we introduce a new general class of non-conformity scores that address the presence of quantum noise, including possible drifts. Experimental results, using both simulators and current quantum computers, confirm the theoretical calibration guarantees of the proposed framework.
Submitted 22 October, 2023; v1 submitted 6 April, 2023;
originally announced April 2023.
-
Guaranteed Dynamic Scheduling of Ultra-Reliable Low-Latency Traffic via Conformal Prediction
Authors:
Kfir M. Cohen,
Sangwoo Park,
Osvaldo Simeone,
Petar Popovski,
Shlomo Shamai
Abstract:
The dynamic scheduling of ultra-reliable and low-latency traffic (URLLC) in the uplink can significantly enhance the efficiency of coexisting services, such as enhanced mobile broadband (eMBB) devices, by only allocating resources when necessary. The main challenge is posed by the uncertainty in the process of URLLC packet generation, which mandates the use of predictors for URLLC traffic in the coming frames. In practice, such prediction may overestimate or underestimate the amount of URLLC data to be generated, yielding either an excessive or an insufficient amount of resources to be pre-emptively allocated for URLLC packets. In this paper, we introduce a novel scheduler for URLLC packets that provides formal guarantees on reliability and latency irrespective of the quality of the URLLC traffic predictor. The proposed method leverages recent advances in online conformal prediction (CP), and follows the principle of dynamically adjusting the amount of allocated resources so as to meet reliability and latency requirements set by the designer.
Submitted 3 April, 2023; v1 submitted 15 February, 2023;
originally announced February 2023.
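The principle stated in the abstract above, dynamically adjusting the amount of allocated resources to meet a reliability target regardless of predictor quality, can be illustrated with a generic online update of the kind used in adaptive conformal methods: a safety margin on top of the predicted demand grows after a failure and shrinks slightly after a success. The sketch below is a simplified assumption, not the scheduler proposed in the paper; all names, the single scalar resource, and the numbers are hypothetical.

```python
def update_margin(margin, failed, target_failure_rate, step_size):
    """Grow the margin after a failure, shrink it slightly after a success."""
    error = 1.0 if failed else 0.0
    return margin + step_size * (error - target_failure_rate)

def run_scheduler(predicted, actual, target_failure_rate=0.1, step_size=1.0):
    """Allocate predicted demand plus an adaptive margin, frame by frame."""
    margin = 0.0
    allocations = []
    failures = 0
    for pred, act in zip(predicted, actual):
        allocated = max(0.0, pred + margin)
        failed = act > allocated  # demand exceeded what was reserved
        allocations.append(allocated)
        failures += int(failed)
        margin = update_margin(margin, failed, target_failure_rate, step_size)
    return allocations, failures

# Hypothetical predictor that systematically underestimates the traffic.
predicted_demand = [2.0, 2.0, 2.0, 2.0]
actual_demand = [3.0, 3.0, 3.0, 3.0]
allocations, failures = run_scheduler(predicted_demand, actual_demand)
print(allocations, failures)
```

In this toy run the margin compensates for the biased predictor after two frames, after which the allocation covers the demand.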
-
Uncertainty-Aware and Reliable Neural MIMO Receivers via Modular Bayesian Deep Learning
Authors:
Tomer Raviv,
Sangwoo Park,
Osvaldo Simeone,
Nir Shlezinger
Abstract:
Deep learning is envisioned to play a key role in the design of future wireless receivers. A popular approach to design learning-aided receivers combines deep neural networks (DNNs) with traditional model-based receiver algorithms, realizing hybrid model-based data-driven architectures. Such architectures typically include multiple modules, each carrying out a different functionality dictated by the model-based receiver workflow. Conventionally trained DNN-based modules are known to produce poorly calibrated, typically overconfident, decisions. Consequently, incorrect decisions may propagate through the architecture without any indication of their insufficient accuracy. To address this problem, we present a novel combination of Bayesian deep learning with hybrid model-based data-driven architectures for wireless receiver design. The proposed methodology, referred to as modular Bayesian deep learning, is designed to yield calibrated modules, which in turn improves both accuracy and calibration of the overall receiver. We specialize this approach for two fundamental tasks in multiple-input multiple-output (MIMO) receivers - equalization and decoding. In the presence of scarce data, the ability of modular Bayesian deep learning to produce reliable uncertainty measures is consistently shown to directly translate into improved performance of the overall MIMO receiver chain.
Submitted 14 March, 2024; v1 submitted 5 February, 2023;
originally announced February 2023.
-
Bayesian Inference on Binary Spiking Networks Leveraging Nanoscale Device Stochasticity
Authors:
Prabodh Katti,
Nicolas Skatchkovsky,
Osvaldo Simeone,
Bipin Rajendran,
Bashir M. Al-Hashimi
Abstract:
Bayesian Neural Networks (BNNs) can overcome the problem of overconfidence that plagues traditional frequentist deep neural networks, and are hence considered to be a key enabler for reliable AI systems. However, conventional hardware realizations of BNNs are resource intensive, requiring the implementation of random number generators for synaptic sampling. Owing to their inherent stochasticity during programming and read operations, nanoscale memristive devices can be directly leveraged for sampling, without the need for additional hardware resources. In this paper, we introduce a novel Phase Change Memory (PCM)-based hardware implementation for BNNs with binary synapses. The proposed architecture consists of separate weight and noise planes, in which PCM cells are configured and operated to represent the nominal values of weights and to generate the required noise for sampling, respectively. Using experimentally observed PCM noise characteristics, for the exemplary Breast Cancer Dataset classification problem, we obtain hardware accuracy and expected calibration error matching that of an 8-bit fixed-point (FxP8) implementation, with projected savings of over 9$\times$ in terms of core area transistor count.
Submitted 2 February, 2023;
originally announced February 2023.
-
Time-Warping Invariant Quantum Recurrent Neural Networks via Quantum-Classical Adaptive Gating
Authors:
Ivana Nikoloska,
Osvaldo Simeone,
Leonardo Banchi,
Petar Veličković
Abstract:
Adaptive gating plays a key role in temporal data processing via classical recurrent neural networks (RNN), as it facilitates retention of past information necessary to predict the future, providing a mechanism that preserves invariance to time warping transformations. This paper builds on quantum recurrent neural networks (QRNNs), a dynamic model with quantum memory, to introduce a novel class of temporal data processing quantum models that preserve invariance to time-warping transformations of the (classical) input-output sequences. The model, referred to as time warping-invariant QRNN (TWI-QRNN), augments a QRNN with a quantum-classical adaptive gating mechanism that chooses whether to apply a parameterized unitary transformation at each time step as a function of the past samples of the input sequence via a classical recurrent model. The TWI-QRNN model class is derived from first principles, and its capacity to successfully implement time-warping transformations is experimentally demonstrated on examples with classical or quantum dynamics.
Submitted 9 June, 2023; v1 submitted 19 January, 2023;
originally announced January 2023.
-
Bayesian and Multi-Armed Contextual Meta-Optimization for Efficient Wireless Radio Resource Management
Authors:
Yunchuan Zhang,
Osvaldo Simeone,
Sharu Theresa Jose,
Lorenzo Maggi,
Alvaro Valcarce
Abstract:
Optimal resource allocation in modern communication networks calls for the optimization of objective functions that are only accessible via costly separate evaluations for each candidate solution. The conventional approach carries out the optimization of resource-allocation parameters for each system configuration, characterized, e.g., by topology and traffic statistics, using global search methods such as Bayesian optimization (BO). These methods tend to require a large number of iterations, and hence a large number of key performance indicator (KPI) evaluations. In this paper, we propose the use of meta-learning to transfer knowledge from data collected from related, but distinct, configurations in order to speed up optimization on new network configurations. Specifically, we combine meta-learning with BO, as well as with multi-armed bandit (MAB) optimization, with the latter having the potential advantage of operating directly on a discrete search space. Furthermore, we introduce novel contextual meta-BO and meta-MAB algorithms, in which transfer of knowledge across configurations occurs at the level of a mapping from graph-based contextual information to resource-allocation parameters. Experiments for the problem of open loop power control (OLPC) parameter optimization for the uplink of multi-cell multi-antenna systems provide insights into the potential benefits of meta-learning and contextual optimization.
Submitted 19 May, 2023; v1 submitted 16 January, 2023;
originally announced January 2023.
-
Information Bottleneck-Inspired Type Based Multiple Access for Remote Estimation in IoT Systems
Authors:
Meiyi Zhu,
Chunyan Feng,
Caili Guo,
Nan Jiang,
Osvaldo Simeone
Abstract:
Type-based multiple access (TBMA) is a semantics-aware multiple access protocol for remote inference. In TBMA, codewords are reused across transmitting sensors, with each codeword being assigned to a different observation value. Existing TBMA protocols are based on fixed shared codebooks and on conventional maximum-likelihood or Bayesian decoders, which require knowledge of the distributions of observations and channels. In this letter, we propose a novel design principle for TBMA based on the information bottleneck (IB). In the proposed IB-TBMA protocol, the shared codebook is jointly optimized with a decoder based on artificial neural networks (ANNs), so as to adapt to source, observations, and channel statistics based on data only. We also introduce the Compressed IB-TBMA (CIB-TBMA) protocol, which improves IB-TBMA by enabling a reduction in the number of codewords via an IB-inspired clustering phase. Numerical results demonstrate the importance of a joint design of codebook and neural decoder, and validate the benefits of codebook compression.
Submitted 5 April, 2023; v1 submitted 19 December, 2022;
originally announced December 2022.
-
Calibrating AI Models for Wireless Communications via Conformal Prediction
Authors:
Kfir M. Cohen,
Sangwoo Park,
Osvaldo Simeone,
Shlomo Shamai
Abstract:
When used in complex engineered systems, such as communication networks, artificial intelligence (AI) models should be not only as accurate as possible, but also well calibrated. A well-calibrated AI model is one that can reliably quantify the uncertainty of its decisions, assigning high confidence levels to decisions that are likely to be correct and low confidence levels to decisions that are likely to be erroneous. This paper investigates the application of conformal prediction as a general framework to obtain AI models that produce decisions with formal calibration guarantees. Conformal prediction transforms probabilistic predictors into set predictors that are guaranteed to contain the correct answer with a probability chosen by the designer. Such formal calibration guarantees hold irrespective of the true, unknown, distribution underlying the generation of the variables of interest, and can be defined in terms of ensemble or time-averaged probabilities. In this paper, conformal prediction is applied for the first time to the design of AI for communication systems in conjunction with both frequentist and Bayesian learning, focusing on demodulation, modulation classification, and channel prediction.
Submitted 15 December, 2022;
originally announced December 2022.
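The transformation described in the abstract above, from a probabilistic predictor to a set predictor with a designer-chosen coverage level, can be sketched with the standard split (validation-based) form of conformal prediction. The nonconformity score (one minus the probability assigned to a label), the function names, and the calibration values below are assumptions chosen for illustration; the paper may use different scores and variants.

```python
import math

def conformal_threshold(cal_scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(cal_scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points: every label is kept
    return sorted(cal_scores)[rank - 1]

def prediction_set(probs, threshold):
    """Keep every label whose score 1 - p(label) does not exceed the threshold."""
    return [label for label, p in enumerate(probs) if 1.0 - p <= threshold]

# Hypothetical calibration data: probability the model gave to the true label.
cal_true_label_probs = [0.9, 0.3, 0.95, 0.6, 0.7, 0.2, 0.5]
cal_scores = [1.0 - p for p in cal_true_label_probs]
threshold = conformal_threshold(cal_scores, alpha=0.25)

confident_set = prediction_set([0.9, 0.05, 0.05], threshold)
ambiguous_set = prediction_set([0.5, 0.4, 0.1], threshold)
print(confident_set, ambiguous_set)
```

A confident prediction yields a single-label set, while an ambiguous one yields a larger set, which is how the set size signals uncertainty.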
-
Online Convex Optimization of Programmable Quantum Computers to Simulate Time-Varying Quantum Channels
Authors:
Hari Hara Suthan Chittoor,
Osvaldo Simeone,
Leonardo Banchi,
Stefano Pirandola
Abstract:
Simulating quantum channels is a fundamental primitive in quantum computing, since quantum channels define general (trace-preserving) quantum operations. An arbitrary quantum channel cannot be exactly simulated using a finite-dimensional programmable quantum processor, making it important to develop optimal approximate simulation techniques. In this paper, we study the challenging setting in which the channel to be simulated varies adversarially with time. We propose the use of matrix exponentiated gradient descent (MEGD), an online convex optimization method, and analytically show that it achieves a sublinear regret in time. Through experiments, we validate the main results for time-varying dephasing channels using a programmable generalized teleportation processor.
Submitted 9 December, 2022;
originally announced December 2022.
-
A Bayesian Framework for Digital Twin-Based Control, Monitoring, and Data Collection in Wireless Systems
Authors:
Clement Ruah,
Osvaldo Simeone,
Bashir Al-Hashimi
Abstract:
Commonly adopted in the manufacturing and aerospace sectors, digital twin (DT) platforms are increasingly seen as a promising paradigm to control, monitor, and analyze software-based, "open", communication systems. Notably, DT platforms provide a sandbox in which to test artificial intelligence (AI) solutions for communication systems, potentially reducing the need to collect data and test algorithms in the field, i.e., on the physical twin (PT). A key challenge in the deployment of DT systems is to ensure that virtual control optimization, monitoring, and analysis at the DT are safe and reliable, avoiding incorrect decisions caused by "model exploitation". To address this challenge, this paper presents a general Bayesian framework with the aim of quantifying and accounting for model uncertainty at the DT that is caused by limitations in the amount and quality of data available at the DT from the PT. In the proposed framework, the DT builds a Bayesian model of the communication system, which is leveraged to enable core DT functionalities such as control via multi-agent reinforcement learning (MARL), monitoring of the PT for anomaly detection, prediction, data-collection optimization, and counterfactual analysis. To exemplify the application of the proposed framework, we specifically investigate a case-study system encompassing multiple sensing devices that report to a common receiver. Experimental results validate the effectiveness of the proposed Bayesian framework as compared to standard frequentist model-based solutions.
Submitted 29 August, 2023; v1 submitted 2 December, 2022;
originally announced December 2022.
-
Digital Twin-Based Multiple Access Optimization and Monitoring via Model-Driven Bayesian Learning
Authors:
Clement Ruah,
Osvaldo Simeone,
Bashir Al-Hashimi
Abstract:
Commonly adopted in the manufacturing and aerospace sectors, digital twin (DT) platforms are increasingly seen as a promising paradigm to control and monitor software-based, "open", communication systems, which play the role of the physical twin (PT). In the general framework presented in this work, the DT builds a Bayesian model of the communication system, which is leveraged to enable core DT functionalities such as control via multi-agent reinforcement learning (MARL) and monitoring of the PT for anomaly detection. We specifically investigate the application of the proposed framework to a simple case-study system encompassing multiple sensing devices that report to a common receiver. The Bayesian model trained at the DT has the key advantage of capturing epistemic uncertainty regarding the communication system, e.g., regarding current traffic conditions, which arise from limited PT-to-DT data transfer. Experimental results validate the effectiveness of the proposed Bayesian framework as compared to standard frequentist model-based solutions.
Submitted 27 January, 2023; v1 submitted 11 October, 2022;
originally announced October 2022.
-
Network Topology Inference based on Timing Meta-Data
Authors:
Wenbo Du,
Tao Tan,
Haijun Zhang,
Xianbin Cao,
Gang Yan,
Osvaldo Simeone
Abstract:
Consider a processor having access only to meta-data consisting of the timings of data packets and acknowledgment (ACK) packets from all nodes in a network. The meta-data report the source node of each packet, but not the destination nodes or the contents of the packets. The goal of the processor is to infer the network topology based solely on such information. Prior work leveraged causality metrics to identify which links are active. If the data timings and ACK timings of two nodes -- say node 1 and node 2, respectively -- are causally related, this may be taken as evidence that node 1 is communicating to node 2 (which sends back ACK packets to node 1). This paper starts with the observation that packet losses can weaken the causality relationship between data and ACK timing streams. To obviate this problem, a new Expectation Maximization (EM)-based algorithm is introduced -- EM-causality discovery algorithm (EM-CDA) -- which treats packet losses as latent variables. EM-CDA iterates between the estimation of packet losses and the evaluation of causality metrics. The method is validated through extensive experiments in wireless sensor networks on the NS-3 simulation platform.
Submitted 11 October, 2022;
originally announced October 2022.
-
Calibrating AI Models for Few-Shot Demodulation via Conformal Prediction
Authors:
Kfir M. Cohen,
Sangwoo Park,
Osvaldo Simeone,
Shlomo Shamai
Abstract:
AI tools can be useful to address model deficits in the design of communication systems. However, conventional learning-based AI algorithms yield poorly calibrated decisions and are unable to quantify the uncertainty of their outputs. While Bayesian learning can enhance calibration by capturing epistemic uncertainty caused by limited data availability, formal calibration guarantees only hold under strong assumptions about the ground-truth, unknown, data generation mechanism. We propose to leverage the conformal prediction framework to obtain data-driven set predictions whose calibration properties hold irrespective of the data distribution. Specifically, we investigate the design of baseband demodulators in the presence of hard-to-model nonlinearities such as hardware imperfections, and propose set-based demodulators based on conformal prediction. Numerical results confirm the theoretical validity of the proposed demodulators, and bring insights into their average prediction set size efficiency.
Submitted 10 October, 2022;
originally announced October 2022.