
Seq-to-Final: A Benchmark for Tuning from Sequential Distributions to a Final Time Point

Authors: Christina X Ji, Ahmed M Alaa, David Sontag

Abstract: Distribution shift over time occurs in many settings. Leveraging historical data is necessary to learn a model for the last time point when limited data is available in the final period, yet few methods have been developed specifically for this purpose. In this work, we construct a benchmark with different sequences of synthetic shifts to evaluate the effectiveness of 3 classes of methods that 1) learn from all data without adapting to the final period, 2) learn from historical data with no regard to the sequential nature and then adapt to the final period, and 3) leverage the sequential nature of historical data when tailoring a model to the final period. We call this benchmark Seq-to-Final to highlight the focus on using a sequence of time periods to learn a model for the final time point. Our synthetic benchmark allows users to construct sequences with different types of shift and compare different methods. We focus on image classification tasks using CIFAR-10 and CIFAR-100 as the base images for the synthetic sequences. We also evaluate the same methods on the Portraits dataset to explore the relevance to real-world shifts over time. Finally, we create a visualization to contrast the initializations and updates from different methods at the final time step. Our results suggest that, for the sequences in our benchmark, methods that disregard the sequential structure and adapt to the final time point tend to perform well. The approaches we evaluate that leverage the sequential nature do not offer any improvement. We hope that this benchmark will inspire the development of new algorithms that are better at leveraging sequential historical data or a deeper understanding of why methods that disregard the sequential nature are able to perform well.

Submitted 12 July, 2024; originally announced July 2024.

arXiv:2403.00177 [pdf, other]

Med-Real2Sim: Non-Invasive Medical Digital Twins using Physics-Informed Self-Supervised Learning

Authors: Keying Kuang, Frances Dean, Jack B. Jedlicki, David Ouyang, Anthony Philippakis, David Sontag, Ahmed M. Alaa

Abstract: A digital twin is a virtual replica of a real-world physical phenomenon that uses mathematical modeling to characterize and simulate its defining features. By constructing digital twins for disease processes, we can perform in-silico simulations that mimic patients' health conditions and counterfactual outcomes under hypothetical interventions in a virtual setting. This eliminates the need for invasive procedures or uncertain treatment decisions. In this paper, we propose a method to identify digital twin model parameters using only noninvasive patient health data. We approach the digital twin modeling as a composite inverse problem, and observe that its structure resembles pretraining and finetuning in self-supervised learning (SSL). Leveraging this, we introduce a physics-informed SSL algorithm that initially pretrains a neural network on the pretext task of learning a differentiable simulator of a physiological process. Subsequently, the model is trained to reconstruct physiological measurements from noninvasive modalities while being constrained by the physical equations learned in pretraining. We apply our method to identify digital twins of cardiac hemodynamics using noninvasive echocardiogram videos, and demonstrate its utility in unsupervised disease detection and in-silico clinical trials.

Submitted 28 May, 2024; v1 submitted 29 February, 2024; originally announced March 2024.

arXiv:2402.07307 [pdf, other]

Self-Consistent Conformal Prediction

Authors: Lars van der Laan, Ahmed M. Alaa

Abstract: In decision-making guided by machine learning, decision-makers may take identical actions in contexts with identical predicted outcomes. Conformal prediction helps decision-makers quantify uncertainty in point predictions of outcomes, allowing for better risk management for actions. Motivated by this perspective, we introduce Self-Consistent Conformal Prediction for regression, which combines two post-hoc approaches -- Venn-Abers calibration and conformal prediction -- to provide calibrated point predictions and compatible prediction intervals that are valid conditional on model predictions. Our procedure can be applied post-hoc to any black-box model to provide predictions and inferences with finite-sample prediction-conditional guarantees. Numerical experiments show our approach strikes a balance between interval efficiency and conditional validity.

Submitted 22 April, 2024; v1 submitted 11 February, 2024; originally announced February 2024.
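
For readers unfamiliar with the conformal prediction component named in the abstract above, the following is a minimal sketch of standard split conformal prediction for regression. It is not the Self-Consistent Conformal Prediction procedure itself: there is no Venn-Abers calibration step, and the coverage is marginal rather than prediction-conditional. The function name, the absolute-residual conformity score, and the held-out calibration set are all assumptions made for illustration.

```python
import math

def split_conformal_interval(cal_preds, cal_targets, test_pred, alpha=0.1):
    """Return (lower, upper) around test_pred from held-out calibration residuals.

    Hypothetical helper: uses the absolute residual |y - prediction| as the
    conformity score, which is one common choice among many.
    """
    scores = sorted(abs(y - p) for p, y in zip(cal_preds, cal_targets))
    n = len(scores)
    # Finite-sample corrected rank of the score used as the interval half-width.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for the requested coverage level.
        return (-math.inf, math.inf)
    q = scores[k - 1]
    return (test_pred - q, test_pred + q)

if __name__ == "__main__":
    # Toy calibration set: a black-box model that always predicts 0.
    preds = [0.0] * 9
    targets = [1, 2, 3, 4, 5, 6, 7, 8, 9]
    print(split_conformal_interval(preds, targets, test_pred=5.0, alpha=0.2))
```

The interval width here is the same for every test point; making it depend on the model's prediction is the kind of refinement the abstract describes.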

arXiv:2310.09926 [pdf, other]

Estimating Uncertainty in Multimodal Foundation Models using Public Internet Data

Authors: Shiladitya Dutta, Hongbo Wei, Lars van der Laan, Ahmed M. Alaa

Abstract: Foundation models are trained on vast amounts of data at scale using self-supervised learning, enabling adaptation to a wide range of downstream tasks. At test time, these models exhibit zero-shot capabilities through which they can classify previously unseen (user-specified) categories. In this paper, we address the problem of quantifying uncertainty in these zero-shot predictions. We propose a heuristic approach for uncertainty estimation in zero-shot settings using conformal prediction with web data. Given a set of classes at test time, we conduct zero-shot classification with CLIP-style models using a prompt template, e.g., "an image of a <category>", and use the same template as a search query to source calibration data from the open web. Given a web-based calibration set, we apply conformal prediction with a novel conformity score that accounts for potential errors in retrieved web data. We evaluate the utility of our proposed method in biomedical foundation models; our preliminary results show that web-based conformal prediction sets achieve the target coverage with satisfactory efficiency on a variety of biomedical datasets.

Submitted 26 November, 2023; v1 submitted 15 October, 2023; originally announced October 2023.

arXiv:2310.00390 [pdf, other]

InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists

Authors: Yulu Gan, Sungwoo Park, Alexander Schubert, Anthony Philippakis, Ahmed M. Alaa

Abstract: Recent advances in generative diffusion models have enabled text-controlled synthesis of realistic and diverse images with impressive quality. Despite these remarkable advances, the application of text-to-image generative models in computer vision for standard visual recognition tasks remains limited. The current de facto approach for these tasks is to design model architectures and loss functions that are tailored to the task at hand. In this paper, we develop a unified language interface for computer vision tasks that abstracts away task-specific design choices and enables task execution by following natural language instructions. Our approach involves casting multiple computer vision tasks as text-to-image generation problems. Here, the text represents an instruction describing the task, and the resulting image is a visually-encoded task output. To train our model, we pool commonly-used computer vision datasets covering a range of tasks, including segmentation, object detection, depth estimation, and classification. We then use a large language model to paraphrase prompt templates that convey the specific tasks to be conducted on each image, and through this process, we create a multi-modal and multi-task training dataset comprising input and output images along with annotated instructions. Following the InstructPix2Pix architecture, we apply instruction-tuning to a text-to-image diffusion model using our constructed dataset, steering its functionality from a generative model to an instruction-guided multi-task vision learner. Experiments demonstrate that our model, dubbed InstructCV, performs competitively compared to other generalist and task-specific vision models. Moreover, it exhibits compelling generalization capabilities to unseen data, categories, and user instructions.

Submitted 16 March, 2024; v1 submitted 30 September, 2023; originally announced October 2023.

Comments: ICLR 2024; Code is available at https://github.com/AlaaLab/InstructCV

arXiv:2306.12438 [pdf, other]

Aligning Synthetic Medical Images with Clinical Knowledge using Human Feedback

Authors: Shenghuan Sun, Gregory M. Goldgof, Atul Butte, Ahmed M. Alaa

Abstract: Generative models capable of capturing nuanced clinical features in medical images hold great promise for facilitating clinical data sharing, enhancing rare disease datasets, and efficiently synthesizing annotated medical images at scale. Despite their potential, assessing the quality of synthetic medical images remains a challenge. While modern generative models can synthesize visually-realistic medical images, the clinical validity of these images may be called into question. Domain-agnostic scores, such as FID score, precision, and recall, cannot incorporate clinical knowledge and are, therefore, not suitable for assessing clinical sensibility. Additionally, there are numerous unpredictable ways in which generative models may fail to synthesize clinically plausible images, making it challenging to anticipate potential failures and manually design scores for their detection. To address these challenges, this paper introduces a pathologist-in-the-loop framework for generating clinically-plausible synthetic medical images. Starting with a diffusion model pretrained using real images, our framework comprises three steps: (1) evaluating the generated images by expert pathologists to assess whether they satisfy clinical desiderata, (2) training a reward model that predicts the pathologist feedback on new samples, and (3) incorporating expert knowledge into the diffusion model by using the reward model to inform a finetuning objective. We show that human feedback significantly improves the quality of synthetic images in terms of fidelity, diversity, utility in downstream applications, and plausibility as evaluated by experts.

Submitted 16 June, 2023; originally announced June 2023.

arXiv:2305.05087 [pdf, other]

Large-Scale Study of Temporal Shift in Health Insurance Claims

Authors: Christina X Ji, Ahmed M Alaa, David Sontag

Abstract: Most machine learning models for predicting clinical outcomes are developed using historical data. Yet, even if these models are deployed in the near future, dataset shift over time may result in less than ideal performance. To capture this phenomenon, we consider a task--that is, an outcome to be predicted at a particular time point--to be non-stationary if a historical model is no longer optimal for predicting that outcome. We build an algorithm to test for temporal shift either at the population level or within a discovered sub-population. Then, we construct a meta-algorithm to perform a retrospective scan for temporal shift on a large collection of tasks. Our algorithms enable us to perform the first comprehensive evaluation of temporal shift in healthcare to our knowledge. We create 1,010 tasks by evaluating 242 healthcare outcomes for temporal shift from 2015 to 2020 on a health insurance claims dataset. 9.7% of the tasks show temporal shifts at the population level, and 93.0% have some sub-population affected by shifts. We dive into case studies to understand the clinical implications. Our analysis highlights the widespread prevalence of temporal shifts in healthcare.

Submitted 18 June, 2023; v1 submitted 8 May, 2023; originally announced May 2023.

Comments: To appear as an oral spotlight and poster at Conference on Health, Inference, and Learning (CHIL) 2023

arXiv:2304.01426 [pdf, other]

Conformalized Unconditional Quantile Regression

Authors: Ahmed M. Alaa, Zeshan Hussain, David Sontag

Abstract: We develop a predictive inference procedure that combines conformal prediction (CP) with unconditional quantile regression (QR) -- a commonly used tool in econometrics that involves regressing the recentered influence function (RIF) of the quantile functional over input covariates. Unlike the more widely-known conditional QR, unconditional QR explicitly captures the impact of changes in covariate distribution on the quantiles of the marginal distribution of outcomes. Leveraging this property, our procedure issues adaptive predictive intervals with localized frequentist coverage guarantees. It operates by fitting a machine learning model for the RIFs using training data, and then applying the CP procedure for any test covariate with respect to a "hypothetical" covariate distribution localized around the new instance. Experiments show that our procedure is adaptive to heteroscedasticity, provides transparent coverage guarantees that are relevant to the test instance at hand, and performs competitively with existing methods in terms of efficiency.

Submitted 3 April, 2023; originally announced April 2023.

arXiv:2102.08921 [pdf, other]

How Faithful is your Synthetic Data? Sample-level Metrics for Evaluating and Auditing Generative Models

Authors: Ahmed M. Alaa, Boris van Breugel, Evgeny Saveliev, Mihaela van der Schaar

Abstract: Devising domain- and model-agnostic evaluation metrics for generative models is an important and as yet unresolved problem. Most existing metrics, which were tailored solely to the image synthesis setup, exhibit a limited capacity for diagnosing the different modes of failure of generative models across broader application domains. In this paper, we introduce a 3-dimensional evaluation metric, (α-Precision, β-Recall, Authenticity), that characterizes the fidelity, diversity and generalization performance of any generative model in a domain-agnostic fashion. Our metric unifies statistical divergence measures with precision-recall analysis, enabling sample- and distribution-level diagnoses of model fidelity and diversity. We introduce generalization as an additional, independent dimension (to the fidelity-diversity trade-off) that quantifies the extent to which a model copies training data -- a crucial performance indicator when modeling sensitive data with requirements on privacy. The three metric components correspond to (interpretable) probabilistic quantities, and are estimated via sample-level binary classification. The sample-level nature of our metric inspires a novel use case which we call model auditing, wherein we judge the quality of individual samples generated by a (black-box) model, discarding low-quality samples and hence improving the overall model performance in a post-hoc manner.

Submitted 13 July, 2022; v1 submitted 17 February, 2021; originally announced February 2021.

arXiv:2101.11769 [pdf, other]

Learning Matching Representations for Individualized Organ Transplantation Allocation

Authors: Can Xu, Ahmed M. Alaa, Ioana Bica, Brent D. Ershoff, Maxime Cannesson, Mihaela van der Schaar

Abstract: Organ transplantation is often the last resort for treating end-stage illness, but the probability of a successful transplantation depends greatly on compatibility between donors and recipients. Current medical practice relies on coarse rules for donor-recipient matching, but is short of domain knowledge regarding the complex factors underlying organ compatibility. In this paper, we formulate the problem of learning data-driven rules for organ matching using observational data for organ allocations and transplant outcomes. This problem departs from the standard supervised learning setup in that it involves matching the two feature spaces (i.e., donors and recipients), and requires estimating transplant outcomes under counterfactual matches not observed in the data. To address these problems, we propose a model based on representation learning to predict donor-recipient compatibility; our model learns representations that cluster donor features, and applies donor-invariant transformations to recipient features to predict outcomes for a given donor-recipient feature instance. Experiments on semi-synthetic and real-world datasets show that our model outperforms state-of-the-art allocation methods and policies executed by human experts.

Submitted 1 February, 2021; v1 submitted 27 January, 2021; originally announced January 2021.

Comments: Accepted to AISTATS 2021

arXiv:2007.13825 [pdf, other]

CPAS: the UK's National Machine Learning-based Hospital Capacity Planning System for COVID-19

Authors: Zhaozhi Qian, Ahmed M. Alaa, Mihaela van der Schaar

Abstract: The coronavirus disease 2019 (COVID-19) global pandemic poses the threat of overwhelming healthcare systems with unprecedented demands for intensive care resources. Managing these demands cannot be effectively conducted without a nationwide collective effort that relies on data to forecast hospital demands on the national, regional, hospital and individual levels. To this end, we developed the COVID-19 Capacity Planning and Analysis System (CPAS) - a machine learning-based system for hospital resource planning that we have successfully deployed at individual hospitals and across regions in the UK in coordination with NHS Digital. In this paper, we discuss the main challenges of deploying a machine learning-based decision support system at national scale, and explain how CPAS addresses these challenges by (1) defining the appropriate learning problem, (2) combining bottom-up and top-down analytical approaches, (3) using state-of-the-art machine learning algorithms, (4) integrating heterogeneous data sources, and (5) presenting the result with an interactive and transparent interface. CPAS is one of the first machine learning-based systems to be deployed in hospitals on a national scale to address the COVID-19 pandemic - we conclude the paper with a summary of the lessons learned from this experience.

Submitted 27 July, 2020; originally announced July 2020.

arXiv:2007.13481 [pdf, other]

Discriminative Jackknife: Quantifying Uncertainty in Deep Learning via Higher-Order Influence Functions

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Deep learning models achieve high predictive accuracy across a broad spectrum of tasks, but rigorously quantifying their predictive uncertainty remains challenging. Usable estimates of predictive uncertainty should (1) cover the true prediction targets with high probability, and (2) discriminate between high- and low-confidence prediction instances. Existing methods for uncertainty quantification are based predominantly on Bayesian neural networks; these may fall short of (1) and (2) -- i.e., Bayesian credible intervals do not guarantee frequentist coverage, and approximate posterior inference undermines discriminative accuracy. In this paper, we develop the discriminative jackknife (DJ), a frequentist procedure that utilizes influence functions of a model's loss functional to construct a jackknife (or leave-one-out) estimator of predictive confidence intervals. The DJ satisfies (1) and (2), is applicable to a wide range of deep learning models, is easy to implement, and can be applied in a post-hoc fashion without interfering with model training or compromising its accuracy. Experiments demonstrate that DJ performs competitively compared to existing Bayesian and non-Bayesian regression baselines.

Submitted 29 June, 2020; originally announced July 2020.
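
The jackknife (leave-one-out) idea referenced in the abstract above can be illustrated with a small sketch. This is a plain jackknife with exact retraining on a toy one-dimensional least-squares model; it is not the discriminative jackknife, which, per the abstract, uses influence functions rather than retraining. The function names and the choice of model are hypothetical and chosen only to keep the example self-contained.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept (toy stand-in model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x

def jackknife_interval(xs, ys, x_new, alpha=0.1):
    """Interval at x_new whose half-width is a quantile of leave-one-out residuals."""
    loo_residuals = []
    for i in range(len(xs)):
        # Refit with point i held out, then score the fit on that point.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        loo_residuals.append(abs(ys[i] - (slope * xs[i] + intercept)))
    loo_residuals.sort()
    n = len(loo_residuals)
    k = min(n, math.ceil((n + 1) * (1 - alpha)))
    half_width = loo_residuals[k - 1]
    # Center the interval on the prediction of the model fit to all data.
    slope, intercept = fit_line(xs, ys)
    center = slope * x_new + intercept
    return center - half_width, center + half_width

if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [1.1, 2.9, 5.2, 6.8, 9.1, 11.0]
    print(jackknife_interval(xs, ys, x_new=6.0, alpha=0.2))
```

Retraining once per held-out point is affordable for a two-parameter line but not for a deep network, which is the cost the abstract says influence functions are used to avoid.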

arXiv:2006.14988 [pdf, other]

Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift

Authors: Alex J. Chan, Ahmed M. Alaa, Zhaozhi Qian, Mihaela van der Schaar

Abstract: Modern neural networks have proven to be powerful function approximators, providing state-of-the-art performance in a multitude of applications. They however fall short in their ability to quantify confidence in their predictions - this is crucial in high-stakes applications that involve critical decision-making. Bayesian neural networks (BNNs) aim at solving this problem by placing a prior distribution over the network's parameters, thereby inducing a posterior distribution that encapsulates predictive uncertainty. While existing variants of BNNs based on Monte Carlo dropout produce reliable (albeit approximate) uncertainty estimates over in-distribution data, they tend to exhibit over-confidence in predictions made on target data whose feature distribution differs from the training data, i.e., the covariate shift setup. In this paper, we develop an approximate Bayesian inference scheme based on posterior regularisation, wherein unlabelled target data are used as "pseudo-labels" of model confidence that are used to regularise the model's loss on labelled source data. We show that this approach significantly improves the accuracy of uncertainty quantification on covariate-shifted data sets, with minimal modification to the underlying model architecture. We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.

Submitted 26 June, 2020; originally announced June 2020.

arXiv:2006.13707 [pdf, other]

Frequentist Uncertainty in Recurrent Neural Networks via Blockwise Influence Functions

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Recurrent neural networks (RNNs) are instrumental in modelling sequential and time-series data. Yet, when using RNNs to inform decision-making, predictions by themselves are not sufficient; we also need estimates of predictive uncertainty. Existing approaches for uncertainty quantification in RNNs are based predominantly on Bayesian methods; these are computationally prohibitive, and require major alterations to the RNN architecture and training. Capitalizing on ideas from classical jackknife resampling, we develop a frequentist alternative that: (a) does not interfere with model training or compromise its accuracy, (b) applies to any RNN architecture, and (c) provides theoretical coverage guarantees on the estimated uncertainty intervals. Our method derives predictive uncertainty from the variability of the (jackknife) sampling distribution of the RNN outputs, which is estimated by repeatedly deleting blocks of (temporally-correlated) training data, and collecting the predictions of the RNN re-trained on the remaining data. To avoid exhaustive re-training, we utilize influence functions to estimate the effect of removing training data blocks on the learned RNN parameters. Using data from a critical care setting, we demonstrate the utility of uncertainty quantification in sequential decision-making.

Submitted 27 June, 2020; v1 submitted 20 June, 2020; originally announced June 2020.

arXiv:2005.08837 [pdf, other]

When and How to Lift the Lockdown? Global COVID-19 Scenario Analysis and Policy Assessment using Compartmental Gaussian Processes

Authors: Zhaozhi Qian, Ahmed M. Alaa, Mihaela van der Schaar

Abstract: The coronavirus disease 2019 (COVID-19) global pandemic has led many countries to impose unprecedented lockdown measures in order to slow down the outbreak. Questions on whether governments have acted promptly enough, and whether lockdown measures can be lifted soon have since been central in public discourse. Data-driven models that predict COVID-19 fatalities under different lockdown policy scenarios are essential for addressing these questions and informing governments on future policy directions. To this end, this paper develops a Bayesian model for predicting the effects of COVID-19 lockdown policies in a global context -- we treat each country as a distinct data point, and exploit variations of policies across countries to learn country-specific policy effects. Our model utilizes a two-layer Gaussian process (GP) prior -- the lower layer uses a compartmental SEIR (Susceptible, Exposed, Infected, Recovered) model as a prior mean function with "country-and-policy-specific" parameters that capture fatality curves under "counterfactual" policies within each country, whereas the upper layer is shared across all countries, and learns lower-layer SEIR parameters as a function of a country's features and its policy indicators. Our model combines the solid mechanistic foundations of SEIR models (Bayesian priors) with the flexible data-driven modeling and gradient-based optimization routines of machine learning (Bayesian posteriors) -- i.e., the entire model is trained end-to-end via stochastic variational inference. We compare the projections of COVID-19 fatalities by our model with other models listed by the Center for Disease Control (CDC), and provide scenario analyses for various lockdown and reopening strategies highlighting their impact on COVID-19 fatalities.

Submitted 3 June, 2020; v1 submitted 13 May, 2020; originally announced May 2020.
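
As background for the compartmental SEIR model mentioned in the abstract above, here is a minimal sketch of a textbook SEIR simulation using forward-Euler steps. It is not the paper's model: there is no Gaussian process layer, no fatality compartment, and no policy-dependent parameters. The function name, the parameter names (beta, sigma, gamma), and the numeric values in the usage example are assumptions for illustration only.

```python
def simulate_seir(beta, sigma, gamma, s0, e0, i0, r0, days, dt=0.1):
    """Simulate a basic SEIR model and return one (S, E, I, R) tuple per day.

    beta: transmission rate, sigma: rate of leaving the exposed state,
    gamma: recovery rate. All rates are per day (illustrative assumptions).
    """
    s, e, i, r = float(s0), float(e0), float(i0), float(r0)
    population = s + e + i + r
    steps_per_day = round(1 / dt)
    trajectory = [(s, e, i, r)]
    for _day in range(days):
        for _step in range(steps_per_day):
            # Flows between compartments over one Euler step of length dt.
            new_exposed = beta * s * i / population * dt
            new_infectious = sigma * e * dt
            new_recovered = gamma * i * dt
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        trajectory.append((s, e, i, r))
    return trajectory

if __name__ == "__main__":
    # Lowering beta is a crude stand-in for a stricter lockdown scenario.
    for label, beta in [("baseline", 0.5), ("reduced contact", 0.2)]:
        final = simulate_seir(beta, 0.2, 0.1, 990, 0, 10, 0, days=60)[-1]
        print(label, [round(x, 1) for x in final])
```

Comparing runs with different beta values gives a rough sense of scenario analysis; the abstract describes learning such parameters from country features and policy indicators instead of fixing them by hand.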

arXiv:2002.04083 [pdf, other]

Estimating Counterfactual Treatment Outcomes over Time Through Adversarially Balanced Representations

Authors: Ioana Bica, Ahmed M. Alaa, James Jordon, Mihaela van der Schaar

Abstract: Identifying when to give treatments to patients and how to select among multiple treatments over time are important medical problems with few existing solutions. In this paper, we introduce the Counterfactual Recurrent Network (CRN), a novel sequence-to-sequence model that leverages the increasingly available patient observational data to estimate treatment effects over time and answer such medical questions. To handle the bias from time-varying confounders, covariates affecting the treatment assignment policy in the observational data, CRN uses domain adversarial training to build balancing representations of the patient history. At each timestep, CRN constructs a treatment invariant representation which removes the association between patient history and treatment assignments and thus can be reliably used for making counterfactual predictions. On a simulated model of tumour growth, with varying degree of time-dependent confounding, we show how our model achieves lower error in estimating counterfactuals and in choosing the correct treatment and timing of treatment than current state-of-the-art methods.

Submitted 10 February, 2020; originally announced February 2020.

Journal ref: In Proc. 8th International Conference on Learning Representations (ICLR 2020)

arXiv:2001.02585 [pdf, other]

Learning Dynamic and Personalized Comorbidity Networks from Event Data using Deep Diffusion Processes

Authors: Zhaozhi Qian, Ahmed M. Alaa, Alexis Bellot, Jem Rashbass, Mihaela van der Schaar

Abstract: Comorbid diseases co-occur and progress via complex temporal patterns that vary among individuals. In electronic health records we can observe the different diseases a patient has, but can only infer the temporal relationship between each co-morbid condition. Learning such temporal patterns from event data is crucial for understanding disease pathology and predicting prognoses. To this end, we develop deep diffusion processes (DDP) to model "dynamic comorbidity networks", i.e., the temporal relationships between comorbid disease onsets expressed through a dynamic graph. A DDP comprises events modelled as a multi-dimensional point process, with an intensity function parameterized by the edges of a dynamic weighted graph. The graph structure is modulated by a neural network that maps patient history to edge weights, enabling rich temporal representations for disease trajectories. The DDP parameters decouple into clinically meaningful components, which enables serving the dual purpose of accurate risk prediction and intelligible representation of disease pathology. We illustrate these features in experiments using cancer registry data.

Submitted 19 January, 2020; v1 submitted 8 January, 2020; originally announced January 2020.

arXiv:1905.12280 [pdf, other]

Lifelong Bayesian Optimization

Authors: Yao Zhang, James Jordon, Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Automatic Machine Learning (Auto-ML) systems tackle the problem of automating the design of prediction models or pipelines for data science. In this paper, we present Lifelong Bayesian Optimization (LBO), an online, multitask Bayesian optimization (BO) algorithm designed to solve the problem of model selection for datasets arriving and evolving over time. To be suitable for "lifelong" Bayesian Optimization, an algorithm needs to scale with the ever increasing number of acquisitions and should be able to leverage past optimizations in learning the current best model. We cast the problem of model selection as a black-box function optimization problem. In LBO, we exploit the correlation between functions by using components of previously learned functions to speed up the learning process for newly arriving datasets. Experiments on real and synthetic data show that LBO outperforms standard BO algorithms applied repeatedly on the data.

Submitted 21 June, 2019; v1 submitted 29 May, 2019; originally announced May 2019.

Comments: 17 pages, 8 figures

arXiv:1902.00450 [pdf, other]

Time Series Deconfounder: Estimating Treatment Effects over Time in the Presence of Hidden Confounders

Authors: Ioana Bica, Ahmed M. Alaa, Mihaela van der Schaar

Abstract: The estimation of treatment effects is a pervasive problem in medicine. Existing methods for estimating treatment effects from longitudinal observational data assume that there are no hidden confounders, an assumption that is not testable in practice and, if it does not hold, leads to biased estimates. In this paper, we develop the Time Series Deconfounder, a method that leverages the assignment of multiple treatments over time to enable the estimation of treatment effects in the presence of multi-cause hidden confounders. The Time Series Deconfounder uses a novel recurrent neural network architecture with multitask output to build a factor model over time and infer latent variables that render the assigned treatments conditionally independent; then, it performs causal inference using these latent variables that act as substitutes for the multi-cause unobserved confounders. We provide a theoretical analysis for obtaining unbiased causal effects of time-varying exposures using the Time Series Deconfounder. Using both simulated and real data we show the effectiveness of our method in deconfounding the estimation of treatment responses over time.

Submitted 18 September, 2020; v1 submitted 1 February, 2019; originally announced February 2019.

Journal ref: In Proc. 37th International Conference on Machine Learning (ICML 2020)

arXiv:1810.10489 [pdf, other]

Forecasting Individualized Disease Trajectories using Interpretable Deep Learning

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Disease progression models are instrumental in predicting individual-level health trajectories and understanding disease dynamics. Existing models are capable of providing either accurate predictions of patients' prognoses or clinically interpretable representations of disease pathophysiology, but not both. In this paper, we develop the phased attentive state space (PASS) model of disease progression, a deep probabilistic model that captures complex representations for disease progression while maintaining clinical interpretability. Unlike Markovian state space models which assume memoryless dynamics, PASS uses an attention mechanism to induce "memoryful" state transitions, whereby repeatedly updated attention weights are used to focus on past state realizations that best predict future states. This gives rise to complex, non-stationary state dynamics that remain interpretable through the generated attention weights, which designate the relationships between the realized state variables for individual patients. PASS uses phased LSTM units (with time gates controlled by parametrized oscillations) to generate the attention weights in continuous time, which enables handling irregularly-sampled and potentially missing medical observations. Experiments on data from a real-world cohort of patients show that PASS successfully balances the tradeoff between accuracy and interpretability: it demonstrates superior predictive accuracy and learns insightful individual-level representations of disease progression.

Submitted 24 October, 2018; originally announced October 2018.

arXiv:1802.07207 [pdf, ps, other]

AutoPrognosis: Automated Clinical Prognostic Modeling via Bayesian Optimization with Structured Kernel Learning

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Clinical prognostic models derived from large-scale healthcare data can inform critical diagnostic and therapeutic decisions. To enable off-the-shelf usage of machine learning (ML) in prognostic research, we developed AUTOPROGNOSIS: a system for automating the design of predictive modeling pipelines tailored for clinical prognosis. AUTOPROGNOSIS optimizes ensembles of pipeline configurations efficiently using a novel batched Bayesian optimization (BO) algorithm that learns a low-dimensional decomposition of the pipelines' high-dimensional hyperparameter space in concurrence with the BO procedure. This is achieved by modeling the pipelines' performances as a black-box function with a Gaussian process prior, and modeling the similarities between the pipelines' baseline algorithms via a sparse additive kernel with a Dirichlet prior. Meta-learning is used to warmstart BO with external data from similar patient cohorts by calibrating the priors using an algorithm that mimics the empirical Bayes method. The system automatically explains its predictions by presenting the clinicians with logical association rules that link patients' features to predicted risk strata. We demonstrate the utility of AUTOPROGNOSIS using 10 major patient cohorts representing various aspects of cardiovascular patient care.

Submitted 20 February, 2018; originally announced February 2018.

arXiv:1712.08914 [pdf, ps, other]

doi 10.1109/JSTSP.2018.2848230

Bayesian Nonparametric Causal Inference: Information Rates and Learning Algorithms

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: We investigate the problem of estimating the causal effect of a treatment on individual subjects from observational data; this is a central problem in various application domains, including healthcare, social sciences, and online advertising. Within the Neyman-Rubin potential outcomes model, we use the Kullback-Leibler (KL) divergence between the estimated and true distributions as a measure of accuracy of the estimate, and we define the information rate of the Bayesian causal inference procedure as the (asymptotic equivalence class of the) expected value of the KL divergence between the estimated and true distributions as a function of the number of samples. Using Fano's method, we establish a fundamental limit on the information rate that can be achieved by any Bayesian estimator, and show that this fundamental limit is independent of the selection bias in the observational data. We characterize the Bayesian priors on the potential (factual and counterfactual) outcomes that achieve the optimal information rate. As a consequence, we show that a particular class of priors that have been widely used in the causal inference literature cannot achieve the optimal information rate. On the other hand, a broader class of priors can achieve the optimal information rate.
We go on to propose a prior adaptation procedure (which we call the information-based empirical Bayes procedure) that optimizes the Bayesian prior by maximizing an information-theoretic criterion on the recovered causal effects rather than maximizing the marginal likelihood of the observed (factual) data. Building on our analysis, we construct an information-optimal Bayesian causal inference algorithm.

Submitted 21 January, 2018; v1 submitted 24 December, 2017; originally announced December 2017.

arXiv:1706.05966 [pdf, ps, other]

Deep Counterfactual Networks with Propensity-Dropout

Authors: Ahmed M. Alaa, Michael Weisz, Mihaela van der Schaar

Abstract: We propose a novel approach for inferring the individualized causal effects of a treatment (intervention) from observational data. Our approach conceptualizes causal inference as a multitask learning problem; we model a subject's potential outcomes using a deep multitask network with a set of shared layers among the factual and counterfactual outcomes, and a set of outcome-specific layers. The impact of selection bias in the observational data is alleviated via a propensity-dropout regularization scheme, in which the network is thinned for every training example via a dropout probability that depends on the associated propensity score. The network is trained in alternating phases, where in each phase we use the training examples of one of the two potential outcomes (treated and control populations) to update the weights of the shared layers and the respective outcome-specific layers. Experiments conducted on data based on a real-world observational study show that our algorithm outperforms the state-of-the-art.

Submitted 19 June, 2017; originally announced June 2017.

arXiv:1705.07674 [pdf, ps, other]

Individualized Risk Prognosis for Critical Care Patients: A Multi-task Gaussian Process Model

Authors: Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar

Abstract: We report the development and validation of a data-driven real-time risk score that provides timely assessments for the clinical acuity of ward patients based on their temporal lab tests and vital signs, which allows for timely intensive care unit (ICU) admissions. Unlike the existing risk scoring technologies, the proposed score is individualized; it uses the electronic health record (EHR) data to cluster the patients based on their static covariates into subcohorts of similar patients, and then learns a separate temporal, non-stationary multi-task Gaussian Process (GP) model that captures the physiology of every subcohort. Experiments conducted on data from a heterogeneous cohort of 6,094 patients admitted to the Ronald Reagan UCLA medical center show that our risk score significantly outperforms the state-of-the-art risk scoring technologies, such as the Rothman index and MEWS, in terms of timeliness, true positive rate (TPR), and positive predictive value (PPV). In particular, the proposed score increases the AUC by 20% and 38% as compared to the Rothman index and MEWS respectively, and can predict ICU admissions 8 hours before clinicians at a PPV of 35% and a TPR of 50%. Moreover, we show that the proposed risk score allows for better decisions on when to discharge clinically stable patients from the ward, thereby improving the efficiency of hospital resource utilization.

Submitted 22 May, 2017; originally announced May 2017.

arXiv:1705.05267 [pdf, ps, other]

Learning from Clinical Judgments: Semi-Markov-Modulated Marked Hawkes Processes for Risk Prognosis

Authors: Ahmed M. Alaa, Scott Hu, Mihaela van der Schaar

Abstract: Critically ill patients in regular wards are vulnerable to unanticipated adverse events which require prompt transfer to the intensive care unit (ICU). To allow for accurate prognosis of deteriorating patients, we develop a novel continuous-time probabilistic model for a monitored patient's temporal sequence of physiological data. Our model captures "informatively sampled" patient episodes: the clinicians' decisions on when to observe a hospitalized patient's vital signs and lab tests over time are represented by a marked Hawkes process, with intensity parameters that are modulated by the patient's latent clinical states, and with observable physiological data (mark process) modeled as a switching multi-task Gaussian process. In addition, our model captures "informatively censored" patient episodes by representing the patient's latent clinical states as an absorbing semi-Markov jump process. The model parameters are learned from offline patient episodes in the electronic health records via an EM-based algorithm. Experiments conducted on a cohort of patients admitted to a major medical center over a 3-year period show that risk prognosis based on our model significantly outperforms the currently deployed medical risk scores and other baseline machine learning algorithms.

Submitted 15 May, 2017; originally announced May 2017.

arXiv:1704.03458 [pdf]

Personalized Survival Predictions for Cardiac Transplantation via Trees of Predictors

Authors: J. Yoon, W. R. Zame, A. Banerjee, M. Cadeiras, A. M. Alaa, M. van der Schaar

Abstract: Given the limited pool of donor organs, accurate predictions of survival on the wait list and post transplantation are crucial for cardiac transplantation decisions and policy. However, current clinical risk scores do not yield accurate predictions. We develop a new methodology (ToPs, Trees of Predictors) built on the principle that specific predictors should be used for specific clusters within the target population. ToPs discovers these specific clusters of patients and the specific predictor that performs best for each cluster. In comparison with current clinical risk scoring systems, our method provides significant improvements in the prediction of survival time on the wait list and post transplantation. For example, in terms of 3 month survival for patients who were on the US patient wait list in the period 1985 to 2015, our method achieves an AUC of 0.847, whereas the best commonly used clinical risk score (MAGGIC) achieves 0.630. In terms of 3 month survival/mortality predictions (in comparison to MAGGIC), holding specificity at 80.0 percent, our algorithm correctly predicts survival for 1,228 (26.0 percent) more patients out of 4,723 who actually survived; holding sensitivity at 80.0 percent, our algorithm correctly predicts mortality for 839 (33.0 percent) more patients out of 2,542 who did not survive. Our method achieves similar improvements for other time horizons and for predictions post transplantation.
Therefore, we offer a more accurate, personalized approach to survival analysis that can benefit patients, clinicians and policymakers in making clinical decisions and setting clinical policy. Because risk prediction is widely used in diagnostic and prognostic clinical decision making across diseases and clinical specialties, the implications of our methods are far reaching.

Submitted 11 April, 2017; originally announced April 2017.

Comments: Main manuscript: 20 pages, Supplementary materials: 13 pages, 5 figures, 3 tables. Submitted to Science Translational Medicine

arXiv:1704.02801 [pdf, other]

Bayesian Inference of Individualized Treatment Effects using Multi-task Gaussian Processes

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Predicated on the increasing abundance of electronic health records, we investigate the problem of inferring individualized treatment effects using observational data. Stemming from the potential outcomes model, we propose a novel multi-task learning framework in which factual and counterfactual outcomes are modeled as the outputs of a function in a vector-valued reproducing kernel Hilbert space (vvRKHS). We develop a nonparametric Bayesian method for learning the treatment effects using a multi-task Gaussian process (GP) with a linear coregionalization kernel as a prior over the vvRKHS. The Bayesian approach allows us to compute individualized measures of confidence in our estimates via pointwise credible intervals, which are crucial for realizing the full potential of precision medicine. The impact of selection bias is alleviated via a risk-based empirical Bayes method for adapting the multi-task GP prior, which jointly minimizes the empirical error in factual outcomes and the uncertainty in (unobserved) counterfactual outcomes. We conduct experiments on observational datasets for an interventional social program applied to premature infants, and a left ventricular assist device applied to cardiac patients wait-listed for a heart transplant. In both experiments, we show that our method significantly outperforms the state-of-the-art.

Submitted 28 May, 2017; v1 submitted 10 April, 2017; originally announced April 2017.

arXiv:1612.06007 [pdf, ps, other]

A Hidden Absorbing Semi-Markov Model for Informatively Censored Temporal Data: Learning and Inference

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: Modeling continuous-time physiological processes that manifest a patient's evolving clinical states is a key step in approaching many problems in healthcare. In this paper, we develop the Hidden Absorbing Semi-Markov Model (HASMM): a versatile probabilistic model that is capable of capturing the modern electronic health record (EHR) data. Unlike existing models, an HASMM accommodates irregularly sampled, temporally correlated, and informatively censored physiological data, and can describe non-stationary clinical state transitions. Learning an HASMM from the EHR data is achieved via a novel forward-filtering backward-sampling Monte-Carlo EM algorithm that exploits the knowledge of the end-point clinical outcomes (informative censoring) in the EHR data, and implements the E-step by sequentially sampling the patients' clinical states in the reverse-time direction while conditioning on the future states. Real-time inferences are drawn via a forward-filtering algorithm that operates on a virtually constructed discrete-time embedded Markov chain that mirrors the patient's continuous-time state trajectory. We demonstrate the diagnostic and prognostic utility of the HASMM in a critical care prognosis setting using a real-world dataset for patients admitted to the Ronald Reagan UCLA Medical Center.

Submitted 27 December, 2016; v1 submitted 18 December, 2016; originally announced December 2016.

arXiv:1611.05146 [pdf, ps, other]

A Semi-Markov Switching Linear Gaussian Model for Censored Physiological Data

Authors: Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar

Abstract: Critically ill patients in regular wards are vulnerable to unanticipated clinical deterioration which requires timely transfer to the intensive care unit (ICU). To allow for risk scoring and patient monitoring in such a setting, we develop a novel Semi-Markov Switching Linear Gaussian Model (SSLGM) for the inpatients' physiology. The model captures the patients' latent clinical states and their corresponding observable lab tests and vital signs. We present an efficient unsupervised learning algorithm that capitalizes on the informatively censored data in the electronic health records (EHR) to learn the parameters of the SSLGM; the learned model is then used to assess the new inpatients' risk for clinical deterioration in an online fashion, allowing for timely ICU admission. Experiments conducted on a heterogeneous cohort of 6,094 patients admitted to a large academic medical center show that the proposed model significantly outperforms the currently deployed risk scores such as Rothman index, MEWS, SOFA and APACHE.

Submitted 16 November, 2016; originally announced November 2016.

arXiv:1611.03934 [pdf, other]

Personalized Donor-Recipient Matching for Organ Transplantation

Authors: Jinsung Yoon, Ahmed M. Alaa, Martin Cadeiras, Mihaela van der Schaar

Abstract: Organ transplants can improve the life expectancy and quality of life for the recipient but carry the risk of serious post-operative complications, such as septic shock and organ rejection. The probability of a successful transplant depends in a very subtle fashion on compatibility between the donor and the recipient, but current medical practice is short of domain knowledge regarding the complex nature of recipient-donor compatibility. Hence a data-driven approach for learning compatibility has the potential for significant improvements in match quality. This paper proposes a novel system (ConfidentMatch) that is trained using data from electronic health records. ConfidentMatch predicts the success of an organ transplant (in terms of the 3 year survival rates) on the basis of clinical and demographic traits of the donor and recipient. ConfidentMatch captures the heterogeneity of the donor and recipient traits by optimally dividing the feature space into clusters and constructing different optimal predictive models for each cluster. The system controls the complexity of the learned predictive model in a way that allows for assuring more granular and confident predictions for a larger number of potential recipient-donor pairs, thereby ensuring that predictions are "personalized" and tailored to individual characteristics to the finest possible granularity.
Experiments conducted on the UNOS heart transplant dataset show the superiority of the prognostic value of ConfidentMatch to other competing benchmarks; ConfidentMatch can provide predictions of success with 95% confidence for 5,489 patients of a total population of 9,620 patients, which corresponds to 410 more patients than the most competitive benchmark algorithm (DeepBoost).

Submitted 11 November, 2016; originally announced November 2016.

arXiv:1610.08853 [pdf, ps, other]

Personalized Risk Scoring for Critical Care Prognosis using Mixtures of Gaussian Processes

Authors: Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar

Abstract: Objective: In this paper, we develop a personalized real-time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs; the proposed risk scoring system ensures timely intensive care unit (ICU) admissions for clinically deteriorating patients. Methods: The risk scoring system learns a set of latent patient subtypes from the offline electronic health record data, and trains a mixture of Gaussian Process (GP) experts, where each expert models the physiological data streams associated with a specific patient subtype. Transfer learning techniques are used to learn the relationship between a patient's latent subtype and her static admission information (e.g. age, gender, transfer status, ICD-9 codes, etc). Results: Experiments conducted on data from a heterogeneous cohort of 6,321 patients admitted to Ronald Reagan UCLA medical center show that our risk score significantly and consistently outperforms the currently deployed risk scores, such as the Rothman index, MEWS, APACHE and SOFA scores, in terms of timeliness, true positive rate (TPR), and positive predictive value (PPV). Conclusion: Our results reflect the importance of adopting the concepts of personalized medicine in critical care settings; significant accuracy and timeliness gains can be achieved by accounting for the patients' heterogeneity.
Significance: The proposed risk scoring methodology can confer huge clinical and social benefits on more than 200,000 critically ill inpatients who exhibit cardiac arrests in the US every year.

Submitted 27 October, 2016; originally announced October 2016.

arXiv:1610.07505 [pdf, ps, other]

Balancing Suspense and Surprise: Timely Decision Making with Endogenous Information Acquisition

Authors: Ahmed M. Alaa, Mihaela van der Schaar

Abstract: We develop a Bayesian model for decision-making under time pressure with endogenous information acquisition. In our model, the decision maker decides when to observe (costly) information by sampling an underlying continuous-time stochastic process (time series) that conveys information about the potential occurrence or non-occurrence of an adverse event which will terminate the decision-making process. In her attempt to predict the occurrence of the adverse event, the decision-maker follows a policy that determines when to acquire information from the time series (continuation), and when to stop acquiring information and make a final prediction (stopping). We show that the optimal policy has a rendezvous structure, i.e. a structure in which whenever a new information sample is gathered from the time series, the optimal "date" for acquiring the next sample becomes computable. The optimal interval between two information samples balances a trade-off between the decision maker's surprise, i.e. the drift in her posterior belief after observing new information, and suspense, i.e. the probability that the adverse event occurs in the time interval between two information samples. Moreover, we characterize the continuation and stopping regions in the decision-maker's state-space, and show that they depend not only on the decision-maker's beliefs, but also on the context, i.e. the current realization of the time series.

Submitted 24 October, 2016; originally announced October 2016.

arXiv:1605.00959 [pdf, ps, other]

Personalized Risk Scoring for Critical Care Patients using Mixtures of Gaussian Process Experts

Authors: Ahmed M. Alaa, Jinsung Yoon, Scott Hu, Mihaela van der Schaar

Abstract: We develop a personalized real-time risk scoring algorithm that provides timely and granular assessments for the clinical acuity of ward patients based on their (temporal) lab tests and vital signs. Heterogeneity of the patient population is captured via a hierarchical latent class model. The proposed algorithm aims to discover the number of latent classes in the patient population, and train a mixture of Gaussian Process (GP) experts, where each expert models the physiological data streams associated with a specific class. Self-taught transfer learning is used to transfer the knowledge of latent classes learned from the domain of clinically stable patients to the domain of clinically deteriorating patients. For new patients, the posterior beliefs of all GP experts about the patient's clinical status given her physiological data stream are computed, and a personalized risk score is evaluated as a weighted average of those beliefs, where the weights are learned from the patient's hospital admission information. Experiments on a heterogeneous cohort of 6,313 patients admitted to Ronald Reagan UCLA Medical Center show that our risk score outperforms the currently deployed risk scores, such as MEWS and Rothman scores.

Submitted 3 May, 2016; originally announced May 2016.

arXiv:1602.00374 [pdf, ps, other]

ConfidentCare: A Clinical Decision Support System for Personalized Breast Cancer Screening

Authors: Ahmed M. Alaa, Kyeong H. Moon, William Hsu, Mihaela van der Schaar

Abstract: Breast cancer screening policies attempt to achieve timely diagnosis by the regular screening of apparently healthy women. Various clinical decisions are needed to manage the screening process; those include: selecting the screening tests for a woman to take, interpreting the test outcomes, and deciding whether or not a woman should be referred to a diagnostic test. Such decisions are currently guided by clinical practice guidelines (CPGs), which represent a one-size-fits-all approach that is designed to work well on average for a population, without guaranteeing that it will work well uniformly over that population. Since the risks and benefits of screening are functions of each patient's features, personalized screening policies that are tailored to the features of individuals are needed in order to ensure that the right tests are recommended to the right woman. In order to address this issue, we present ConfidentCare: a computer-aided clinical decision support system that learns a personalized screening policy from the electronic health record (EHR) data. ConfidentCare operates by recognizing clusters of similar patients, and learning the best screening policy to adopt for each cluster. A cluster of patients is a set of patients with similar features (e.g. age, breast density, family history, etc.), and the screening policy is a set of guidelines on what actions to recommend for a woman given her features and screening test scores. The ConfidentCare algorithm ensures that the policy adopted for every cluster of patients satisfies a predefined accuracy requirement with a high level of confidence. We show that our algorithm outperforms the current CPGs in terms of cost-efficiency and false positive rates.

Submitted 31 January, 2016; originally announced February 2016.

arXiv:1511.02429 [pdf, ps, other]

A Micro-foundation of Social Capital in Evolving Social Networks

Authors: Ahmed M. Alaa, Kartik Ahuja, Mihaela van der Schaar

Abstract: A social network confers benefits and advantages on individuals (and on groups); the literature refers to these advantages as social capital. This paper presents a micro-founded mathematical model of the evolution of a social network and of the social capital of individuals within the network. The evolution of the network is influenced by the extent to which individuals are homophilic, structurally opportunistic, socially gregarious and by the distribution of types in the society. In the analysis, we identify different kinds of social capital: bonding capital, popularity capital, and bridging capital. Bonding capital is created by forming a circle of connections; homophily increases bonding capital because it makes this circle of connections more homogeneous. Popularity capital leads to preferential attachment: individuals who become popular tend to become more popular because others are more likely to link to them. Homophily creates asymmetries in the levels of popularity attained by different social groups; more gregarious types of agents are more likely to become popular. However, in homophilic societies, individuals who belong to less gregarious, less opportunistic, or major types are likely to be more central in the network and thus acquire bridging capital.

Submitted 7 November, 2015; originally announced November 2015.

Comments: Centrality, homophily, network formation, popularity, preferential attachment, social capital, social networks

arXiv:1508.00205 [pdf, ps, other]

Evolution of Social Networks: A Microfounded Model

Authors: Ahmed M. Alaa, Kartik Ahuja, Mihaela van der Schaar

Abstract: Many societies are organized in networks that are formed by people who meet and interact over time. In this paper, we present a first model to capture the micro-foundations of social networks evolution, where boundedly rational agents of different types join the network; meet other agents stochastically over time; and consequently decide to form social ties. A basic premise of our model is that in real-world networks, agents form links by reasoning about the benefits that agents they meet over time can bestow. We study the evolution of the emerging networks in terms of friendship and popularity acquisition given the following exogenous parameters: structural opportunism, type distribution, homophily, and social gregariousness. We show that the time needed for an agent to find "friends" is influenced by the exogenous parameters: agents who are more gregarious, more homophilic, less opportunistic, or belong to a type "minority" spend a longer time on average searching for friendships. Moreover, we show that preferential attachment is a consequence of an emerging doubly preferential meeting process: a process that guides agents of a certain type to meet more popular similar-type agents with a higher probability, thereby creating asymmetries in the popularity evolution of different types of agents.

Submitted 14 August, 2015; v1 submitted 2 August, 2015; originally announced August 2015.

arXiv:1503.04768 [pdf, ps, other]

Self-organizing Networks of Information Gathering Cognitive Agents

Authors: Ahmed M. Alaa, Kartik Ahuja, Mihaela Van der Schaar

Abstract: In many scenarios, networks emerge endogenously as cognitive agents establish links in order to exchange information. Network formation has been widely studied in economics, but only on the basis of simplistic models that assume that the value of each additional piece of information is constant. In this paper we present a first model and associated analysis for network formation under the much more realistic assumption that the value of each additional piece of information depends on the type of that piece of information and on the information already possessed: information may be complementary or redundant. We model the formation of a network as a non-cooperative game in which the actions are the formation of links and the benefit of forming a link is the value of the information exchanged minus the cost of forming the link. We characterize the topologies of the networks emerging at a Nash equilibrium (NE) of this game and compare the efficiency of equilibrium networks with the efficiency of centrally designed networks. To quantify the impact of information redundancy and linking cost on social information loss, we provide estimates for the Price of Anarchy (PoA); to quantify the impact on individual information loss we introduce and provide estimates for a measure we call Maximum Information Loss (MIL). Finally, we consider the setting in which agents are not endowed with information, but must produce it. We show that the validity of the well-known "law of the few" depends on how information aggregates; in particular, the "law of the few" fails when information displays complementarities.

Submitted 12 August, 2015; v1 submitted 16 March, 2015; originally announced March 2015.

arXiv:1408.6427 [pdf, ps, other]

Achievable Degrees-of-Freedom of the K-user SISO Interference Channel with Blind Interference Alignment using Staggered Antenna Switching

Authors: Ahmed M. Alaa, Mahmoud H. Ismail

Abstract: In this letter, we present the first characterization for the achievable Degrees-of-Freedom (DoF) by Blind Interference Alignment (BIA) using staggered antenna switching in the $K$-user Gaussian Interference Channel. In such a scheme, each transmitter is equipped with one conventional antenna and each receiver is equipped with one reconfigurable (multi-mode) antenna. Assuming that the channel is known to the receivers only, we show that BIA can achieve $\frac{2K}{K+2}$ DoF, which surpasses the sum DoF achieved by previously known interference alignment schemes with delayed channel state information at transmitters (CSIT). This result implies that the sum DoF is upper bounded by 2, which means that the best we can do with BIA is to double the DoF achieved by orthogonal multiple access schemes. Moreover, we propose an algorithm to generate the transmit beamforming vectors and the reconfigurable antenna switching patterns, and apply this algorithm to the 4-user SISO Interference Channel, showing that $\frac{4}{3}$ sum DoF is achievable.

Submitted 21 May, 2016; v1 submitted 27 August, 2014; originally announced August 2014.
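
Editorial sketch for the entry above: the abstract states an achievable sum DoF of $\frac{2K}{K+2}$, a value of $\frac{4}{3}$ for the 4-user case, and an upper bound of 2. The short Python check below only evaluates that stated expression; the function name `bia_sum_dof` is a hypothetical helper introduced here, not something taken from the paper.

```python
from fractions import Fraction


def bia_sum_dof(k: int) -> Fraction:
    """Evaluate the sum-DoF expression 2K/(K+2) quoted in the abstract.

    `k` is the number of users K. Exact rational arithmetic is used so the
    4-user value can be compared with 4/3 without rounding.
    """
    if k < 1:
        raise ValueError("K must be a positive integer")
    return Fraction(2 * k, k + 2)


if __name__ == "__main__":
    for k in (2, 4, 10, 1000):
        value = bia_sum_dof(k)
        print(f"K={k}: sum DoF = {value} (about {float(value):.4f})")
```

The expression increases with K and stays strictly below 2, which is consistent with the abstract's remark that BIA can at best double the single DoF of orthogonal multiple access.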

arXiv:1408.1025 [pdf, ps, other]

Stable Throughput Region of Cognitive-Relay Networks with Imperfect Sensing and Finite Relaying Buffer

Authors: Ahmed M. Alaa

Abstract: In this letter, we obtain the stable throughput region for a cognitive relaying scheme with a finite relaying buffer and imperfect sensing. The analysis investigates the effect of the secondary user's finite relaying capabilities under different scenarios of primary, secondary and relaying links outages. Furthermore, we demonstrate the effect of miss detection and false alarm probabilities on the achievable throughput for the primary and secondary users.

Submitted 18 March, 2018; v1 submitted 5 August, 2014; originally announced August 2014.

arXiv:1407.1383 [pdf, ps, other]

Opportunistic Beamforming using Dumb Basis Patterns in Multiple Access Cognitive Channels

Authors: Ahmed M. Alaa, Mahmoud H. Ismail, Hazim Tawfik

Abstract: In this paper, we investigate multiuser diversity in interference-limited Multiple Access (MAC) underlay cognitive channels with Line-of-Sight interference (LoS) from the secondary to the primary network. It is shown that for $N$ secondary users, and assuming Rician interference channels, the secondary sum capacity scales like $\log\left(\frac{K^{2}+K}{\mathcal{W}\left(\frac{K e^{K}}{N}\right)}\right)$, where $K$ is the $K$-factor of the Rician channels, and $\mathcal{W}(.)$ is the Lambert W function. Thus, LoS interference hinders the achievable multiuser diversity gain experienced in Rayleigh channels, where the sum capacity grows like $\log(N)$. To overcome this problem, we propose the usage of single radio Electronically Steerable Parasitic Array Radiator (ESPAR) antennas at the secondary mobile terminals. Using ESPAR antennas, we induce artificial fluctuations in the interference channels to restore the $\log(N)$ growth rate by assigning random weights to orthogonal {\it basis patterns}. We term this technique as {\it Random Aerial Beamforming} (RAB). While LoS interference is originally a source of capacity hindrance, we show that using RAB, it can actually be exploited to improve multiuser interference diversity by boosting the {\it effective number of users} with minimal hardware complexity.

Submitted 5 July, 2014; originally announced July 2014.
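
Editorial sketch for the entry above: the abstract quotes a scaling expression involving the Lambert W function. The Python below evaluates that expression numerically under two assumptions that are not in the source: the logarithm is taken as natural, and the Lambert W function is computed with a small Newton iteration written for this note (restricted to positive arguments). The names `lambert_w` and `los_sum_capacity_scaling` are hypothetical.

```python
import math


def lambert_w(x: float, tol: float = 1e-14, max_iter: int = 100) -> float:
    """Principal branch of the Lambert W function for x > 0 (Newton's method)."""
    if x <= 0:
        raise ValueError("this sketch only handles x > 0")
    # log(1 + x) is never below W(x) for x > 0, and w * exp(w) is convex
    # there, so Newton's method descends monotonically to the root.
    w = math.log1p(x)
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - x) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * max(1.0, abs(w)):
            break
    return w


def los_sum_capacity_scaling(k_factor: float, n_users: int) -> float:
    """Evaluate log((K^2 + K) / W(K * e^K / N)) from the abstract.

    Assumes a natural logarithm; k_factor is the Rician K-factor and
    n_users is the number of secondary users N.
    """
    arg = k_factor * math.exp(k_factor) / n_users
    return math.log((k_factor ** 2 + k_factor) / lambert_w(arg))


if __name__ == "__main__":
    for n in (1, 10, 100, 1000):
        print(n, los_sum_capacity_scaling(3.0, n))
```

A quick consistency check on the expression itself: with a single user (N = 1), W(K e^K) = K, so the expression reduces to log(K + 1).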

arXiv:1406.4162 [pdf]

Band-Sweeping M-ary PSK (BS-M-PSK) Modulation and Transceiver Design

Authors: Ahmed M. Alaa

Abstract: Channel estimation is a major problem encountered by receiver designers for wireless communications systems. The fading channels encountered by the system are usually time variant for a mobile receiver. Besides, the frequency response of the channel is frequency selective for urban environments where the delay spread is quite large compared to the symbol duration. Estimating the channel is essential for equalizing the received data and removing the Inter-Symbol Interference (ISI) resulting from the dispersive channel. Hence, conventional transceivers insert pilot symbols of known values and detect the changes in them in order to deduce the channel response. Because these pilots carry no information, the throughput of the system is reduced. A novel modulation scheme is presented in this work. The technique depends on using a carrier signal that has no fixed frequency: the carrier tone sweeps the band dedicated for transmission and detects the transfer function gain within the band. A carrier signal that is Frequency Modulated (FM) by a periodic ramp signal becomes Amplitude Modulated (AM) by the channel transfer function, and thus, the receiver obtains an estimate for the channel response without using pilots that decrease the system's throughput or data rate. The carrier signal itself acts as a dynamic frequency domain pilot. The technique only works for constant energy systems, and thus it is applied to PSK transceivers. Mathematical formulation, transceiver design and performance analysis of the proposed modulation technique are presented.

Submitted 30 January, 2014; originally announced June 2014.

Comments: To appear in IEEE Potentials Magazine

arXiv:1406.1724 [pdf, ps, other]

Random Aerial Beamforming for Underlay Cognitive Radio with Exposed Secondary Users

Authors: Ahmed M. Alaa, Mahmoud H. Ismail, Hazim Tawfik

Abstract: In this paper, we introduce the exposed secondary users problem in underlay cognitive radio systems, where both the secondary-to-primary and primary-to-secondary channels have a Line-of-Sight (LoS) component. Based on a Rician model for the LoS channels, we show, analytically and numerically, that LoS interference hinders the achievable secondary user capacity when interference constraints are imposed at the primary user receiver. This is caused by the poor dynamic range of the interference channels fluctuations when a dominant LoS component exists. In order to improve the capacity of such a system, we propose the usage of Electronically Steerable Parasitic Array Radiator (ESPAR) antennas at the secondary terminals. An ESPAR antenna involves a single RF chain and has a reconfigurable radiation pattern that is controlled by assigning arbitrary weights to M orthonormal basis radiation patterns via altering a set of reactive loads. By viewing the orthonormal patterns as multiple virtual dumb antennas, we randomly vary their weights over time creating artificial channel fluctuations that can perfectly eliminate the undesired impact of LoS interference. This scheme is termed as Random Aerial Beamforming (RAB), and is well suited for compact and low cost mobile terminals as it uses a single RF chain. Moreover, we investigate the exposed secondary users problem in a multiuser setting, showing that LoS interference hinders multiuser interference diversity and affects the growth rate of the SU capacity as a function of the number of users. Using RAB, we show that LoS interference can actually be exploited to improve multiuser diversity via opportunistic nulling.

Submitted 1 August, 2015; v1 submitted 6 June, 2014; originally announced June 2014.

arXiv:1404.6737 [pdf, ps, other]

On the Capacity of the Underwater Acoustic Channel with Dominant Noise Sources

Authors: Mustafa A. Kishk, Ahmed M. Alaa

Abstract: This paper provides an upper bound for the capacity of the underwater acoustic (UWA) channel with dominant noise sources and generalized fading environments. Previous works have shown that UWA channel noise statistics are not necessarily Gaussian, especially in a shallow water environment which is dominated by impulsive noise sources. In this case, noise is best represented by the Generalized Gaussian (GG) noise model with a shaping parameter $β$. On the other hand, fading in the UWA channel is generally represented using an $α$-$μ$ distribution, which is a generalization of a wide range of well known fading distributions. We show that the Additive White Generalized Gaussian Noise (AWGGN) channel capacity is upper bounded by the AWGN capacity in addition to a constant gap of $\frac{1}{2} \log \left(\frac{β^{2} πe^{1-\frac{2}β} Γ(\frac{3}β)}{2(Γ(\frac{1}β))^{3}} \right)$ bits. The same gap also exists when characterizing the ergodic capacity of AWGGN channels with $α$-$μ$ fading compared to the faded AWGN channel capacity. We justify our results by revisiting the sphere-packing problem, which represents a geometric interpretation of the channel capacity. Moreover, UWA channel secrecy rates are characterized and the dependency of UWA channel secrecy on the shaping parameters of the legitimate and eavesdropper channels is highlighted.

Submitted 27 April, 2014; originally announced April 2014.
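
Editorial sketch for the entry above: the abstract gives a closed-form capacity gap in terms of the shaping parameter $β$. The Python below evaluates that expression as written, assuming a base-2 logarithm because the gap is stated in bits (the base is not spelled out in the abstract). The function name `awggn_gap_bits` is hypothetical.

```python
import math


def awggn_gap_bits(beta: float) -> float:
    """Evaluate the gap quoted in the abstract:

        0.5 * log( beta^2 * pi * e^(1 - 2/beta) * Gamma(3/beta)
                   / (2 * Gamma(1/beta)^3) )

    with the logarithm taken base 2 (an assumption, since the gap is in bits).
    """
    if beta <= 0:
        raise ValueError("the shaping parameter beta must be positive")
    numerator = (beta ** 2) * math.pi * math.exp(1.0 - 2.0 / beta) * math.gamma(3.0 / beta)
    denominator = 2.0 * math.gamma(1.0 / beta) ** 3
    return 0.5 * math.log2(numerator / denominator)


if __name__ == "__main__":
    for beta in (0.5, 1.0, 2.0, 4.0):
        print(f"beta={beta}: gap = {awggn_gap_bits(beta):.6f} bits")
```

As a sanity check on the quoted expression, $β = 2$ corresponds to Gaussian noise, and the argument of the logarithm then equals 1, so the gap vanishes. For $β = 1$ the argument simplifies to $π/e$.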

arXiv:1403.7532 [pdf, ps, other]

Opportunistic Spectrum Sharing using Dumb Basis Patterns: The Line-of-Sight Interference Scenario

Authors: Ahmed M. Alaa, Mahmoud H. Ismail, Hazim Tawfik

Abstract: We investigate a spectrum-sharing system with non-severely faded mutual interference links, where both the secondary-to-primary and primary-to-secondary channels have a Line-of-Sight (LoS) component. Based on a Rician model for the LoS channels, we show, analytically and numerically, that LoS interference hinders the achievable secondary user capacity. This is caused by the poor dynamic range of the interference channels fluctuations when a dominant LoS component exists. In order to improve the capacity of such a system, we propose the usage of an Electronically Steerable Parasitic Array Radiator (ESPAR) antenna at the secondary terminals. An ESPAR antenna requires a single RF chain and has a reconfigurable radiation pattern that is controlled by assigning arbitrary weights to M orthonormal basis radiation patterns. By viewing these orthonormal patterns as multiple virtual dumb antennas, we randomly vary their weights over time creating artificial channel fluctuations that can perfectly eliminate the undesired impact of LoS interference. Because the proposed scheme uses a single RF chain, it is well suited for compact and low cost mobile terminals.

Submitted 28 March, 2014; originally announced March 2014.

arXiv:1403.0930 [pdf, ps, other]

Spectrum Sensing Via Reconfigurable Antennas: Fundamental Limits and Potential Gains

Authors: Ahmed M. Alaa, Mahmoud H. Ismail, Hazim Tawfik

Abstract: We propose a novel paradigm for spectrum sensing in cognitive radio networks that provides diversity and capacity benefits using a single antenna at the Secondary User (SU) receiver. The proposed scheme is based on a reconfigurable antenna: an antenna that is capable of altering its radiation characteristics by changing its geometric configuration. Each configuration is designated as an antenna mode or state and corresponds to a distinct channel realization. Based on an abstract model for the reconfigurable antenna, we tackle two different settings for the cognitive radio problem and present fundamental limits on the achievable diversity and throughput gains. First, we explore the (to cooperate or not to cooperate) tradeoff between the diversity and coding gains in conventional cooperative and noncooperative spectrum sensing schemes, showing that cooperation is not always beneficial. Based on this analysis, we propose two sensing schemes based on reconfigurable antennas that we term as state switching and state selection. It is shown that each of these schemes outperforms both cooperative and non-cooperative spectrum sensing under a global energy constraint. Next, we study the (sensing-throughput) trade-off, and demonstrate that using reconfigurable antennas, the optimal sensing time is reduced allowing for a longer transmission time, and thus better throughput. Moreover, state selection can be applied to boost the capacity of SU transmission.

Submitted 4 March, 2014; originally announced March 2014.

arXiv:1402.6243 [pdf, ps, other]

Globally Optimal Cooperation in Dense Cognitive Radio Networks

Authors: Ahmed M. Alaa, Omar A. Nasr

Abstract: The problem of calculating the local and global decision thresholds in hard decisions based cooperative spectrum sensing is well known for its mathematical intractability. Previous work relied on simple suboptimal counting rules for decision fusion in order to avoid the exhaustive numerical search required for obtaining the optimal thresholds. However, these simple rules are not globally optimal as they do not maximize the overall global detection probability by jointly selecting local and global thresholds. Instead, they maximize the detection probability for a specific global threshold. In this paper, a globally optimal decision fusion rule for Primary User signal detection based on the Neyman-Pearson (NP) criterion is derived. The algorithm is based on a novel representation for the global performance metrics in terms of the regularized incomplete beta function. Based on this mathematical representation, it is shown that the globally optimal NP hard decision fusion test can be put in the form of a conventional one dimensional convex optimization problem. A binary search for the global threshold can be applied yielding a complexity of O(log2(N)), where N represents the number of cooperating users. The logarithmic complexity is appreciated because we are concerned with dense networks, and thus N is expected to be large. The proposed optimal scheme outperforms conventional counting rules, such as the OR, AND, and MAJORITY rules. It is shown via simulations that, although the optimal rule tends to the simple OR rule when the number of cooperating secondary users is small, it offers significant SNR gain in dense cognitive radio networks with a large number of cooperating users.

Submitted 25 February, 2014; originally announced February 2014.

arXiv:1402.0993 [pdf, ps, other]

Defeating the Eavesdropper: On the Achievable Secrecy Capacity using Reconfigurable Antennas

Authors: Ahmed M. Alaa

Abstract: In this paper, we consider the transmission of confidential messages over slow fading wireless channels in the presence of an eavesdropper. We propose a transmission scheme that employs a single reconfigurable antenna at each of the legitimate partners, whereas the eavesdropper uses a single conventional antenna. A reconfigurable antenna can switch its propagation characteristics over time and thus it perceives different fading channels. It is shown that without channel side information (CSI) at the legitimate partners, the main channel can be transformed into an ergodic regime offering a \textit{secrecy capacity} gain for strict outage constraints. If the legitimate partners have partial or full channel side information (CSI), a sort of selection diversity can be applied boosting the maximum secret communication rate. In this case, fading acts as a friend not a foe.

Submitted 5 February, 2014; originally announced February 2014.

arXiv:1401.7772 [pdf, ps, other]

doi 10.1109/WCNC.2014.6952239

Spectrum Sensing Via Reconfigurable Antennas: Is Cooperation of Secondary Users Indispensable?

Authors: Ahmed M. Alaa, Mahmoud H. Ismail, Hazim Tawfik

Abstract: This work presents an analytical framework for characterizing the performance of cooperative and noncooperative spectrum sensing schemes by figuring out the tradeoff between the achieved diversity and coding gains in each scheme. Based on this analysis, we try to answer the fundamental question: can we dispense with SUs cooperation and still achieve an arbitrary diversity gain? It is shown that this is indeed possible via a novel technique that can offer diversity gain for a single SU using a single antenna. The technique is based on the usage of a reconfigurable antenna that changes its propagation characteristics over time, thus creating an artificial temporal diversity. It is shown that the usage of reconfigurable antennas outperforms cooperative as well as non-cooperative schemes at low and high Signal-to-Noise Ratios (SNRs). Moreover, if the channel state information is available at the SU, an additional SNR gain can also be achieved.

Submitted 30 January, 2014; originally announced January 2014.

Comments: Accepted for IEEE WCNC 2014

Showing 1–48 of 48 results for author: Alaa, A M