Showing 1–28 of 28 results for author: Awan, J

  1. arXiv:2409.04387  [pdf, other]

    stat.CO cs.CR stat.AP

    Best Linear Unbiased Estimate from Privatized Histograms

    Authors: Jordan Awan, Adam Edwards, Paul Bartholomew, Andrew Sillers

    Abstract: In differential privacy (DP) mechanisms, it can be beneficial to release "redundant" outputs, in the sense that a quantity can be estimated by combining different combinations of privatized values. Indeed, this structure is present in the DP 2020 Decennial Census products published by the U.S. Census Bureau. With this structure, the DP output can be improved by enforcing self-consistency (i.e., es…

    Submitted 6 September, 2024; originally announced September 2024.

    Comments: 21 pages before references and appendices, 35 pages total, 2 figures and 6 tables

    MSC Class: 62-08; 62P25; 68P27

  2. arXiv:2408.14441  [pdf, other]

    cs.CV cs.AI

    Attend-Fusion: Efficient Audio-Visual Fusion for Video Classification

    Authors: Mahrukh Awan, Asmar Nadeem, Muhammad Junaid Awan, Armin Mustafa, Syed Sameed Husain

    Abstract: Exploiting both audio and visual modalities for video classification is a challenging task, as the existing methods require large model architectures, leading to high computational complexity and resource requirements. Smaller architectures, on the other hand, struggle to achieve optimal performance. In this paper, we propose Attend-Fusion, an audio-visual (AV) fusion approach that introduces a co…

    Submitted 26 August, 2024; originally announced August 2024.

  3. arXiv:2406.06231  [pdf, other]

    math.ST cs.CR stat.CO

    Statistical Inference for Privatized Data with Unknown Sample Size

    Authors: Jordan Awan, Andres Felipe Barrientos, Nianqiao Ju

    Abstract: We develop both theory and algorithms to analyze privatized data in the unbounded differential privacy (DP) setting, where even the sample size is considered a sensitive quantity that requires privacy protection. We show that the distance between the sampling distributions under unbounded DP and bounded DP goes to zero as the sample size $n$ goes to infinity, provided that the noise used to privatize $n$ i…

    Submitted 10 June, 2024; originally announced June 2024.

    Comments: 20 pages before references, 40 pages in total, 4 figures, 3 tables

  4. arXiv:2308.08343  [pdf, other]

    cs.CR math.PR math.ST

    Optimizing Noise for $f$-Differential Privacy via Anti-Concentration and Stochastic Dominance

    Authors: Jordan Awan, Aishwarya Ramasethu

    Abstract: In this paper, we establish anti-concentration inequalities for additive noise mechanisms which achieve $f$-differential privacy ($f$-DP), a notion of privacy phrased in terms of a tradeoff function $f$ which limits the ability of an adversary to determine which individuals were in the database. We show that canonical noise distributions (CNDs), proposed by Awan and Vadhan (2023), match the anti-c…

    Submitted 6 September, 2024; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: 10 pages before appendix, 31 pages total, 6 figures

    MSC Class: 68P27; 60E15

  5. arXiv:2305.03609  [pdf, other]

    stat.ML cs.CG cs.CR cs.LG math.AT

    Differentially Private Topological Data Analysis

    Authors: Taegyu Kang, Sehwan Kim, Jinwon Sohn, Jordan Awan

    Abstract: This paper is the first to attempt differentially private (DP) topological data analysis (TDA), producing near-optimal private persistence diagrams. We analyze the sensitivity of persistence diagrams in terms of the bottleneck distance, and we show that the commonly used Čech complex has sensitivity that does not decrease as the sample size $n$ increases. This makes it challenging for the persiste…

    Submitted 3 November, 2023; v1 submitted 5 May, 2023; originally announced May 2023.

    Comments: 23 pages before references and appendices, 42 pages total, 8 figures

  6. arXiv:2303.05328  [pdf, other]

    math.ST cs.CR stat.ME

    Simulation-based, Finite-sample Inference for Privatized Data

    Authors: Jordan Awan, Zhanyu Wang

    Abstract: Privacy protection methods, such as differentially private mechanisms, introduce noise into resulting statistics which often produces complex and intractable sampling distributions. In this paper, we propose a simulation-based "repro sample" approach to produce statistically valid confidence intervals and hypothesis tests, which builds on the work of Xie and Wang (2022). We show that this methodol…

    Submitted 2 March, 2024; v1 submitted 9 March, 2023; originally announced March 2023.

    Comments: 25 pages before references and appendices, 42 pages total, 10 figures, 9 tables

  7. arXiv:2210.06140  [pdf, other]

    stat.ML cs.CR cs.DS cs.LG

    Differentially Private Bootstrap: New Privacy Analysis and Inference Strategies

    Authors: Zhanyu Wang, Guang Cheng, Jordan Awan

    Abstract: Differentially private (DP) mechanisms protect individual-level information by introducing randomness into the statistical analysis procedure. Despite the availability of numerous DP tools, there remains a lack of general techniques for conducting statistical inference under DP. We examine a DP bootstrap procedure that releases multiple private bootstrap estimates to infer the sampling distributio…

    Submitted 21 April, 2023; v1 submitted 12 October, 2022; originally announced October 2022.

  8. arXiv:2208.06236  [pdf, other]

    stat.ME cs.CR

    Differentially Private Kolmogorov-Smirnov-Type Tests

    Authors: Jordan Awan, Yue Wang

    Abstract: Hypothesis testing is a central problem in statistical analysis, and there is currently a lack of differentially private tests which are both statistically valid and powerful. In this paper, we develop several new differentially private (DP) nonparametric hypothesis tests. Our tests are based on Kolmogorov-Smirnov, Kuiper, Cramér-von Mises, and Wasserstein test statistics, which can all be express…

    Submitted 30 October, 2023; v1 submitted 12 August, 2022; originally announced August 2022.

    Comments: 19 pages before appendix and references. 3 Figures

  9. arXiv:2206.04572  [pdf, other]

    cs.CR math.ST

    Log-Concave and Multivariate Canonical Noise Distributions for Differential Privacy

    Authors: Jordan Awan, Jinshuo Dong

    Abstract: A canonical noise distribution (CND) is an additive mechanism designed to satisfy $f$-differential privacy ($f$-DP), without any wasted privacy budget. $f$-DP is a hypothesis testing-based formulation of privacy phrased in terms of tradeoff functions, which captures the difficulty of a hypothesis test. In this paper, we consider the existence and construction of both log-concave CNDs and multivari…

    Submitted 5 October, 2022; v1 submitted 9 June, 2022; originally announced June 2022.

    Comments: 10 pages before references, 1 Figure

  10. arXiv:2112.12836  [pdf, other]

    cs.AR cs.DC

    Towards Hardware Support for FPGA Resource Elasticity

    Authors: Ahsan Javed Awan, Fidan Aliyeva

    Abstract: FPGAs are increasingly being deployed in the cloud to accelerate diverse applications. They are to be shared among multiple tenants to improve the total cost of ownership. Partial reconfiguration technology enables multi-tenancy on an FPGA by partitioning it into regions, each hosting a specific application's accelerator. However, a region's size cannot be changed once it is defined, resulting…

    Submitted 4 July, 2022; v1 submitted 23 December, 2021; originally announced December 2021.

    Comments: Preprint of paper presented at Euromicro Conference on Digital System Design (DSD'22)

  11. arXiv:2108.04303  [pdf, other]

    cs.CR math.ST

    Canonical Noise Distributions and Private Hypothesis Tests

    Authors: Jordan Awan, Salil Vadhan

    Abstract: $f$-DP has recently been proposed as a generalization of differential privacy allowing a lossless analysis of composition, post-processing, and privacy amplification via subsampling. In the setting of $f$-DP, we propose the concept of a canonical noise distribution (CND), the first mechanism designed for an arbitrary $f…

    Submitted 13 January, 2023; v1 submitted 9 August, 2021; originally announced August 2021.

    Comments: 23 pages + references and appendix. 4 figures

  12. arXiv:2108.00965  [pdf, other]

    cs.CR cs.CY stat.CO

    Privacy-Aware Rejection Sampling

    Authors: Jordan Awan, Vinayak Rao

    Abstract: Differential privacy (DP) offers strong theoretical privacy guarantees, but implementations of DP mechanisms may be vulnerable to side-channel attacks, such as timing attacks. When sampling methods such as MCMC or rejection sampling are used to implement a mechanism, the runtime can leak private information. We characterize the additional privacy cost due to the runtime of a rejection sampler in t…

    Submitted 29 September, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

    Comments: 25 pages + references, 4 figures

  13. arXiv:2106.15284  [pdf, other]

    cs.AR cs.PF

    NMPO: Near-Memory Computing Profiling and Offloading

    Authors: Stefano Corda, Madhurya Kumaraswamy, Ahsan Javed Awan, Roel Jordans, Akash Kumar, Henk Corporaal

    Abstract: Real-world applications are now processing big-data sets, often bottlenecked by the data movement between the compute units and the main memory. Near-memory computing (NMC), a modern data-centric computational paradigm, can alleviate these bottlenecks, thereby improving the performance of applications. The lack of NMC system availability makes simulators the primary evaluation tool for performance…

    Submitted 29 June, 2021; originally announced June 2021.

    Comments: Euromicro Conference on Digital System Design 2021

  14. arXiv:2006.02397  [pdf, other]

    math.ST cs.CR stat.CO

    One Step to Efficient Synthetic Data

    Authors: Jordan Awan, Zhanrui Cai

    Abstract: A common approach to synthetic data is to sample from a fitted model. We show that under general assumptions, this approach results in a sample with inefficient estimators and whose joint distribution is inconsistent with the true distribution. Motivated by this, we propose a general method of producing synthetic data, which is widely applicable for parametric models, has asymptotically efficient…

    Submitted 26 July, 2024; v1 submitted 3 June, 2020; originally announced June 2020.

    Comments: 30 pages before references and appendices

  15. arXiv:2005.04098  [pdf, other]

    cs.DC

    Near Memory Acceleration on High Resolution Radio Astronomy Imaging

    Authors: Stefano Corda, Bram Veenboer, Ahsan Javed Awan, Akash Kumar, Roel Jordans, Henk Corporaal

    Abstract: Modern radio telescopes like the Square Kilometer Array (SKA) will need to process exabytes of radio-astronomical signals in real time to construct a high-resolution map of the sky. Near-Memory Computing (NMC) could alleviate the performance bottlenecks due to frequent memory accesses in a state-of-the-art radio-astronomy imaging algorithm. In this paper, we show that a sub-module performing a two…

    Submitted 4 May, 2020; originally announced May 2020.

  16. arXiv:1909.12183  [pdf, other]

    cs.ET quant-ph

    K-Means Clustering on Noisy Intermediate Scale Quantum Computers

    Authors: Sumsam Ullah Khan, Ahsan Javed Awan, Gemma Vall-Llosera

    Abstract: Real-time clustering of big performance data generated by telecommunication networks requires domain-specific high-performance compute infrastructure to detect anomalies. In this paper, we evaluate noisy intermediate-scale quantum (NISQ) computers, characterized by low decoherence times, for K-means clustering and propose three strategies to generate shorter-depth quantum circuits needed to ove…

    Submitted 26 September, 2019; originally announced September 2019.

  17. arXiv:1909.11988  [pdf, other]

    cs.ET cs.LG quant-ph

    Support Vector Machines on Noisy Intermediate Scale Quantum Computers

    Authors: Jiaying Yang, Ahsan Javed Awan, Gemma Vall-Llosera

    Abstract: Support vector machine algorithms are considered essential for the implementation of automation in a radio access network. Specifically, they are critical in the prediction of the quality of user experience for video streaming based on device and network-level metrics. Quantum SVM is the quantum analogue of the classical SVM algorithm, which utilizes the properties of quantum computers to speed up…

    Submitted 26 September, 2019; originally announced September 2019.

  18. arXiv:1908.02640  [pdf, other]

    cs.AR cs.DC cs.PF

    Near-Memory Computing: Past, Present, and Future

    Authors: Gagandeep Singh, Lorenzo Chelini, Stefano Corda, Ahsan Javed Awan, Sander Stuijk, Roel Jordans, Henk Corporaal, Albert-Jan Boonstra

    Abstract: The conventional approach of moving data to the CPU for computation has become a significant performance bottleneck for emerging scale-out data-intensive applications due to their limited data reuse. At the same time, the advancement in 3D integration technologies has made the decade-old concept of coupling compute units close to the memory --- called near-memory computing (NMC) --- more viable. P…

    Submitted 7 August, 2019; originally announced August 2019.

    Comments: Preprint

  19. arXiv:1906.10037  [pdf, ps, other]

    cs.PF cs.ET

    Platform Independent Software Analysis for Near Memory Computing

    Authors: Stefano Corda, Gagandeep Singh, Ahsan Javed Awan, Roel Jordans, Henk Corporaal

    Abstract: Near-memory Computing (NMC) promises improved performance for applications that can exploit the features of emerging memory technologies such as 3D-stacked memory. However, it is not trivial to find such applications, and specialized tools are needed to identify them. In this paper, we present PISA-NMC, which extends a state-of-the-art hardware agnostic profiling tool with metrics concerning me…

    Submitted 24 June, 2019; originally announced June 2019.

    Journal ref: Euromicro Conference on Digital System Design (DSD) 2019

  20. arXiv:1905.09436  [pdf, other]

    cs.CR stat.ML

    KNG: The K-Norm Gradient Mechanism

    Authors: Matthew Reimherr, Jordan Awan

    Abstract: This paper presents a new mechanism for producing sanitized statistical summaries that achieve differential privacy, called the K-Norm Gradient Mechanism, or KNG. This new approach maintains the strong flexibility of the exponential mechanism, while achieving the powerful utility performance of objective perturbation. KNG starts with an inherent objective function (often an empirical…

    Submitted 2 August, 2021; v1 submitted 22 May, 2019; originally announced May 2019.

    Comments: 14 pages, 2 figures, published in NeurIPS 33

  21. arXiv:1905.09420  [pdf, ps, other]

    cs.CR math.ST

    Elliptical Perturbations for Differential Privacy

    Authors: Matthew Reimherr, Jordan Awan

    Abstract: We study elliptical distributions in locally convex vector spaces, and determine conditions when they can or cannot be used to satisfy differential privacy (DP). A requisite condition for a sanitized statistical summary to satisfy DP is that the corresponding privacy mechanism must induce equivalent measures for all possible input databases. We show that elliptical distributions with the same disp…

    Submitted 5 May, 2021; v1 submitted 22 May, 2019; originally announced May 2019.

    Comments: 13 pages. Published in NeurIPS 2019 (https://proceedings.neurips.cc/paper/2019/hash/b3dd760eb02d2e669c604f6b2f1e803f-Abstract.html). This arXiv version corrects a few minor errors in the published version

    Journal ref: NeurIPS 32 (2019)

  22. arXiv:1904.08762  [pdf, other]

    cs.DC cs.AR cs.PF

    Memory and Parallelism Analysis Using a Platform-Independent Approach

    Authors: Stefano Corda, Gagandeep Singh, Ahsan Javed Awan, Roel Jordans, Henk Corporaal

    Abstract: Emerging computing architectures such as near-memory computing (NMC) promise improved performance for applications by reducing the data movement between CPU and memory. However, detecting such applications is not a trivial task. In this ongoing work, we extend the state-of-the-art platform-independent software analysis tool with NMC related metrics such as memory entropy, spatial locality, data-le…

    Submitted 18 April, 2019; originally announced April 2019.

    Comments: 22nd ACM International Workshop on Software and Compilers for Embedded Systems (SCOPES '19), May 2019

  23. arXiv:1904.00459  [pdf, other]

    math.ST cs.CR

    Differentially Private Inference for Binomial Data

    Authors: Jordan Awan, Aleksandra Slavkovic

    Abstract: We derive uniformly most powerful (UMP) tests for simple and one-sided hypotheses for a population proportion within the framework of Differential Privacy (DP), optimizing finite sample performance. We show that in general, DP hypothesis tests can be written in terms of linear constraints, and for exchangeable data can always be expressed as a function of the empirical distribution. Using this str…

    Submitted 31 March, 2019; originally announced April 2019.

    Comments: 25 pages before references; 39 pages total. 8 figures. arXiv admin note: text overlap with arXiv:1805.09236

  24. arXiv:1901.10864  [pdf, other]

    cs.CR cs.LG stat.ML

    Benefits and Pitfalls of the Exponential Mechanism with Applications to Hilbert Spaces and Functional PCA

    Authors: Jordan Awan, Ana Kenney, Matthew Reimherr, Aleksandra Slavković

    Abstract: The exponential mechanism is a fundamental tool of Differential Privacy (DP) due to its strong privacy guarantees and flexibility. We study its extension to settings with summaries based on infinite dimensional outputs such as with functional data analysis, shape analysis, and nonparametric statistics. We show that one can design the mechanism with respect to a specific base measure over the outpu…

    Submitted 30 January, 2019; originally announced January 2019.

    Comments: 13 pages, 5 images, 2 tables

    MSC Class: 46E22; 46S50; 60G15; 62H25

  25. arXiv:1707.09323  [pdf, other]

    cs.DC

    Identifying the potential of Near Data Computing for Apache Spark

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. There is also a renewed interest in Near Data Computing (NDC) due to technological advancement in the last decade. However, it is not known if NDC…

    Submitted 8 May, 2017; originally announced July 2017.

    Comments: position paper

  26. arXiv:1604.08484  [pdf, other]

    cs.DC cs.AR cs.PF

    Architectural Impact on Performance of In-memory Data Analytics: Apache Spark Case Study

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: While cluster computing frameworks are continuously evolving to provide real-time data analysis capabilities, Apache Spark has managed to be at the forefront of big data analytics for being a unified framework for both batch and stream data processing. However, recent studies on micro-architectural characterization of in-memory data analytics are limited to only batch processing workloads. We com…

    Submitted 28 April, 2016; originally announced April 2016.

  27. arXiv:1507.08340  [pdf, other]

    cs.DC cs.AR cs.PF

    How Data Volume Affects Spark Based Data Analytics on a Scale-up Server

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: The sheer increase in the volume of data over the last decade has triggered research in cluster computing frameworks that enable web enterprises to extract big insights from big data. While Apache Spark is gaining popularity for exhibiting superior scale-out performance on commodity machines, the impact of data volume on the performance of Spark based data analytics in scale-up configuration is not we…

    Submitted 29 July, 2015; originally announced July 2015.

    Comments: accepted to 6th International Workshop on Big Data Benchmarks, Performance Optimization and Emerging Hardware (BpoE-6) held in conjunction with VLDB 2015. arXiv admin note: text overlap with arXiv:1506.07742

  28. Performance Characterization of In-Memory Data Analytics on a Modern Cloud Server

    Authors: Ahsan Javed Awan, Mats Brorsson, Vladimir Vlassov, Eduard Ayguade

    Abstract: In the last decade, data analytics has rapidly progressed from traditional disk-based processing to modern in-memory processing. However, little effort has been devoted to enhancing performance at the micro-architecture level. This paper characterizes the performance of in-memory data analytics using the Apache Spark framework. We use a single-node NUMA machine and identify the bottlenecks hampering the scal…

    Submitted 25 June, 2015; originally announced June 2015.

    Comments: Accepted to The 5th IEEE International Conference on Big Data and Cloud Computing (BDCloud 2015)