Showing 1–50 of 63 results for author: Venkatasubramanian, S

  1. arXiv:2407.08689  [pdf, ps, other]

    cs.AI cs.CY cs.LG

    Operationalizing the Blueprint for an AI Bill of Rights: Recommendations for Practitioners, Researchers, and Policy Makers

    Authors: Alex Oesterling, Usha Bhalla, Suresh Venkatasubramanian, Himabindu Lakkaraju

    Abstract: As Artificial Intelligence (AI) tools are increasingly employed in diverse real-world applications, there has been significant interest in regulating these tools. To this end, several regulatory frameworks have been introduced by different countries worldwide. For example, the European Union recently passed the AI Act, the White House issued an Executive Order on safe, secure, and trustworthy AI,…

    Submitted 11 July, 2024; originally announced July 2024.

    Comments: 15 pages

  2. arXiv:2406.09638  [pdf, other]

    cs.LG eess.SP

    RASPNet: A Benchmark Dataset for Radar Adaptive Signal Processing Applications

    Authors: Shyam Venkatasubramanian, Bosung Kang, Ali Pezeshki, Muralidhar Rangaswamy, Vahid Tarokh

    Abstract: This work presents a large-scale dataset for radar adaptive signal processing (RASP) applications, aimed at supporting the development of data-driven models within the radar community. The dataset, called RASPNet, consists of 100 realistic scenarios compiled over a variety of topographies and land types from across the contiguous United States, designed to reflect a diverse array of real-world env…

    Submitted 13 June, 2024; originally announced June 2024.

  3. arXiv:2406.07556  [pdf]

    cs.CY

    Community Driven Approaches to Research in Technology & Society CCC Workshop Report

    Authors: Suresh Venkatasubramanian, Timnit Gebru, Ufuk Topcu, Haley Griffin, Leah Namisa Rosenbloom, Nasim Sonboli

    Abstract: Based on our workshop activities, we outlined three ways in which research can support community needs: (1) Mapping the ecosystem of both the players and ecosystem and harm landscapes, (2) Counter-Programming, which entails using the same surveillance tools that communities are subjected to in order to observe the entities doing the surveilling, effectively protecting people from surveillance, and conducting…

    Submitted 21 March, 2024; originally announced June 2024.

  4. arXiv:2402.18803  [pdf, other]

    cs.LG cs.CY

    To Pool or Not To Pool: Analyzing the Regularizing Effects of Group-Fair Training on Shared Models

    Authors: Cyrus Cousins, I. Elizabeth Kumar, Suresh Venkatasubramanian

    Abstract: In fair machine learning, one source of performance disparities between groups is over-fitting to groups with relatively few training samples. We derive group-specific bounds on the generalization error of welfare-centric fair machine learning that benefit from the larger sample size of the majority group. We do this by considering group-specific Rademacher averages over a restricted hypothesis cl…

    Submitted 28 February, 2024; originally announced February 2024.

  5. arXiv:2402.06609  [pdf, other]

    cs.CY cs.CR

    You Still See Me: How Data Protection Supports the Architecture of ML Surveillance

    Authors: Rui-Jie Yew, Lucy Qin, Suresh Venkatasubramanian

    Abstract: Human data forms the backbone of machine learning. Data protection laws thus have strong bearing on how ML systems are governed. Given that most requirements in data protection laws accompany the processing of personal data, organizations have an incentive to keep their data out of legal scope. This makes the development and application of certain privacy-preserving techniques--data protection tec…

    Submitted 18 February, 2024; v1 submitted 9 February, 2024; originally announced February 2024.

    Comments: A version of this work was accepted at the 2023 NeurIPS Workshop on Regulatable ML

  6. arXiv:2401.11176  [pdf, other]

    eess.SP cs.LG

    Data-Driven Target Localization: Benchmarking Gradient Descent Using the Cramer-Rao Bound

    Authors: Shyam Venkatasubramanian, Sandeep Gogineni, Bosung Kang, Muralidhar Rangaswamy

    Abstract: In modern radar systems, precise target localization using azimuth and velocity estimation is paramount. Traditional unbiased estimation methods have utilized gradient descent algorithms to reach the theoretical limits of the Cramer-Rao Bound (CRB) for the error of the parameter estimates. As an extension, we demonstrate on a realistic simulated example scenario that our earlier presented data-dri…

    Submitted 22 April, 2024; v1 submitted 20 January, 2024; originally announced January 2024.

  7. arXiv:2311.12356  [pdf, other]

    cs.LG

    Random Linear Projections Loss for Hyperplane-Based Optimization in Neural Networks

    Authors: Shyam Venkatasubramanian, Ahmed Aloui, Vahid Tarokh

    Abstract: Advancing loss function design is pivotal for optimizing neural network training and performance. This work introduces Random Linear Projections (RLP) loss, a novel approach that enhances training efficiency by leveraging geometric relationships within the data. Distinct from traditional loss functions that target minimizing pointwise errors, RLP loss operates by minimizing the distance between se…

    Submitted 30 May, 2024; v1 submitted 21 November, 2023; originally announced November 2023.

  8. arXiv:2305.18159  [pdf, other]

    cs.CY

    The Misuse of AUC: What High Impact Risk Assessment Gets Wrong

    Authors: Kweku Kwegyir-Aggrey, Marissa Gerchick, Malika Mohan, Aaron Horowitz, Suresh Venkatasubramanian

    Abstract: When determining which machine learning model best performs some high impact risk assessment task, practitioners commonly use the Area under the Curve (AUC) to defend and validate their model choices. In this paper, we argue that the current use and understanding of AUC as a model performance metric misunderstands the way the metric was intended to be used. To this end, we characterize the misuse…

    Submitted 29 May, 2023; originally announced May 2023.

  9. arXiv:2303.08241  [pdf, other]

    cs.CV eess.SP

    Subspace Perturbation Analysis for Data-Driven Radar Target Localization

    Authors: Shyam Venkatasubramanian, Sandeep Gogineni, Bosung Kang, Ali Pezeshki, Muralidhar Rangaswamy, Vahid Tarokh

    Abstract: Recent works exploring data-driven approaches to classical problems in adaptive radar have demonstrated promising results pertaining to the task of radar target localization. Via the use of space-time adaptive processing (STAP) techniques and convolutional neural networks, these data-driven approaches to target localization have helped benchmark the performance of neural networks for matched scena…

    Submitted 21 March, 2023; v1 submitted 14 March, 2023; originally announced March 2023.

    Comments: 6 pages, 3 figures. Submitted to 2023 IEEE Radar Conference (RadarConf). Extension of arXiv:2209.02890

  10. arXiv:2209.07616  [pdf, other]

    cs.SI

    Reducing Access Disparities in Networks using Edge Augmentation

    Authors: Ashkan Bashardoust, Sorelle A. Friedler, Carlos E. Scheidegger, Blair D. Sullivan, Suresh Venkatasubramanian

    Abstract: In social networks, a node's position is a form of social capital. Better-positioned members not only benefit from (faster) access to diverse information, but innately have more potential influence on information spread. Structural biases often arise from network formation, and can lead to significant disparities in information access based on position. Further, processes such as link recomme…

    Submitted 15 September, 2022; originally announced September 2022.

  11. arXiv:2209.02890  [pdf, other]

    cs.CV eess.SP

    Data-Driven Target Localization Using Adaptive Radar Processing and Convolutional Neural Networks

    Authors: Shyam Venkatasubramanian, Sandeep Gogineni, Bosung Kang, Ali Pezeshki, Muralidhar Rangaswamy, Vahid Tarokh

    Abstract: Leveraging the advanced functionalities of modern radio frequency (RF) modeling and simulation tools, specifically designed for adaptive radar processing applications, this paper presents a data-driven approach to improve accuracy in radar target localization post adaptive radar detection. To this end, we generate a large number of radar returns by randomly placing targets of variable strengths in…

    Submitted 9 July, 2024; v1 submitted 6 September, 2022; originally announced September 2022.

  12. arXiv:2205.14867  [pdf, other]

    cs.CY cs.LG

    Measuring and mitigating voting access disparities: a study of race and polling locations in Florida and North Carolina

    Authors: Mohsen Abbasi, Suresh Venkatasubramanian, Sorelle A. Friedler, Kristian Lum, Calvin Barrett

    Abstract: Voter suppression and associated racial disparities in access to voting are long-standing civil rights concerns in the United States. Barriers to voting have taken many forms over the decades. A history of violent explicit discouragement has shifted to more subtle access limitations that can include long lines and wait times, long travel times to reach a polling station, and other logistical barri…

    Submitted 30 May, 2022; originally announced May 2022.

  13. arXiv:2203.07490  [pdf, other]

    cs.LG cs.CY

    Repairing Regressors for Fair Binary Classification at Any Decision Threshold

    Authors: Kweku Kwegyir-Aggrey, A. Feder Cooper, Jessica Dai, John Dickerson, Keegan Hines, Suresh Venkatasubramanian

    Abstract: We study the problem of post-processing a supervised machine-learned regressor to maximize fair binary classification at all decision thresholds. By decreasing the statistical distance between each group's score distributions, we show that we can increase fair performance across all thresholds at once, and that we can do so without a large decrease in accuracy. To this end, we introduce a formal m…

    Submitted 10 December, 2023; v1 submitted 14 March, 2022; originally announced March 2022.

  14. arXiv:2201.10712  [pdf, other]

    cs.CV eess.SP

    Toward Data-Driven STAP Radar

    Authors: Shyam Venkatasubramanian, Chayut Wongkamthong, Mohammadreza Soltani, Bosung Kang, Sandeep Gogineni, Ali Pezeshki, Muralidhar Rangaswamy, Vahid Tarokh

    Abstract: Using an amalgamation of techniques from classical radar, computer vision, and deep learning, we characterize our ongoing data-driven approach to space-time adaptive processing (STAP) radar. We generate a rich example dataset of received radar signals by randomly placing targets of variable strengths in a predetermined region using RFView, a site-specific radio frequency modeling and simulation to…

    Submitted 9 March, 2022; v1 submitted 25 January, 2022; originally announced January 2022.

    Comments: 5 pages, 4 figures. Submitted to 2022 IEEE Radar Conference (RadarConf)

  15. arXiv:2106.05498  [pdf, ps, other]

    cs.CY

    It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks

    Authors: Michelle Bao, Angela Zhou, Samantha Zottola, Brian Brubach, Sarah Desmarais, Aaron Horowitz, Kristian Lum, Suresh Venkatasubramanian

    Abstract: Risk assessment instrument (RAI) datasets, particularly ProPublica's COMPAS dataset, are commonly used in algorithmic fairness papers due to benchmarking practices of comparing algorithms on datasets used in prior work. In many cases, this data is used as a benchmark to demonstrate good performance without accounting for the complexities of criminal justice (CJ) processes. However, we show that pr…

    Submitted 28 April, 2022; v1 submitted 10 June, 2021; originally announced June 2021.

    Comments: NeurIPS 2021 Datasets and Benchmarks

  16. arXiv:2104.12037  [pdf, other]

    cs.AI cs.CY

    Precarity: Modeling the Long Term Effects of Compounded Decisions on Individual Instability

    Authors: Pegah Nokhiz, Aravinda Kanchana Ruwanpathirana, Neal Patwari, Suresh Venkatasubramanian

    Abstract: When it comes to studying the impacts of decision making, the research has been largely focused on examining the fairness of the decisions, the long-term effects of the decision pipelines, and utility-based perspectives considering both the decision-maker and the individuals. However, there has hardly been any focus on precarity, which is the term that encapsulates the instability in people's lives…

    Submitted 24 April, 2021; originally announced April 2021.

    Comments: To appear at AIES 2021

  17. arXiv:2101.09962  [pdf, other]

    cs.IT

    GRADE-AO: Towards Near-Optimal Spatially-Coupled Codes With High Memories

    Authors: Siyi Yang, Ahmed Hareedy, Shyam Venkatasubramanian, Robert Calderbank, Lara Dolecek

    Abstract: Spatially-coupled (SC) codes, known for their threshold saturation phenomenon and low-latency windowed decoding algorithms, are ideal for streaming applications. They also find application in various data storage systems because of their excellent performance. SC codes are constructed by partitioning an underlying block code, followed by rearranging and concatenating the partitioned components in…

    Submitted 25 January, 2021; originally announced January 2021.

    Comments: 8 pages, 10 figures, 1 table; the shortened version has been submitted to ISIT 2021

  18. arXiv:2101.01264  [pdf]

    cs.CY

    A Research Ecosystem for Secure Computing

    Authors: Nadya Bliss, Lawrence A. Gordon, Daniel Lopresti, Fred Schneider, Suresh Venkatasubramanian

    Abstract: Computing devices are vital to all areas of modern life and permeate every aspect of our society. The ubiquity of computing and our reliance on it has been accelerated and amplified by the COVID-19 pandemic. From education to work environments to healthcare to defense to entertainment - it is hard to imagine a segment of modern life that is not touched by computing. The security of computers, syst…

    Submitted 4 January, 2021; originally announced January 2021.

    Comments: A Computing Community Consortium (CCC) white paper, 5 pages

    Report number: ccc2020whitepaper_13

  19. arXiv:2012.06057  [pdf]

    cs.CY cs.AI

    Interdisciplinary Approaches to Understanding Artificial Intelligence's Impact on Society

    Authors: Suresh Venkatasubramanian, Nadya Bliss, Helen Nissenbaum, Melanie Moses

    Abstract: Innovations in AI have focused primarily on the questions of "what" and "how" (algorithms for finding patterns in web searches, for instance) without adequate attention to the possible harms (such as privacy, bias, or manipulation) and without adequate consideration of the societal context in which these systems operate. In part, this is driven by incentives and forces in the tech industry, where a…

    Submitted 10 December, 2020; originally announced December 2020.

    Comments: A Computing Community Consortium (CCC) white paper, 5 pages

    Report number: ccc2020whitepaper_5

  20. arXiv:2010.12611  [pdf, other]

    cs.SI

    Information access representations and social capital in networks

    Authors: Ashkan Bashardoust, Hannah C. Beilinson, Sorelle A. Friedler, Jiajie Ma, Jade Rousseau, Carlos E. Scheidegger, Blair D. Sullivan, Nasanbayar Ulzii-Orshikh, Suresh Venkatasubramanian

    Abstract: Social network position confers power and social capital. In the setting of online social networks that have massive reach, creating mathematical representations of social capital is an important step towards understanding how network position can differentially confer advantage to different groups and how network position can itself be a source of advantage. In this paper, we use well established…

    Submitted 16 October, 2023; v1 submitted 23 October, 2020; originally announced October 2020.

  21. arXiv:2007.01242  [pdf]

    cs.CY

    Evolving Methods for Evaluating and Disseminating Computing Research

    Authors: Benjamin Zorn, Tom Conte, Keith Marzullo, Suresh Venkatasubramanian

    Abstract: Social and technical trends have significantly changed methods for evaluating and disseminating computing research. Traditional venues for reviewing and publishing, such as conferences and journals, worked effectively in the past. Recently, trends have created new opportunities but also put new pressures on the process of review and dissemination. For example, many conferences have seen large incr…

    Submitted 2 July, 2020; originally announced July 2020.

    Comments: A Computing Community Consortium (CCC) white paper, 12 pages

    Report number: ccc2020whitepaper_2

  22. arXiv:2006.11009  [pdf, other]

    cs.LG cs.DS stat.ML

    Fair clustering via equitable group representations

    Authors: Mohsen Abbasi, Aditya Bhaskara, Suresh Venkatasubramanian

    Abstract: What does it mean for a clustering to be fair? One popular approach seeks to ensure that each cluster contains groups in (roughly) the same proportion in which they exist in the population. The normative principle at play is balance: any cluster might act as a representative of the data, and thus should reflect its diversity. But clustering also captures a different form of representativeness. A…

    Submitted 27 January, 2021; v1 submitted 19 June, 2020; originally announced June 2020.

    Comments: 11 pages, 5 figures, ACM Conference on Fairness, Accountability, and Transparency (ACM FAccT)

  23. arXiv:2002.11097  [pdf, other]

    cs.AI cs.LG stat.ML

    Problems with Shapley-value-based explanations as feature importance measures

    Authors: I. Elizabeth Kumar, Suresh Venkatasubramanian, Carlos Scheidegger, Sorelle Friedler

    Abstract: Game-theoretic formulations of feature importance have become popular as a way to "explain" machine learning models. These methods define a cooperative game between the features of a model and distribute influence among these input elements using some form of the game's unique Shapley values. Justification for these methods rests on two pillars: their desirable mathematical properties, and their a…

    Submitted 30 June, 2020; v1 submitted 25 February, 2020; originally announced February 2020.

    Comments: Accepted to ICML 2020

  24. arXiv:1909.03166  [pdf, other]

    cs.LG cs.AI cs.CY stat.ML

    Equalizing Recourse across Groups

    Authors: Vivek Gupta, Pegah Nokhiz, Chitradeep Dutta Roy, Suresh Venkatasubramanian

    Abstract: The rise in machine learning-assisted decision-making has led to concerns about the fairness of the decisions and techniques to mitigate problems of discrimination. If a negative decision is made about an individual (denying a loan, rejecting an application for housing, and so on) justice dictates that we be able to ask how we might change circumstances to get a favorable decision the next time. M…

    Submitted 6 September, 2019; originally announced September 2019.

    Comments: 13 pages, 4 figures, 2 tables

  25. arXiv:1906.08652  [pdf, other]

    cs.LG stat.ML

    Disentangling Influence: Using Disentangled Representations to Audit Model Predictions

    Authors: Charles T. Marx, Richard Lanas Phillips, Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian

    Abstract: Motivated by the need to audit complex and black box models, there has been extensive research on quantifying how data features influence model predictions. Feature influence can be direct (a direct influence on model outcomes) and indirect (model outcomes are influenced via proxy features). Feature influence can also be expressed in aggregate over the training or test data or locally with respect…

    Submitted 20 June, 2019; originally announced June 2019.

  26. arXiv:1903.02047  [pdf, other]

    cs.SI physics.soc-ph

    Gaps in Information Access in Social Networks

    Authors: Benjamin Fish, Ashkan Bashardoust, danah boyd, Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian

    Abstract: The study of influence maximization in social networks has largely ignored disparate effects these algorithms might have on the individuals contained in the social network. Individuals may place a high value on receiving information, e.g. job openings or advertisements for loans. While well-connected individuals at the center of the network are likely to receive the information that is being distr…

    Submitted 5 March, 2019; originally announced March 2019.

    Comments: Accepted at The Web Conference 2019

  27. arXiv:1901.09565  [pdf, other]

    cs.LG stat.ML

    Fairness in representation: quantifying stereotyping as a representational harm

    Authors: Mohsen Abbasi, Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian

    Abstract: While harms of allocation have been increasingly studied as part of the subfield of algorithmic fairness, harms of representation have received considerably less attention. In this paper, we formalize two notions of stereotyping and show how they manifest in later allocative harms within the machine learning pipeline. We also propose mitigation strategies and demonstrate their effectiveness on syn…

    Submitted 28 January, 2019; originally announced January 2019.

    Comments: 9 pages, 6 figures, SIAM International Conference on Data Mining

  28. arXiv:1802.06992  [pdf, ps, other]

    cs.DS

    Sublinear Algorithms for MAXCUT and Correlation Clustering

    Authors: Aditya Bhaskara, Samira Daruki, Suresh Venkatasubramanian

    Abstract: We study sublinear algorithms for two fundamental graph problems, MAXCUT and correlation clustering. Our focus is on constructing core-sets as well as developing streaming algorithms for these problems. Constant space algorithms are known for dense graphs for these problems, while $Ω(n)$ lower bounds exist (in the streaming setting) for sparse graphs. Our goal in this paper is to bridge the gap…

    Submitted 20 February, 2018; originally announced February 2018.

    Comments: 29 pages, conference

  29. arXiv:1802.04422  [pdf, other]

    stat.ML cs.CY cs.LG

    A comparative study of fairness-enhancing interventions in machine learning

    Authors: Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian, Sonam Choudhary, Evan P. Hamilton, Derek Roth

    Abstract: Computers are increasingly used to make decisions that have significant impact in people's lives. Often, these predictions can affect different population subgroups disproportionately. As a result, the issue of fairness has received much recent interest, and a number of fairness-enhanced classifiers and predictors have appeared in the literature. This paper seeks to study the following questions:…

    Submitted 12 February, 2018; originally announced February 2018.

  30. arXiv:1707.00391  [pdf, other]

    cs.CY cs.LG stat.ML

    Fair Pipelines

    Authors: Amanda Bower, Sarah N. Kitchen, Laura Niss, Martin J. Strauss, Alexander Vargas, Suresh Venkatasubramanian

    Abstract: This work facilitates ensuring fairness of machine learning in the real world by decoupling fairness considerations in compound decisions. In particular, this work studies how fairness propagates through a compound decision-making process, which we call a pipeline. Prior work in algorithmic fairness only focuses on fairness with respect to one decision. However, many decision-making processes re…

    Submitted 2 July, 2017; originally announced July 2017.

    Comments: Presented as a poster at the 2017 Workshop on Fairness, Accountability, and Transparency in Machine Learning (FAT/ML 2017)

  31. arXiv:1706.09847  [pdf, other]

    cs.CY stat.ML

    Runaway Feedback Loops in Predictive Policing

    Authors: Danielle Ensign, Sorelle A. Friedler, Scott Neville, Carlos Scheidegger, Suresh Venkatasubramanian

    Abstract: Predictive policing systems are increasingly used to determine how to allocate police across a city in order to best prevent crime. Discovered crime data (e.g., arrest counts) are used to help update the model, and the process is repeated. Such systems have been empirically shown to be susceptible to runaway feedback loops, where police are repeatedly sent back to the same neighborhoods regardless…

    Submitted 21 December, 2017; v1 submitted 29 June, 2017; originally announced June 2017.

    Comments: Extended version accepted to the 1st Conference on Fairness, Accountability and Transparency, 2018. Adds further treatment of reported as well as discovered incidents

  32. arXiv:1609.07236  [pdf, other]

    cs.CY stat.ML

    On the (im)possibility of fairness

    Authors: Sorelle A. Friedler, Carlos Scheidegger, Suresh Venkatasubramanian

    Abstract: What does it mean for an algorithm to be fair? Different papers use different notions of algorithmic fairness, and although these appear internally consistent, they also seem mutually incompatible. We present a mathematical setting in which the distinctions in previous papers can be made formal. In addition to characterizing the spaces of inputs (the "observed" space) and outputs (the "decision" s…

    Submitted 23 September, 2016; originally announced September 2016.

  33. arXiv:1603.01374  [pdf, other]

    cs.LG stat.ML

    A Unified View of Localized Kernel Learning

    Authors: John Moeller, Sarathkrishna Swaminathan, Suresh Venkatasubramanian

    Abstract: Multiple Kernel Learning, or MKL, extends (kernelized) SVM by attempting to learn not only a classifier/regressor but also the best kernel for the training task, usually from a combination of existing kernel functions. Most MKL methods seek the combined kernel that performs best over every training example, sacrificing performance in some areas to seek a global optimum. Localized kernel learning (…

    Submitted 4 March, 2016; originally announced March 2016.

    Comments: 14 pages, 2 figures, 4 tables. Reformatted version of the accepted SDM 2016 paper

  34. arXiv:1602.08162  [pdf, other]

    cs.DS

    Streaming Verification of Graph Properties

    Authors: Amirali Abdullah, Samira Daruki, Chitradeep Dutta Roy, Suresh Venkatasubramanian

    Abstract: Streaming interactive proofs (SIPs) are a framework for outsourced computation. A computationally limited streaming client (the verifier) hands over a large data set to an untrusted server (the prover) in the cloud and the two parties run a protocol to confirm the correctness of the result with high probability. SIPs are particularly interesting for problems that are hard to solve (or even approximate…

    Submitted 3 October, 2016; v1 submitted 25 February, 2016; originally announced February 2016.

    Comments: 26 pages, 2 figures, 1 table

  35. arXiv:1602.07043  [pdf, other]

    stat.ML cs.LG

    Auditing Black-box Models for Indirect Influence

    Authors: Philip Adler, Casey Falk, Sorelle A. Friedler, Gabriel Rybeck, Carlos Scheidegger, Brandon Smith, Suresh Venkatasubramanian

    Abstract: Data-trained predictive models see widespread use, but for the most part they are used as black boxes which output a prediction or score. It is therefore hard to acquire a deeper understanding of model behavior, and in particular how different features influence the model prediction. This is important when interpreting the behavior of complex models, or asserting that certain problematic attribute…

    Submitted 30 November, 2016; v1 submitted 22 February, 2016; originally announced February 2016.

    Comments: Final version of paper that appears in the IEEE International Conference on Data Mining (ICDM), 2016

  36. arXiv:1509.05514  [pdf, other]

    cs.DS

    Streaming Verification in Data Analysis

    Authors: Samira Daruki, Justin Thaler, Suresh Venkatasubramanian

    Abstract: Streaming interactive proofs (SIPs) are a framework to reason about outsourced computation, where a data owner (the verifier) outsources a computation to the cloud (the prover), but wishes to verify the correctness of the solution provided by the cloud service. In this paper we present streaming interactive proofs for problems in data analysis. We present protocols for clustering and shape fitting…

    Submitted 18 September, 2015; originally announced September 2015.

  37. arXiv:1504.02462  [pdf, other]

    cs.LG cs.NE stat.ML

    A Group Theoretic Perspective on Unsupervised Deep Learning

    Authors: Arnab Paul, Suresh Venkatasubramanian

    Abstract: Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pretraining: first search for a good generative model for the in…

    Submitted 21 April, 2015; v1 submitted 8 April, 2015; originally announced April 2015.

    Comments: 2-page version of arXiv:1412.6621 (prepared for presentation at the ICLR 2015 workshop, as required by the ICLR PC). This version has some minor formatting changes as required by the conference

  38. arXiv:1503.05225  [pdf, ps, other]

    cs.DS cs.CG cs.IT

    Sketching, Embedding, and Dimensionality Reduction for Information Spaces

    Authors: Amirali Abdullah, Ravi Kumar, Andrew McGregor, Sergei Vassilvitskii, Suresh Venkatasubramanian

    Abstract: Information distances like the Hellinger distance and the Jensen-Shannon divergence have deep roots in information theory and machine learning. They are used extensively in data analysis especially when the objects being compared are high dimensional empirical probability distributions built from data. However, we lack common tools needed to actually use information distances in applications effic…

    Submitted 17 March, 2015; originally announced March 2015.

  39. arXiv:1412.6621  [pdf, other]

    cs.LG cs.NE stat.ML

    Why does Deep Learning work? - A perspective from Group Theory

    Authors: Arnab Paul, Suresh Venkatasubramanian

    Abstract: Why does Deep Learning work? What representations does it capture? How do higher-order representations emerge? We study these questions from the perspective of group theory, thereby opening a new approach towards a theory of Deep learning. One factor behind the recent resurgence of the subject is a key algorithmic step called pre-training: first search for a good generative model for the input s…

    Submitted 28 February, 2015; v1 submitted 20 December, 2014; originally announced December 2014.

    Comments: 13 pages, 5 figures

  40. arXiv:1412.3756  [pdf, other]

    stat.ML cs.CY

    Certifying and removing disparate impact

    Authors: Michael Feldman, Sorelle Friedler, John Moeller, Carlos Scheidegger, Suresh Venkatasubramanian

    Abstract: What does it mean for an algorithm to be biased? In U.S. law, unintentional bias is encoded via disparate impact, which occurs when a selection process has widely different outcomes for different groups, even as it appears to be neutral. This legal determination hinges on a definition of a protected class (ethnicity, gender, religious practice) and an explicit description of the process. When th…

    Submitted 15 July, 2015; v1 submitted 11 December, 2014; originally announced December 2014.

    Comments: Extended version of paper accepted at 2015 ACM SIGKDD Conference on Knowledge Discovery and Data Mining

  41. arXiv:1404.1191  [pdf, other]

    cs.CG cs.CC

    A directed isoperimetric inequality with application to Bregman near neighbor lower bounds

    Authors: Amirali Abdullah, Suresh Venkatasubramanian

    Abstract: Bregman divergences $D_φ$ are a class of divergences parametrized by a convex function $φ$ and include well known distance functions like $\ell_2^2$ and the Kullback-Leibler divergence. There has been extensive research on algorithms for problems like clustering and near neighbor search with respect to Bregman divergences; in all cases, the algorithms depend not just on the data size $n$ and dimen…

    Submitted 16 May, 2015; v1 submitted 4 April, 2014; originally announced April 2014.

    Comments: 27 pages

  42. arXiv:1401.3331  [pdf, other]

    cs.OH

    Advanced Self-interference Cancellation and Multiantenna Techniques for Full-Duplex Radios

    Authors: Dani Korpi, Sathya Venkatasubramanian, Taneli Riihonen, Lauri Anttila, Strasdosky Otewa, Clemens Icheln, Katsuyuki Haneda, Sergei Tretyakov, Mikko Valkama, Risto Wichman

    Abstract: In an in-band full-duplex system, radios transmit and receive simultaneously in the same frequency band at the same time, providing a radical improvement in spectral efficiency over a half-duplex system. However, in order to design such a system, it is necessary to mitigate the self-interference due to simultaneous transmission and reception, which seriously limits the maximum transmit power of th…

    Submitted 14 January, 2014; originally announced January 2014.

    Comments: Presented in 47th Annual Asilomar Conference on Signals, Systems, and Computers, 2013

  43. arXiv:1306.3295  [pdf, other]

    cs.GL cs.DC

    Rethinking Abstractions for Big Data: Why, Where, How, and What

    Authors: Mary Hall, Robert M. Kirby, Feifei Li, Miriah Meyer, Valerio Pascucci, Jeff M. Phillips, Rob Ricci, Jacobus Van der Merwe, Suresh Venkatasubramanian

    Abstract: Big data refers to large and complex data sets that, under existing approaches, exceed the capacity and capability of current compute platforms, systems software, analytical tools and human understanding. Numerous lessons on the scalability of big data can already be found in asymptotic analysis of algorithms and from the high-performance computing (HPC) and applications communities. However, scal…

    Submitted 14 June, 2013; originally announced June 2013.

    Comments: 8 pages, 1 figure

    Report number: UUCS-13-002

  44. arXiv:1305.4757  [pdf, other]

    cs.LG cs.CG

    Power to the Points: Validating Data Memberships in Clusterings

    Authors: Parasaran Raman, Suresh Venkatasubramanian

    Abstract: A clustering is an implicit assignment of labels of points, based on proximity to other points. It is these labels that are then used for downstream analysis (either focusing on individual clusters, or identifying representatives of clusters and so on). Thus, in order to trust a clustering as a first step in exploratory data analysis, we must trust the labels assigned to individual data. Without s…

    Submitted 21 May, 2013; originally announced May 2013.

    Comments: 18 pages, 9 figures, 5 tables

  45. arXiv:1302.4720  [pdf, other]

    cs.NI

    Multiple Target Tracking with RF Sensor Networks

    Authors: Maurizio Bocca, Ossi Kaltiokallio, Neal Patwari, Suresh Venkatasubramanian

    Abstract: RF sensor networks are wireless networks that can localize and track people (or targets) without needing them to carry or wear any electronic device. They use the change in the received signal strength (RSS) of the links due to the movements of people to infer their locations. In this paper, we consider real-time multiple target tracking with RF sensor networks. We perform radio tomographic imagin…

    Submitted 11 February, 2013; originally announced February 2013.

  46. arXiv:1206.5580  [pdf, other]

    cs.LG stat.ML

    A Geometric Algorithm for Scalable Multiple Kernel Learning

    Authors: John Moeller, Parasaran Raman, Avishek Saha, Suresh Venkatasubramanian

    Abstract: We present a geometric formulation of the Multiple Kernel Learning (MKL) problem. To do so, we reinterpret the problem of learning kernel weights as searching for a kernel that maximizes the minimum (kernel) distance between two convex polytopes. This interpretation combined with novel structural insights from our geometric formulation allows us to reduce the MKL problem to a simple optimization r…

    Submitted 15 March, 2014; v1 submitted 25 June, 2012; originally announced June 2012.

    Comments: 20 pages

  47. arXiv:1204.3523  [pdf, ps, other]

    cs.LG stat.ML

    Efficient Protocols for Distributed Classification and Optimization

    Authors: Hal Daume III, Jeff M. Phillips, Avishek Saha, Suresh Venkatasubramanian

    Abstract: In distributed learning, the goal is to perform a learning task over data distributed across multiple nodes with minimal (expensive) communication. Prior work (Daume III et al., 2012) proposes a general model that bounds the communication required for learning classifiers while allowing for $\epsilon$ training error on linearly separable data adversarially distributed across nodes. In this work, we…

    Submitted 16 April, 2012; originally announced April 2012.

  48. arXiv:1202.6078  [pdf, other]

    stat.ML cs.LG

    Protocols for Learning Classifiers on Distributed Data

    Authors: Hal Daume III, Jeff M. Phillips, Avishek Saha, Suresh Venkatasubramanian

    Abstract: We consider the problem of learning classifiers for labeled data that has been distributed across several nodes. Our goal is to find a single classifier, with small approximation error, across all datasets while minimizing the communication between nodes. This setting models real-world communication bottlenecks in the processing of massive distributed datasets. We present several very general samp…

    Submitted 27 February, 2012; originally announced February 2012.

    Comments: 19 pages, 12 figures, accepted at AISTATS 2012

  49. arXiv:1108.0835  [pdf, other]

    cs.CG

    Approximate Bregman near neighbors in sublinear time: Beyond the triangle inequality

    Authors: Amirali Abdullah, John Moeller, Suresh Venkatasubramanian

    Abstract: In this paper we present the first provable approximate nearest-neighbor (ANN) algorithms for Bregman divergences. Our first algorithm processes queries in O(log^d n) time using O(n log^d n) space and only uses general properties of the underlying distance function (which includes Bregman divergences as a special case). The second algorithm processes queries in O(log n) time using O(n) space and e…

    Submitted 15 September, 2013; v1 submitted 3 August, 2011; originally announced August 2011.

    Comments: 42 pages, including appendices and bibliography. Accepted at SOCG 2012; this version updated to remove typos and minor errata

  50. arXiv:1108.0017  [pdf, other]

    cs.LG cs.DB

    Generating a Diverse Set of High-Quality Clusterings

    Authors: Jeff M. Phillips, Parasaran Raman, Suresh Venkatasubramanian

    Abstract: We provide a new framework for generating multiple good quality partitions (clusterings) of a single data set. Our approach decomposes this problem into two components, generating many high-quality partitions, and then grouping these partitions to obtain k representatives. The decomposition makes the approach extremely modular and allows us to optimize various criteria that control the choice of r…

    Submitted 29 July, 2011; originally announced August 2011.

    Comments: 12 Pages, 5 Figures, 2nd MultiClust Workshop at ECML PKDD 2011