Showing 1–50 of 74 results for author: Stewart, A

Searching in archive cs.
  1. arXiv:2406.05490  [pdf, other]

    cs.DC

    Beatnik: A Novel Global Communication Mini-Application

    Authors: Jason A. Stewart, Patrick G. Bridges

    Abstract: Beatnik is a novel open source mini-application that exercises the complex communication patterns often found in production codes but rarely found in benchmarks or mini-applications. It simulates 3D Rayleigh-Taylor instabilities based on Pandya and Shkoller's Z-Model formulation using the Cabana performance portability framework. This paper presents both the high-level design and important implemen…

    Submitted 8 June, 2024; originally announced June 2024.

  2. arXiv:2405.04285  [pdf, other]

    cs.AI eess.SP

    On the Foundations of Earth and Climate Foundation Models

    Authors: Xiao Xiang Zhu, Zhitong Xiong, Yi Wang, Adam J. Stewart, Konrad Heidler, Yuanyuan Wang, Zhenghang Yuan, Thomas Dujardin, Qingsong Xu, Yilei Shi

    Abstract: Foundation models have enormous potential in advancing Earth and climate sciences; however, current approaches may not be optimal as they focus on a few basic features of a desirable Earth and climate foundation model. Crafting the ideal Earth foundation model, we define eleven features which would allow such a foundation model to be beneficial for any geoscientific downstream application in an en…

    Submitted 7 May, 2024; originally announced May 2024.

  3. arXiv:2404.13911  [pdf, other]

    cs.CV

    GlobalBuildingMap -- Unveiling the Mystery of Global Buildings

    Authors: Xiao Xiang Zhu, Qingyu Li, Yilei Shi, Yuanyuan Wang, Adam Stewart, Jonathan Prexl

    Abstract: Understanding how buildings are distributed globally is crucial to revealing the human footprint on our home planet. This built environment affects local climate, land surface albedo, resource distribution, and many other key factors that influence well-being and human health. Despite this, quantitative and comprehensive data on the distribution and properties of buildings worldwide is lacking. To…

    Submitted 22 May, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

  4. arXiv:2404.01158  [pdf, other]

    cs.CL cs.RO

    Dialogue with Robots: Proposals for Broadening Participation and Research in the SLIVAR Community

    Authors: Casey Kennington, Malihe Alikhani, Heather Pon-Barry, Katherine Atwell, Yonatan Bisk, Daniel Fried, Felix Gervits, Zhao Han, Mert Inan, Michael Johnston, Raj Korpan, Diane Litman, Matthew Marge, Cynthia Matuszek, Ross Mead, Shiwali Mohan, Raymond Mooney, Natalie Parde, Jivko Sinapov, Angela Stewart, Matthew Stone, Stefanie Tellex, Tom Williams

    Abstract: The ability to interact with machines using natural human language is becoming not just commonplace, but expected. The next step is not just text interfaces, but speech interfaces and not just with computers, but with all machines including robots. In this paper, we chronicle the recent history of this growing field of spoken dialogue with robots and offer the community three proposals, the first…

    Submitted 1 April, 2024; originally announced April 2024.

    Comments: NSF Report on the "Dialogue with Robots" Workshop held in Pittsburgh, PA, April 2023

  5. arXiv:2403.15356  [pdf, other]

    cs.CV

    Neural Plasticity-Inspired Multimodal Foundation Model for Earth Observation

    Authors: Zhitong Xiong, Yi Wang, Fahong Zhang, Adam J. Stewart, Joëlle Hanna, Damian Borth, Ioannis Papoutsis, Bertrand Le Saux, Gustau Camps-Valls, Xiao Xiang Zhu

    Abstract: The development of foundation models has revolutionized our ability to interpret the Earth's surface using satellite observational data. Traditional models have been siloed, tailored to specific sensors or data types like optical, radar, and hyperspectral, each with its own unique characteristics. This specialization hinders the potential for a holistic analysis that could benefit from the combine…

    Submitted 7 June, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: 36 pages, 7 figures

  6. arXiv:2403.08538  [pdf, other]

    physics.ins-det cs.HC

    Calibrating coordinate system alignment in a scanning transmission electron microscope using a digital twin

    Authors: Dieter Weber, David Landers, Chen Huang, Emanuela Liberti, Emiliya Poghosyan, Matthew Bryan, Alexander Clausen, Daniel G. Stroppa, Angus I. Kirkland, Elisabeth Müller, Andrew Stewart, Rafal E. Dunin-Borkowski

    Abstract: In four-dimensional scanning transmission electron microscopy (4D STEM) a focused beam is scanned over a specimen and a diffraction pattern is recorded at each position using a pixelated detector. During the experiment, it must be ensured that the scan coordinate system of the beam is correctly calibrated relative to the detector coordinate system. Various simplified and approximate models are use…

    Submitted 13 March, 2024; originally announced March 2024.

  7. arXiv:2403.02198  [pdf, ps, other]

    cs.DM cs.CC cs.CE

    Payment Scheduling in the Interval Debt Model

    Authors: Tom Friedetzky, David C. Kutner, George B. Mertzios, Iain A. Stewart, Amitabh Trehan

    Abstract: The network-based study of financial systems has received considerable attention in recent years but has seldom explicitly incorporated the dynamic aspects of such systems. We consider this problem setting from the temporal point of view and introduce the Interval Debt Model (IDM) and some scheduling problems based on it, namely: Bankruptcy Minimization/Maximization, in which the aim is to produce…

    Submitted 4 March, 2024; originally announced March 2024.

    Comments: 30 pages, 17 figures

  8. arXiv:2403.00954  [pdf, other]

    cs.HC

    ClassInSight: Designing Conversation Support Tools to Visualize Classroom Discussion for Personalized Teacher Professional Development

    Authors: Tricia J. Ngoon, S Sushil, Angela Stewart, Ung-Sang Lee, Saranya Venkatraman, Neil Thawani, Prasenjit Mitra, Sherice Clarke, John Zimmerman, Amy Ogan

    Abstract: Teaching is one of many professions for which personalized feedback and reflection can help improve dialogue and discussion between the professional and those they serve. However, professional development (PD) is often impersonal as human observation is labor-intensive. Data-driven PD tools in teaching are of growing interest, but open questions about how professionals engage with their data in pr…

    Submitted 1 March, 2024; originally announced March 2024.

  9. arXiv:2401.13359  [pdf, other]

    cs.CC cs.DM cs.NI

    Reconfigurable routing in data center networks

    Authors: David C. Kutner, Iain A. Stewart

    Abstract: The Reconfigurable Routing Problem (RRP) in hybrid networks is, in short, the problem of finding settings for optical switches augmenting a static network so as to achieve optimal delivery of some given workload. The problem has previously been studied in various scenarios with both tractable and NP-hardness results obtained. However, the data center and interconnection networks to which the probl…

    Submitted 24 January, 2024; originally announced January 2024.

    Comments: 30 pages, 6 figures

  10. arXiv:2308.10372  [pdf]

    eess.IV cs.CV cs.LG q-bio.QM

    Developing a Machine Learning-Based Clinical Decision Support Tool for Uterine Tumor Imaging

    Authors: Darryl E. Wright, Adriana V. Gregory, Deema Anaam, Sepideh Yadollahi, Sumana Ramanathan, Kafayat A. Oyemade, Reem Alsibai, Heather Holmes, Harrison Gottlich, Cherie-Akilah G. Browne, Sarah L. Cohen Rassier, Isabel Green, Elizabeth A. Stewart, Hiroaki Takahashi, Bohyun Kim, Shannon Laughlin-Tommaso, Timothy L. Kline

    Abstract: Uterine leiomyosarcoma (LMS) is a rare but aggressive malignancy. On imaging, it is difficult to differentiate LMS from, for example, degenerated leiomyoma (LM), a prevalent but benign condition. We curated a data set of 115 axial T2-weighted MRI images from 110 patients (mean [range] age=45 [17-81] years) with UTs that included five different tumor types. These data were randomly split stratifyin…

    Submitted 20 August, 2023; originally announced August 2023.

  11. arXiv:2306.09424  [pdf, other]

    cs.LG cs.CV eess.IV

    SSL4EO-L: Datasets and Foundation Models for Landsat Imagery

    Authors: Adam J. Stewart, Nils Lehmann, Isaac A. Corley, Yi Wang, Yi-Chia Chang, Nassim Ait Ali Braham, Shradha Sehgal, Caleb Robinson, Arindam Banerjee

    Abstract: The Landsat program is the longest-running Earth observation program in history, with 50+ years of data acquisition by 8 satellites. The multispectral imagery captured by sensors onboard these satellites is critical for a wide range of scientific fields. Despite the increasing popularity of deep learning and remote sensing, the majority of researchers still use decision trees and random forests fo…

    Submitted 22 October, 2023; v1 submitted 15 June, 2023; originally announced June 2023.

  12. arXiv:2306.03675  [pdf, other]

    hep-ph cs.PL hep-ex physics.comp-ph

    Potential of the Julia programming language for high energy physics computing

    Authors: J. Eschle, T. Gal, M. Giordano, P. Gras, B. Hegner, L. Heinrich, U. Hernandez Acosta, S. Kluth, J. Ling, P. Mato, M. Mikhasenko, A. Moreno Briceño, J. Pivarski, K. Samaras-Tsakiris, O. Schulz, G. A. Stewart, J. Strube, V. Vassilev

    Abstract: Research in high energy physics (HEP) requires huge amounts of computing and storage, putting strong constraints on the code speed and resource usage. To meet these requirements, a compiled high-performance language is typically used; while for physicists, who focus on the application when developing the code, better research productivity pleads for a high-level programming language. A popular app…

    Submitted 6 October, 2023; v1 submitted 6 June, 2023; originally announced June 2023.

    Comments: 32 pages, 5 figures, 4 tables

    ACM Class: J.2

    Journal ref: Computing. Comput Softw Big Sci 7, 10 (2023)

  13. arXiv:2306.02183  [pdf]

    cs.DC q-bio.NC q-bio.QM

    brainlife.io: A decentralized and open source cloud platform to support neuroscience research

    Authors: Soichi Hayashi, Bradley A. Caron, Anibal Sólon Heinsfeld, Sophia Vinci-Booher, Brent McPherson, Daniel N. Bullock, Giulia Bertò, Guiomar Niso, Sandra Hanekamp, Daniel Levitas, Kimberly Ray, Anne MacKenzie, Lindsey Kitchell, Josiah K. Leong, Filipi Nascimento-Silva, Serge Koudoro, Hanna Willis, Jasleen K. Jolly, Derek Pisner, Taylor R. Zuidema, Jan W. Kurzawski, Kyriaki Mikellidou, Aurore Bussalb, Christopher Rorden, Conner Victory , et al. (39 additional authors not shown)

    Abstract: Neuroscience research has expanded dramatically over the past 30 years by advancing standardization and tool development to support rigor and transparency. Consequently, the complexity of the data pipeline has also increased, hindering access to FAIR (Findable, Accessible, Interoperable, and Reusable) data analysis to portions of the worldwide research community. brainlife.io was developed to red…

    Submitted 11 August, 2023; v1 submitted 3 June, 2023; originally announced June 2023.

  14. arXiv:2302.13947  [pdf]

    cs.HC

    Investigating Girls' Perspectives and Knowledge Gaps on Ethics and Fairness in Artificial Intelligence in a Lightweight Workshop

    Authors: Jaemarie Solyst, Alexis Axon, Angela E. B. Stewart, Motahhare Eslami, Amy Ogan

    Abstract: Artificial intelligence (AI) is everywhere, with many children having increased exposure to AI technologies in daily life. We aimed to understand middle school girls' (a group often excluded in tech) perceptions and knowledge gaps about AI. We created and explored the feasibility of a lightweight (less than 3 hours) educational workshop in which learners considered challenges in their lives…

    Submitted 27 February, 2023; originally announced February 2023.

    Comments: 8 pages, 2 figures (a table and a graphic with two parts)

    Journal ref: Proceedings of the 16th International Society of the Learning Sciences (ICLS) 2022, pages 807-814

  15. arXiv:2205.08193  [pdf, ps, other]

    physics.comp-ph cs.SE

    The HEP Software Foundation Community

    Authors: Graeme A Stewart, Peter Elmer, Elizabeth Sexton-Kennedy

    Abstract: The HEP Software Foundation was founded in 2014 to tackle common problems of software development and sustainability for high-energy physics. In this paper we outline the motivation for the founding of the organisation and give a brief history of its development. We describe how the organisation functions today and what challenges remain to be faced in the future.

    Submitted 17 May, 2022; originally announced May 2022.

    Report number: HSF-DOC-2022-01

  16. arXiv:2205.00306  [pdf, other]

    cs.CY

    Cyberinfrastructure Value: a survey on perceived importance and usage

    Authors: Praneeth Chityala, Claudia M. Costa, Julie A. Wernert, Craig A. Stewart

    Abstract: The research landscape in science and engineering is heavily reliant on computation and data storage. The intensity of computation required for many research projects illustrates the importance of the availability of high performance computing (HPC) resources and services. This paper summarizes the results of a recent study among principal investigators that attempts to measure the impact of the c…

    Submitted 30 April, 2022; originally announced May 2022.

    Comments: 6 pages, 3 figures, PEARC22 conference

  17. arXiv:2204.10611  [pdf, other]

    cs.CR

    Bridging Sapling: Private Cross-Chain Transfers

    Authors: Aleixo Sanchez, Alistair Stewart, Fatemeh Shirazi

    Abstract: Interoperability is one of the main challenges of blockchain technologies, which are generally designed as self-contained systems. Interoperability schemes for privacy-focused blockchains are particularly hard to design: they must integrate with the unique privacy features of the underlying blockchain so as to prove statements about specific transactions in protocols designed to obfuscate them. Th…

    Submitted 22 April, 2022; originally announced April 2022.

    Comments: 8 pages, to be published in: IEEE International Conference on Blockchain and Cryptocurrency, ICBC 2022

  18. arXiv:2111.08872  [pdf, other]

    cs.CV cs.LG

    TorchGeo: Deep Learning With Geospatial Data

    Authors: Adam J. Stewart, Caleb Robinson, Isaac A. Corley, Anthony Ortiz, Juan M. Lavista Ferres, Arindam Banerjee

    Abstract: Remotely sensed geospatial data are critical for applications including precision agriculture, urban planning, disaster monitoring and response, and climate change research, among others. Deep learning methods are particularly promising for modeling many remote sensing tasks given the success of deep neural networks in similar computer vision tasks and the sheer volume of remotely sensed imagery a…

    Submitted 17 September, 2022; v1 submitted 16 November, 2021; originally announced November 2021.

  19. arXiv:2106.09689  [pdf, ps, other]

    cs.DS cs.LG math.ST stat.ML

    Statistical Query Lower Bounds for List-Decodable Linear Regression

    Authors: Ilias Diakonikolas, Daniel M. Kane, Ankit Pensia, Thanasis Pittas, Alistair Stewart

    Abstract: We study the problem of list-decodable linear regression, where an adversary can corrupt a majority of the examples. Specifically, we are given a set $T$ of labeled examples $(x, y) \in \mathbb{R}^d \times \mathbb{R}$ and a parameter $0< α<1/2$ such that an $α$-fraction of the points in $T$ are i.i.d. samples from a linear regression model with Gaussian covariates, and the remaining $(1-α)$-fracti…

    Submitted 17 June, 2021; originally announced June 2021.

  20. Ten Quick Tips for Deep Learning in Biology

    Authors: Benjamin D. Lee, Anthony Gitter, Casey S. Greene, Sebastian Raschka, Finlay Maguire, Alexander J. Titus, Michael D. Kessler, Alexandra J. Lee, Marc G. Chevrette, Paul Allen Stewart, Thiago Britto-Borges, Evan M. Cofer, Kun-Hsing Yu, Juan Jose Carmona, Elana J. Fertig, Alexandr A. Kalinin, Beth Signal, Benjamin J. Lengerich, Timothy J. Triche Jr, Simina M. Boca

    Abstract: Machine learning is a modern approach to problem-solving and task automation. In particular, machine learning is concerned with the development and applications of algorithms that can recognize patterns in data and use them for predictive modeling. Artificial neural networks are a particular class of machine learning algorithms and models that evolved into what is now described as deep learning. G…

    Submitted 29 May, 2021; originally announced May 2021.

    Comments: 23 pages, 2 figures

  21. arXiv:2102.02171  [pdf, ps, other]

    cs.LG cs.DS math.PR math.ST stat.ML

    Outlier-Robust Learning of Ising Models Under Dobrushin's Condition

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart, Yuxin Sun

    Abstract: We study the problem of learning Ising models satisfying Dobrushin's condition in the outlier-robust setting where a constant fraction of the samples are adversarially corrupted. Our main result is to provide the first computationally efficient robust learning algorithm for this problem with near-optimal error guarantees. Our algorithm can be seen as a special case of an algorithm for robustly lea…

    Submitted 3 February, 2021; originally announced February 2021.

  22. Software Sustainability & High Energy Physics

    Authors: Daniel S. Katz, Sudhir Malik, Mark S. Neubauer, Graeme A. Stewart, Kétévi A. Assamagan, Erin A. Becker, Neil P. Chue Hong, Ian A. Cosden, Samuel Meehan, Edward J. W. Moyse, Adrian M. Price-Whelan, Elizabeth Sexton-Kennedy, Meirin Oan Evans, Matthew Feickert, Clemens Lange, Kilian Lieret, Rob Quick, Arturo Sánchez Pineda, Christopher Tunnell

    Abstract: New facilities of the 2020s, such as the High Luminosity Large Hadron Collider (HL-LHC), will be relevant through at least the 2030s. This means that their software efforts and those that are used to analyze their data need to consider sustainability to enable their adaptability to new challenges, longevity, and efficiency, over at least this period. This will help ensure that this software will b…

    Submitted 16 October, 2020; v1 submitted 10 October, 2020; originally announced October 2020.

    Comments: A report from the "Sustainable Software in HEP" IRIS-HEP blueprint workshop: https://indico.cern.ch/event/930127/

  23. arXiv:2007.01560  [pdf, ps, other]

    cs.DC

    GRANDPA: a Byzantine Finality Gadget

    Authors: Alistair Stewart, Eleftherios Kokoris-Kogia

    Abstract: Classic Byzantine fault-tolerant consensus protocols forfeit liveness in the face of asynchrony in order to preserve safety, whereas most deployed blockchain protocols forfeit safety in order to remain live. In this work, we achieve the best of both worlds by proposing a novel abstraction called the finality gadget. A finality gadget allows for transactions to always optimistically commit but inf…

    Submitted 3 July, 2020; originally announced July 2020.

  24. arXiv:2005.13456  [pdf, other]

    cs.CR

    Overview of Polkadot and its Design Considerations

    Authors: Jeff Burdges, Alfonso Cevallos, Peter Czaban, Rob Habermeier, Syed Hosseini, Fabio Lama, Handan Kilinc Alper, Ximin Luo, Fatemeh Shirazi, Alistair Stewart, Gavin Wood

    Abstract: In this paper we describe the design components of the heterogeneous multi-chain protocol Polkadot and explain how these components help Polkadot address some of the existing shortcomings of blockchain technologies. At present, a vast number of blockchain projects have been introduced and employed with various features that are not necessarily designed to work with each other. This makes it difficu…

    Submitted 29 May, 2020; v1 submitted 27 May, 2020; originally announced May 2020.

  25. A verifiably secure and proportional committee election rule

    Authors: Alfonso Cevallos, Alistair Stewart

    Abstract: The property of proportional representation in approval-based committee elections has appeared in the social choice literature for over a century, and is typically understood as avoiding the underrepresentation of minorities. However, we argue that the security of some distributed systems is directly linked to the opposite goal of avoiding the overrepresentation of any minority, a goal not previou…

    Submitted 13 September, 2021; v1 submitted 27 April, 2020; originally announced April 2020.

    Comments: 33 pages, 4 figures. Conference version to appear in Advances in Financial Technologies (AFT) 2021. This is an updated version of a paper originally titled "Validator selection in nominated proof of stake"

    MSC Class: 91B12; 68W25 ACM Class: F.2.2

  26. arXiv:1911.08085  [pdf, other]

    cs.DS cs.LG stat.ML

    Outlier-Robust High-Dimensional Sparse Estimation via Iterative Filtering

    Authors: Ilias Diakonikolas, Sushrut Karmalkar, Daniel Kane, Eric Price, Alistair Stewart

    Abstract: We study high-dimensional sparse estimation tasks in a robust setting where a constant fraction of the dataset is adversarially corrupted. Specifically, we focus on the fundamental problems of robust sparse mean estimation and robust sparse PCA. We give the first practically viable robust estimators for these problems. In more detail, our algorithms are sample and computationally efficient and ach…

    Submitted 18 November, 2019; originally announced November 2019.

  27. arXiv:1910.11950  [pdf, other]

    cs.LG stat.ML

    Probabilistic Surrogate Networks for Simulators with Unbounded Randomness

    Authors: Andreas Munk, Berend Zwartsenberg, Adam Ścibior, Atılım Güneş Baydin, Andrew Stewart, Goran Fernlund, Anoush Poursartip, Frank Wood

    Abstract: We present a framework for automatically structuring and training fast, approximate, deep neural surrogates of stochastic simulators. Unlike traditional approaches to surrogate modeling, our surrogates retain the interpretable structure and control flow of the reference simulator. Our surrogates target stochastic simulators where the number of random variables itself can be stochastic and potentia…

    Submitted 20 January, 2023; v1 submitted 25 October, 2019; originally announced October 2019.

  28. arXiv:1907.08306  [pdf, other]

    cs.DS stat.CO

    A Polynomial Time Algorithm for Log-Concave Maximum Likelihood via Locally Exponential Families

    Authors: Brian Axelrod, Ilias Diakonikolas, Anastasios Sidiropoulos, Alistair Stewart, Gregory Valiant

    Abstract: We consider the problem of computing the maximum likelihood multivariate log-concave distribution for a set of points. Specifically, we present an algorithm which, given $n$ points in $\mathbb{R}^d$ and an accuracy parameter $ε>0$, runs in time $poly(n,d,1/ε),$ and returns a log-concave distribution which, with high probability, has the property that the likelihood of the $n$ points under the retu…

    Submitted 18 July, 2019; originally announced July 2019.

    Comments: The present paper is a merger of two independent works arXiv:1811.03204 and arXiv:1812.05524, proposing essentially the same algorithm to compute the log-concave MLE

  29. arXiv:1812.05524  [pdf, ps, other]

    cs.DS

    A Polynomial Time Algorithm for Maximum Likelihood Estimation of Multivariate Log-concave Densities

    Authors: Ilias Diakonikolas, Anastasios Sidiropoulos, Alistair Stewart

    Abstract: We study the problem of computing the maximum likelihood estimator (MLE) of multivariate log-concave densities. Our main result is the first computationally efficient algorithm for this problem. In more detail, we give an algorithm that, on input a set of $n$ points in $\mathbb{R}^d$ and an accuracy parameter $ε>0$, it runs in time $\text{poly}(n, d, 1/ε)$, and outputs a log-concave density that w…

    Submitted 13 December, 2018; originally announced December 2018.

  30. arXiv:1806.03907  [pdf, ps, other]

    cs.GT

    Reachability for Branching Concurrent Stochastic Games

    Authors: Kousha Etessami, Emanuel Martinov, Alistair Stewart, Mihalis Yannakakis

    Abstract: We give polynomial time algorithms for deciding almost-sure and limit-sure reachability in Branching Concurrent Stochastic Games (BCSGs). These are a class of infinite-state imperfect-information stochastic games that generalize both finite-state concurrent stochastic reachability games, as well as branching simple stochastic reachability games.

    Submitted 24 April, 2019; v1 submitted 11 June, 2018; originally announced June 2018.

  31. arXiv:1806.00040  [pdf, ps, other]

    cs.LG cs.CC cs.DS math.ST stat.ML

    Efficient Algorithms and Lower Bounds for Robust Linear Regression

    Authors: Ilias Diakonikolas, Weihao Kong, Alistair Stewart

    Abstract: We study the problem of high-dimensional linear regression in a robust model where an $ε$-fraction of the samples can be adversarially corrupted. We focus on the fundamental setting where the covariates of the uncorrupted samples are drawn from a Gaussian distribution $\mathcal{N}(0, Σ)$ on $\mathbb{R}^d$. We give nearly tight upper bounds and computational lower bounds for this problem. Specifica…

    Submitted 31 May, 2018; originally announced June 2018.

  32. arXiv:1803.02815  [pdf, other]

    cs.LG cs.AI cs.DS stat.ML

    Sever: A Robust Meta-Algorithm for Stochastic Optimization

    Authors: Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Jacob Steinhardt, Alistair Stewart

    Abstract: In high dimensions, most machine learning methods are brittle to even a small fraction of structured outliers. To address this, we introduce a new meta-algorithm that can take in a base learner such as least squares or stochastic gradient descent, and harden the learner to be resistant to outliers. Our method, Sever, possesses strong theoretical guarantees yet is also highly scalable -- beyond run…

    Submitted 29 May, 2019; v1 submitted 7 March, 2018; originally announced March 2018.

    Comments: To appear in ICML 2019

  33. arXiv:1802.10575  [pdf, other]

    math.ST cs.IT cs.LG

    Near-Optimal Sample Complexity Bounds for Maximum Likelihood Estimation of Multivariate Log-concave Densities

    Authors: Timothy Carpenter, Ilias Diakonikolas, Anastasios Sidiropoulos, Alistair Stewart

    Abstract: We study the problem of learning multivariate log-concave densities with respect to a global loss function. We obtain the first upper bound on the sample complexity of the maximum likelihood estimator (MLE) for a log-concave density on $\mathbb{R}^d$, for all $d \geq 4$. Prior to this work, no finite sample upper bound was known for this estimator in more than $3$ dimensions. In more detail, we…

    Submitted 4 December, 2018; v1 submitted 28 February, 2018; originally announced February 2018.

    Journal ref: COLT 2018 proceedings version

  34. arXiv:1711.11560  [pdf, other]

    cs.DS cs.CC cs.DM math.PR math.ST

    Testing Conditional Independence of Discrete Distributions

    Authors: Clément L. Canonne, Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We study the problem of testing \emph{conditional independence} for discrete distributions. Specifically, given samples from a discrete random variable $(X, Y, Z)$ on domain $[\ell_1]\times[\ell_2] \times [n]$, we want to distinguish, with probability at least $2/3$, between the case that $X$ and $Y$ are conditionally independent given $Z$ from the case that $(X, Y, Z)$ is $ε$-far, in $\ell_1$-dis…

    Submitted 1 July, 2018; v1 submitted 30 November, 2017; originally announced November 2017.

  35. arXiv:1711.07211  [pdf, ps, other]

    cs.DS cs.CC cs.IT cs.LG math.ST

    List-Decodable Robust Mean Estimation and Learning Mixtures of Spherical Gaussians

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We study the problem of list-decodable Gaussian mean estimation and the related problem of learning mixtures of separated spherical Gaussians. We develop a set of techniques that yield new efficient algorithms with significantly improved guarantees for these problems. {\bf List-Decodable Mean Estimation.} Fix any $d \in \mathbb{Z}_+$ and $0< α<1/2$. We design an algorithm with runtime…

    Submitted 20 November, 2017; originally announced November 2017.

  36. arXiv:1709.02087  [pdf, ps, other]

    cs.DS cs.IT cs.LG math.ST

    Sharp Bounds for Generalized Uniformity Testing

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We study the problem of generalized uniformity testing \cite{BC17} of a discrete probability distribution: Given samples from a probability distribution $p$ over an {\em unknown} discrete domain $\mathbf{Ω}$, we want to distinguish, with probability at least $2/3$, between the case that $p$ is uniform on some {\em subset} of $\mathbf{Ω}$ versus $ε$-far, in total variation distance, from any such unifo…

    Submitted 7 September, 2017; originally announced September 2017.

  37. arXiv:1707.01242  [pdf, ps, other]

    cs.LG cs.CC cs.DS

    Learning Geometric Concepts with Nasty Noise

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We study the efficient learnability of geometric concept classes - specifically, low-degree polynomial threshold functions (PTFs) and intersections of halfspaces - when a fraction of the data is adversarially corrupted. We give the first polynomial-time PAC learning algorithms for these concept classes with dimension-independent error guarantees in the presence of nasty noise under the Gaussian di…

    Submitted 5 July, 2017; originally announced July 2017.

  38. arXiv:1706.05738  [pdf, ps, other]

    cs.DS cs.CC cs.DM math.PR math.ST

    Fourier-Based Testing for Families of Distributions

    Authors: Clément L. Canonne, Ilias Diakonikolas, Alistair Stewart

    Abstract: We study the general problem of testing whether an unknown distribution belongs to a specified family of distributions. More specifically, given a distribution family $\mathcal{P}$ and sample access to an unknown discrete distribution $\mathbf{P}$, we want to distinguish (with high probability) between the case that $\mathbf{P} \in \mathcal{P}$ and the case that $\mathbf{P}$ is $ε$-far, in total v…

    Submitted 7 August, 2017; v1 submitted 18 June, 2017; originally announced June 2017.

  39. arXiv:1704.03866  [pdf, ps, other]

    cs.DS cs.IT cs.LG math.ST stat.ML

    Robustly Learning a Gaussian: Getting Optimal Error, Efficiently

    Authors: Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

    Abstract: We study the fundamental problem of learning the parameters of a high-dimensional Gaussian in the presence of noise -- where an $\varepsilon$-fraction of our samples were chosen by an adversary. We give robust estimators that achieve estimation error $O(\varepsilon)$ in the total variation distance, which is optimal up to a universal constant that is independent of the dimension. In the case whe…

    Submitted 5 November, 2017; v1 submitted 12 April, 2017; originally announced April 2017.

    Comments: To appear in SODA 2018

  40. arXiv:1703.00893  [pdf, other]

    cs.LG cs.DS cs.IT stat.ML

    Being Robust (in High Dimensions) Can Be Practical

    Authors: Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, Alistair Stewart

    Abstract: Robust estimation is much more challenging in high dimensions than it is in one dimension: Most techniques either lead to intractable optimization problems or estimators that can tolerate only a tiny fraction of errors. Recent work in theoretical computer science has shown that, in appropriate distributional models, it is possible to robustly estimate the mean and covariance with polynomial time a…

    Submitted 13 March, 2018; v1 submitted 2 March, 2017; originally announced March 2017.

    Comments: Appeared in ICML 2017

  41. arXiv:1701.02188  [pdf, ps, other]

    cs.CC math.CO

    Surjective H-Colouring: New Hardness Results

    Authors: Petr Golovach, Matthew Johnson, Barnaby Martin, Daniel Paulusma, Anthony Stewart

    Abstract: A homomorphism from a graph G to a graph H is a vertex mapping f from the vertex set of G to the vertex set of H such that there is an edge between vertices f(u) and f(v) of H whenever there is an edge between vertices u and v of G. The H-Colouring problem is to decide whether or not a graph G allows a homomorphism to a fixed graph H. We continue a study on a variant of this problem, namely the Su…

    Submitted 26 March, 2017; v1 submitted 9 January, 2017; originally announced January 2017.

  42. arXiv:1612.03156  [pdf, ps, other]

    cs.DS cs.IT cs.LG math.ST

    Testing Bayesian Networks

    Authors: Clement Canonne, Ilias Diakonikolas, Daniel Kane, Alistair Stewart

    Abstract: This work initiates a systematic investigation of testing high-dimensional structured distributions by focusing on testing Bayesian networks -- the prototypical family of directed graphical models. A Bayesian network is defined by a directed acyclic graph, where we associate a random variable with each node. The value at any particular node is conditionally independent of all the other non-descend…

    Submitted 24 January, 2020; v1 submitted 9 December, 2016; originally announced December 2016.

    Comments: To appear in IEEE Transactions on Information Theory

  43. arXiv:1611.03473  [pdf, ps, other]

    cs.LG cs.CC cs.DS cs.IT math.ST

    Statistical Query Lower Bounds for Robust Estimation of High-dimensional Gaussians and Gaussian Mixtures

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We describe a general technique that yields the first Statistical Query lower bounds for a range of fundamental high-dimensional learning problems involving Gaussian distributions. Our main results are for the problems of (1) learning Gaussian mixture models (GMMs), and (2) robust (agnostic) learning of a single unknown Gaussian distribution. For each of these problems, we show a super-…

    Submitted 17 May, 2017; v1 submitted 10 November, 2016; originally announced November 2016.

    Comments: Changes from v1: Revised presentation. Added more applications of the technique (SQ lower bounds for robust sparse mean estimation and robust covariance estimation in spectral norm). Sharpened testing lower bound to linear in the dimension (compared to nearly-linear in first version)

  44. arXiv:1611.03426  [pdf, other]

    cs.CY cs.IR cs.SI stat.ML

    Why is it Difficult to Detect Sudden and Unexpected Epidemic Outbreaks in Twitter?

    Authors: Avaré Stewart, Sara Romano, Nattiya Kanhabua, Sergio Di Martino, Wolf Siberski, Antonino Mazzeo, Wolfgang Nejdl, Ernesto Diaz-Aviles

    Abstract: Social media services such as Twitter are a valuable source of information for decision support systems. Many studies have shown that this also holds for the medical domain, where Twitter is considered a viable tool for public health officials to sift through relevant information for the early detection, management, and control of epidemic outbreaks. This is possible due to the inherent capability…

    Submitted 10 November, 2016; originally announced November 2016.

    Comments: ACM CCS Concepts: Applied computing - Health informatics; Information systems - Web mining; Document filtering; Novelty in information retrieval; Recommender systems; Human-centered computing - Social media

  45. arXiv:1608.07336  [pdf, ps, other]

    cs.GT cs.DS math.PR

    Playing Anonymous Games using Simple Strategies

    Authors: Yu Cheng, Ilias Diakonikolas, Alistair Stewart

    Abstract: We investigate the complexity of computing approximate Nash equilibria in anonymous games. Our main algorithmic result is the following: For any $n$-player anonymous game with a bounded number of strategies and any constant $δ>0$, an $O(1/n^{1-δ})$-approximate Nash equilibrium can be computed in polynomial time. Complementing this positive result, we show that if there exists any constant $δ>0$ su…

    Submitted 25 August, 2016; originally announced August 2016.

  46. arXiv:1608.06142  [pdf, other]

    cs.DS cs.DM math.CO

    Squares of Low Maximum Degree

    Authors: Manfred Cochefert, Jean-François Couturier, Petr A. Golovach, Dieter Kratsch, Daniël Paulusma, Anthony Stewart

    Abstract: A graph H is a square root of a graph G if G can be obtained from H by adding an edge between any two vertices in H that are of distance 2. The Square Root problem is that of deciding whether a given graph admits a square root. This problem is only known to be NP-complete for chordal graphs and polynomial-time solvable for non-trivial minor-closed graph classes and a very limited number of other g…

    Submitted 27 August, 2016; v1 submitted 22 August, 2016; originally announced August 2016.

  47. arXiv:1608.06136  [pdf, other]

    cs.DS cs.DM

    A Linear Kernel for Finding Square Roots of Almost Planar Graphs

    Authors: Petr A. Golovach, Dieter Kratsch, Daniël Paulusma, Anthony Stewart

    Abstract: A graph H is a square root of a graph G if G can be obtained from H by the addition of edges between any two vertices in H that are of distance 2 from each other. The Square Root problem is that of deciding whether a given graph admits a square root. We consider this problem for planar graphs in the context of the "distance from triviality" framework. For an integer k, a planar+kv graph (or k-apex…

    Submitted 22 August, 2016; originally announced August 2016.

  48. arXiv:1606.07384  [pdf, ps, other]

    cs.DS cs.AI cs.LG math.ST

    Robust Learning of Fixed-Structure Bayesian Networks

    Authors: Yu Cheng, Ilias Diakonikolas, Daniel Kane, Alistair Stewart

    Abstract: We investigate the problem of learning Bayesian networks in a robust model where an $ε$-fraction of the samples are adversarially corrupted. In this work, we study the fully observable discrete case where the structure of the network is given. Even in this basic setting, previous learning algorithms either run in exponential time or lose dimension-dependent factors in their error guarantees. We pr…

    Submitted 29 October, 2018; v1 submitted 23 June, 2016; originally announced June 2016.

  49. arXiv:1606.03077  [pdf, ps, other]

    cs.DS cs.LG math.ST

    Efficient Robust Proper Learning of Log-concave Distributions

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We study the robust proper learning of univariate log-concave distributions (over continuous and discrete domains). Given a set of samples drawn from an unknown target distribution, we want to compute a log-concave hypothesis distribution that is as close as possible to the target, in total variation distance. In this work, we give the first computationally efficient algorithm for this learn…

    Submitted 9 June, 2016; originally announced June 2016.

  50. arXiv:1605.08188  [pdf, ps, other]

    cs.LG cs.IT math.ST

    Learning Multivariate Log-concave Distributions

    Authors: Ilias Diakonikolas, Daniel M. Kane, Alistair Stewart

    Abstract: We study the problem of estimating multivariate log-concave probability density functions. We prove the first sample complexity upper bound for learning log-concave densities on $\mathbb{R}^d$, for all $d \geq 1$. Prior to our work, no upper bound on the sample complexity of this learning problem was known for the case of $d>3$. In more detail, we give an estimator that, for any $d \ge 1$ and…

    Submitted 5 June, 2017; v1 submitted 26 May, 2016; originally announced May 2016.

    Comments: To appear in COLT 2017