
Showing 1–50 of 104 results for author: Roy, M

Searching in archive cs.
  1. arXiv:2407.00950

    cs.LG stat.ML

    Causal Bandits: The Pareto Optimal Frontier of Adaptivity, a Reduction to Linear Bandits, and Limitations around Unknown Marginals

    Authors: Ziyi Liu, Idan Attias, Daniel M. Roy

    Abstract: In this work, we investigate the problem of adapting to the presence or absence of causal structure in multi-armed bandit problems. In addition to the usual reward signal, we assume the learner has access to additional variables, observed in each round after acting. When these variables $d$-separate the action from the reward, existing work in causal bandits demonstrates that one can achieve stric…

    Submitted 1 July, 2024; originally announced July 2024.

    Comments: Accepted to ICML 2024

  2. arXiv:2404.06498

    cs.LG stat.ML

    Simultaneous linear connectivity of neural networks modulo permutation

    Authors: Ekansh Sharma, Devin Kwok, Tom Denton, Daniel M. Roy, David Rolnick, Gintare Karolina Dziugaite

    Abstract: Neural networks typically exhibit permutation symmetries which contribute to the non-convexity of the networks' loss landscapes, since linearly interpolating between two permuted versions of a trained network tends to encounter a high loss barrier. Recent work has argued that permutation symmetries are the only sources of non-convexity, meaning there are essentially no such barriers between traine…

    Submitted 9 April, 2024; originally announced April 2024.

    Comments: 11 pages, 6 figures

  3. arXiv:2403.17218

    cs.SE cs.CR cs.LG

    A Comprehensive Study of the Capabilities of Large Language Models for Vulnerability Detection

    Authors: Benjamin Steenhoek, Md Mahbubur Rahman, Monoshi Kumar Roy, Mirza Sanjida Alam, Earl T. Barr, Wei Le

    Abstract: Large Language Models (LLMs) have demonstrated great potential for code generation and other software engineering tasks. Vulnerability detection is of crucial importance to maintaining the security, integrity, and trustworthiness of software systems. Precise vulnerability detection requires reasoning about the code, making it a good case study for exploring the limits of LLMs' reasoning capabiliti…

    Submitted 25 March, 2024; originally announced March 2024.

  4. arXiv:2402.09327

    cs.LG

    Information Complexity of Stochastic Convex Optimization: Applications to Generalization and Memorization

    Authors: Idan Attias, Gintare Karolina Dziugaite, Mahdi Haghifam, Roi Livni, Daniel M. Roy

    Abstract: In this work, we investigate the interplay between memorization and learning in the context of stochastic convex optimization (SCO). We define memorization via the information a learning algorithm reveals about its training data points. We then quantify this information using the framework of conditional mutual information (CMI) proposed by Steinke and Zakynthinou (2020). Our main result is…

    Submitted 18 July, 2024; v1 submitted 14 February, 2024; originally announced February 2024.

    Comments: 41 Pages, To appear in ICML 2024

  5. arXiv:2312.17127

    cs.PL cs.LO math.PR

    Probabilistic programming interfaces for random graphs: Markov categories, graphons, and nominal sets

    Authors: Nathanael L. Ackerman, Cameron E. Freer, Younesse Kaddar, Jacek Karwowski, Sean K. Moss, Daniel M. Roy, Sam Staton, Hongseok Yang

    Abstract: We study semantic models of probabilistic programming languages over graphs, and establish a connection to graphons from graph theory and combinatorics. We show that every well-behaved equational theory for our graph probabilistic programming language corresponds to a graphon, and conversely, every graphon arises in this way. We provide three constructions for showing that every graphon arises f…

    Submitted 28 December, 2023; originally announced December 2023.

    Comments: Accepted for POPL 2024

    Journal ref: Proc. ACM Program. Lang. 8, POPL, Article 61 (2024), pp 1819-1849

  6. arXiv:2308.02796

    cs.LG

    OBESEYE: Interpretable Diet Recommender for Obesity Management using Machine Learning and Explainable AI

    Authors: Mrinmoy Roy, Srabonti Das, Anica Tasnim Protity

    Abstract: Obesity, the leading cause of many non-communicable diseases, occurs mainly from eating more than our body requires and from a lack of proper activity. So, being healthy requires healthy diet plans, especially for patients with comorbidities. But it is difficult to figure out the exact quantity of each nutrient because nutrient requirements vary based on physical and disease conditions. In our study…

    Submitted 5 August, 2023; originally announced August 2023.

    Report number: Roy, M. (2023). OBESEYE: Interpretable Diet Recommender for Obesity Management using Machine Learning and Explainable AI. IJRAMT, 4(6), 1-7. https://journal.ijramt.com/ijramt/article/view/2733

  7. arXiv:2308.02662

    cs.CC

    Linear isomorphism testing of Boolean functions with small approximate spectral norm

    Authors: Arijit Ghosh, Chandrima Kayal, Manaswi Paraashar, Manmatha Roy

    Abstract: Two Boolean functions $f, g : \mathbb{F}_2^{n} \to \{-1, 1\}$ are called linearly isomorphic if there exists an invertible matrix $M \in \mathbb{F}_2^{n\times n}$ such that $f\circ M = g$. Testing linear isomorphism is a generalization of isomorphism testing between Boolean functions, which is now classical in the context of property testing. Linear-invariance of Boolean functions has also been extensively studied in other areas l…

    Submitted 4 August, 2023; originally announced August 2023.

  8. arXiv:2307.14067

    cs.LG

    Machine Learning Applications In Healthcare: The State Of Knowledge and Future Directions

    Authors: Mrinmoy Roy, Sarwar J. Minar, Porarthi Dhar, A T M Omor Faruq

    Abstract: Detection of easily missed hidden patterns with fast processing power makes machine learning (ML) indispensable to today's healthcare system. Though many ML applications have already been discovered and many are still under investigation, only a few have been adopted by current healthcare systems. As a result, there exists an enormous opportunity in healthcare systems for ML but distributed informa…

    Submitted 26 July, 2023; originally announced July 2023.

    Journal ref: BJMHR, 10(6), 24-54 (2023)

  9. arXiv:2307.10060

    physics.flu-dyn cs.AI cs.LG cs.NE physics.comp-ph

    Accurate deep learning sub-grid scale models for large eddy simulations

    Authors: Rikhi Bose, Arunabha M. Roy

    Abstract: We present two families of sub-grid scale (SGS) turbulence models developed for large-eddy simulation (LES) purposes. Their development required the formulation of physics-informed, robust, and efficient Deep Learning (DL) algorithms which, unlike state-of-the-art analytical modeling techniques, can produce high-order complex non-linear relations between inputs and outputs. Explicit filtering of data…

    Submitted 19 July, 2023; originally announced July 2023.

  10. arXiv:2306.17759

    stat.ML cs.LG

    The Shaped Transformer: Attention Models in the Infinite Depth-and-Width Limit

    Authors: Lorenzo Noci, Chuning Li, Mufan Bill Li, Bobby He, Thomas Hofmann, Chris Maddison, Daniel M. Roy

    Abstract: In deep learning theory, the covariance matrix of the representations serves as a proxy to examine the network's trainability. Motivated by the success of Transformers, we study the covariance matrix of a modified Softmax-based attention model with skip connections in the proportional limit of infinite-depth-and-width. We show that at initialization the limiting distribution can be described by a…

    Submitted 9 December, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

  11. arXiv:2303.14697

    math.GR cs.CC

    The central tree property and algorithmic problems on subgroups of free groups

    Authors: Mallika Roy, Enric Ventura, Pascal Weil

    Abstract: We study the average case complexity of the uniform membership problem for subgroups of free groups, and we show that it is orders of magnitude smaller than the worst case complexity of the best known algorithms. This applies to subgroups given by a fixed number of generators as well as to subgroups given by an exponential number of generators. The main idea behind this result is to exploit a gene…

    Submitted 19 October, 2023; v1 submitted 26 March, 2023; originally announced March 2023.

    Comments: 28 pages. Inaccuracies corrected. To appear in Journal of Group Theory

    MSC Class: 20E05; 20F10; 68Q17

  12. arXiv:2303.04808

    q-bio.QM cs.LG stat.AP

    Prevalence and Major Risk Factors of Non-communicable Diseases: A Machine Learning based Cross-Sectional Study

    Authors: Mrinmoy Roy, Anica Tasnim Protity, Srabonti Das, Porarthi Dhar

    Abstract: Objective: The study aimed to determine the prevalence of several non-communicable diseases (NCD) and analyze risk factors among adult patients seeking nutritional guidance in Dhaka, Bangladesh. Result: Our study observed the relationships between gender, age groups, obesity, and NCDs (DM, CKD, IBS, CVD, CRD, thyroid). The most frequently reported NCD was cardiovascular issues (CVD), which was pre…

    Submitted 18 May, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

    Comments: 25 pages, 10 figures, 3 tables

  13. arXiv:2303.04275

    cs.CV cs.AI cs.CY cs.LG cs.NE

    A Computer Vision Enabled damage detection model with improved YOLOv5 based on Transformer Prediction Head

    Authors: Arunabha M. Roy, Jayabrata Bhaduri

    Abstract: Objective: Computer vision-based up-to-date accurate damage classification and localization are of decisive importance for infrastructure monitoring, safety, and the serviceability of civil infrastructure. Current state-of-the-art deep learning (DL)-based damage detection models, however, often lack superior feature extraction capability in complex and noisy environments, limiting the development o…

    Submitted 7 March, 2023; originally announced March 2023.

  14. arXiv:2302.09668

    cs.LG cs.NE math.AP

    Physics-aware deep learning framework for linear elasticity

    Authors: Arunabha M. Roy, Rikhi Bose

    Abstract: The paper presents an efficient and robust data-driven deep learning (DL) computational framework developed for linear continuum elasticity problems. The methodology is based on the fundamentals of the Physics Informed Neural Networks (PINNs). For an accurate representation of the field variables, a multi-objective loss function is proposed. It consists of terms corresponding to the residual of th…

    Submitted 19 February, 2023; originally announced February 2023.

  15. arXiv:2301.00948

    eess.SP cs.HC q-bio.NC

    Understanding EEG signals for subject-wise Definition of Armoni Activities

    Authors: Kislay Raj, Aditya Singh, Abhishek Mandal, Teerath Kumar, Arunabha M. Roy

    Abstract: In a growing world of technology, psychological disorders have become a challenge to be solved. The methods used for cognitive stimulation are very conventional and based on one-way communication, which relies only on the material or method used for training an individual. It doesn't use any kind of feedback from the individual to analyze the progress of the training process. We have proposed a clos…

    Submitted 26 April, 2023; v1 submitted 3 January, 2023; originally announced January 2023.

    Comments: Submitted to SN Computer Science journal

  16. Hair and Scalp Disease Detection using Machine Learning and Image Processing

    Authors: Mrinmoy Roy, Anica Tasnim Protity

    Abstract: Almost 80 million Americans suffer from hair loss due to aging, stress, medication, or genetic makeup. Hair and scalp-related diseases often go unnoticed in the beginning. Sometimes, a patient cannot differentiate between hair loss and regular hair fall. Diagnosing hair-related diseases is time-consuming as it requires professional dermatologists to perform visual and medical tests. Because of tha…

    Submitted 30 May, 2023; v1 submitted 30 December, 2022; originally announced January 2023.

    Journal ref: EJ-Compute.2023;3(1):7-13

  17. arXiv:2212.13556

    cs.LG stat.ML

    Limitations of Information-Theoretic Generalization Bounds for Gradient Descent Methods in Stochastic Convex Optimization

    Authors: Mahdi Haghifam, Borja Rodríguez-Gálvez, Ragnar Thobaben, Mikael Skoglund, Daniel M. Roy, Gintare Karolina Dziugaite

    Abstract: To date, no "information-theoretic" frameworks for reasoning about generalization error have been shown to establish minimax rates for gradient descent in the setting of stochastic convex optimization. In this work, we consider the prospect of establishing such rates via several existing information-theoretic frameworks: input-output mutual information bounds, conditional mutual information bounds…

    Submitted 13 July, 2023; v1 submitted 27 December, 2022; originally announced December 2022.

    Comments: 49 pages, 2 figures. This version corrects a mistake in the proof of Theorem 17. Proc. International Conference on Algorithmic Learning Theory (ALT), 2023

  18. arXiv:2210.13738

    cs.LG stat.ML

    Pruning's Effect on Generalization Through the Lens of Training and Regularization

    Authors: Tian Jin, Michael Carbin, Daniel M. Roy, Jonathan Frankle, Gintare Karolina Dziugaite

    Abstract: Practitioners frequently observe that pruning improves model generalization. A long-standing hypothesis based on bias-variance trade-off attributes this generalization improvement to model size reduction. However, recent studies on over-parameterization characterize a new model size regime, in which larger models achieve better generalization. Pruning models in this over-parameterized regime leads…

    Submitted 24 October, 2022; originally announced October 2022.

    Comments: 49 pages, 20 figures

    Journal ref: Advances in Neural Information Processing Systems 2022

  19. arXiv:2210.04252

    cs.CV cs.AI cs.LG

    Precise Single-stage Detector

    Authors: Aisha Chandio, Gong Gui, Teerath Kumar, Irfan Ullah, Ramin Ranjbarzadeh, Arunabha M Roy, Akhtar Hussain, Yao Shen

    Abstract: There are still two problems in SDD causing some inaccurate results: (1) In the process of feature extraction, with the layer-by-layer acquisition of semantic information, local information is gradually lost, resulting in less representative feature maps; (2) During the Non-Maximum Suppression (NMS) algorithm, due to inconsistency in classification and regression tasks, the classification confide…

    Submitted 9 October, 2022; originally announced October 2022.

    Comments: We will submit it soon to the IEEE transaction. Due to the character limit, we cannot upload the full abstract. Please read the PDF file for more detail

  20. Artificial Intelligence in Material Engineering: A review on applications of AI in Material Engineering

    Authors: Lipichanda Goswami, Manoj Deka, Mohendra Roy

    Abstract: The role of artificial intelligence (AI) in material science and engineering (MSE) is becoming increasingly important as AI technology advances. The development of high-performance computing has made it possible to test deep learning (DL) models with significant parameters, providing an opportunity to overcome the limitation of traditional computational methods, such as density functional theory (…

    Submitted 27 April, 2023; v1 submitted 15 September, 2022; originally announced September 2022.

    Comments: V3

  21. arXiv:2209.06977

    cs.DB cs.AI cs.PL

    SQL and NoSQL Databases Software architectures performance analysis and assessments -- A Systematic Literature review

    Authors: Wisal Khan, Teerath Kumar, Zhang Cheng, Kislay Raj, Arunabha M Roy, Bin Luo

    Abstract: Context: The efficient processing of Big Data is a challenging task for SQL and NoSQL Databases, where competent software architecture plays a vital role. The SQL Databases are designed for structuring data and supporting vertical scalability. In contrast, horizontal scalability is backed by NoSQL Databases and can process sizeable unstructured Data efficiently. One can choose the right paradigm a…

    Submitted 14 September, 2022; originally announced September 2022.

    Comments: 57 pages, systematic literature review, already submitted to Big Data Research. We cannot add the method, result, and conclusion sections to the abstract here due to the character limit. Please check the PDF file

  22. arXiv:2208.00788

    cs.CV cs.AI

    A Hybrid CNN-LSTM model for Video Deepfake Detection by Leveraging Optical Flow Features

    Authors: Pallabi Saikia, Dhwani Dholaria, Priyanka Yadav, Vaidehi Patel, Mohendra Roy

    Abstract: Deepfakes are synthesized digital media created to produce ultra-realistic fake videos that trick the spectator. Deep generative algorithms, such as Generative Adversarial Networks (GANs), are widely used to accomplish such tasks. This approach synthesizes pseudo-realistic content that is very difficult to distinguish by traditional detection methods. In most cases, Convolutional Neural Network(…

    Submitted 28 July, 2022; originally announced August 2022.

    Journal ref: Copyright is with IEEE, Paper No: 832, IJCNN, 2022 IEEE World Congress on Computational Intelligence

  23. arXiv:2207.13500

    cs.SI cs.CL cs.IR

    Modelling Social Context for Fake News Detection: A Graph Neural Network Based Approach

    Authors: Pallabi Saikia, Kshitij Gundale, Ankit Jain, Dev Jadeja, Harvi Patel, Mohendra Roy

    Abstract: Detection of fake news is crucial to ensure the authenticity of information and maintain the news ecosystem's reliability. Recently, there has been an increase in fake news content due to the proliferation of social media and fake content generation techniques such as Deep Fake. The majority of the existing modalities of fake news detection focus on content-based approaches. However, most of…

    Submitted 27 July, 2022; originally announced July 2022.

    Journal ref: copyright with IEEE, Paper No: 834, IJCNN, 2022 IEEE World Congress on Computational Intelligence

  24. arXiv:2207.12395

    stat.CO cs.LG stat.ME stat.ML

    Tuning Stochastic Gradient Algorithms for Statistical Inference via Large-Sample Asymptotics

    Authors: Jeffrey Negrea, Jun Yang, Haoyue Feng, Daniel M. Roy, Jonathan H. Huggins

    Abstract: The tuning of stochastic gradient algorithms (SGAs) for optimization and sampling is often based on heuristics and trial-and-error rather than generalizable theory. We address this theory--practice gap by characterizing the large-sample statistical asymptotics of SGAs via a joint step-size--sample-size scaling limit. We show that iterate averaging with a large fixed step size is robust to the choi…

    Submitted 20 July, 2023; v1 submitted 25 July, 2022; originally announced July 2022.

    Comments: 42 pages

  25. arXiv:2206.14800

    cs.LG

    Understanding Generalization via Leave-One-Out Conditional Mutual Information

    Authors: Mahdi Haghifam, Shay Moran, Daniel M. Roy, Gintare Karolina Dziugaite

    Abstract: We study the mutual information between (certain summaries of) the output of a learning algorithm and its $n$ training data, conditional on a supersample of $n+1$ i.i.d. data from which the training data is chosen at random without replacement. These leave-one-out variants of the conditional mutual information (CMI) of an algorithm (Steinke and Zakynthinou, 2020) are also seen to control the mean…

    Submitted 29 June, 2022; originally announced June 2022.

    Comments: 18 pages

  26. arXiv:2206.02768

    stat.ML cs.LG

    The Neural Covariance SDE: Shaped Infinite Depth-and-Width Networks at Initialization

    Authors: Mufan Bill Li, Mihai Nica, Daniel M. Roy

    Abstract: The logit outputs of a feedforward neural network at initialization are conditionally Gaussian, given a random covariance matrix defined by the penultimate layer. In this work, we study the distribution of this random matrix. Recent work has shown that shaping the activation function as network depth grows large is necessary for this covariance matrix to be non-degenerate. However, the current inf…

    Submitted 14 June, 2023; v1 submitted 6 June, 2022; originally announced June 2022.

    Comments: 48 pages, 10 figures. Advances in Neural Information Processing Systems (2022)

  27. Machine learning based lens-free imaging technique for field-portable cytometry

    Authors: Rajkumar Vaghashiya, Sanghoon Shin, Varun Chauhan, Kaushal Kapadiya, Smit Sanghavi, Sungkyu Seo, Mohendra Roy

    Abstract: Lens-free Shadow Imaging Technique (LSIT) is a well-established technique for the characterization of microparticles and biological cells. Due to its simplicity and cost-effectiveness, various low-cost solutions have evolved, such as automatic analysis of complete blood count (CBC), cell viability, 2D cell morphology, 3D cell tomography, etc. The developed auto characterization algorithm so f…

    Submitted 2 March, 2022; v1 submitted 2 March, 2022; originally announced March 2022.

    Comments: Published in Biosensors Journal

    Journal ref: https://www.mdpi.com/2079-6374/12/3/144

  28. arXiv:2202.05100

    stat.ML cs.LG

    Adaptively Exploiting d-Separators with Causal Bandits

    Authors: Blair Bilodeau, Linbo Wang, Daniel M. Roy

    Abstract: Multi-armed bandit problems provide a framework to identify the optimal intervention over a sequence of repeated experiments. Without additional assumptions, minimax optimal performance (measured by cumulative regret) is well-understood. With access to additional observed variables that d-separate the intervention from the outcome (i.e., they are a d-separator), recent "causal bandit" algorithms p…

    Submitted 26 October, 2022; v1 submitted 10 February, 2022; originally announced February 2022.

    Comments: 29 pages, 3 figures. Camera ready version

    Journal ref: NeurIPS 2022

  29. arXiv:2202.01916

    cond-mat.mtrl-sci cs.LG

    Artificial Intelligence Powered Material Search Engine

    Authors: Mohendra Roy

    Abstract: Many data-driven applications in material science have been made possible because of recent breakthroughs in artificial intelligence (AI). The use of AI in material engineering is becoming more viable as the amount of material data such as X-Ray diffraction, various spectroscopy, and microscope data grows. In this work, we have reported a material search engine that uses the interatomic space (d va…

    Submitted 19 January, 2022; originally announced February 2022.

    Comments: 4 pages

    Report number: Article reference No: MATPR29663

    Journal ref: Materials Today Journal, Elsevier; Article reference No: MATPR29663, 2022

  30. CovidAlert -- A Wristwatch-based System to Alert Users from Face Touching

    Authors: Mrinmoy Roy, Venkata Devesh Reddy Seethi, Rami Lake, Pratool Bharti

    Abstract: Worldwide 2019 million people have been infected and 4.5 million have lost their lives in the ongoing Covid-19 pandemic. Until vaccines became widely available, precautions and safety measures like wearing masks, physical distancing, and avoiding face touching were some of the primary means to curb the spread of the virus. Face touching is a compulsive human behavior that cannot be prevented without maki…

    Submitted 11 April, 2022; v1 submitted 30 November, 2021; originally announced December 2021.

    Comments: 17 pages, 9 figures, PervasiveHealth2021 conference

  31. arXiv:2111.09109

    cs.LG

    Physics-guided Loss Functions Improve Deep Learning Performance in Inverse Scattering

    Authors: Zicheng Liu, Mayank Roy, Dilip K. Prasad, Krishna Agarwal

    Abstract: Solving electromagnetic inverse scattering problems (ISPs) is challenging due to the intrinsic nonlinearity, ill-posedness, and expensive computational cost. Recently, deep neural network (DNN) techniques have been successfully applied to ISPs and have shown potential for superior imaging over conventional methods. In this paper, we analyse the analogy between DNN solvers and traditional iterative algor…

    Submitted 13 November, 2021; originally announced November 2021.

  32. arXiv:2111.05275

    cs.IT cs.LG stat.ML

    Towards a Unified Information-Theoretic Framework for Generalization

    Authors: Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Daniel M. Roy

    Abstract: In this work, we investigate the expressiveness of the "conditional mutual information" (CMI) framework of Steinke and Zakynthinou (2020) and the prospect of using it to provide a unified framework for proving generalization bounds in the realizable setting. We first demonstrate that one can use this framework to express non-trivial (but sub-optimal) bounds for any learning algorithm that outputs…

    Submitted 17 November, 2021; v1 submitted 9 November, 2021; originally announced November 2021.

    Comments: 22 Pages, NeurIPS 2021, This submission subsumes [arXiv:2011.02970] ("On the Information Complexity of Proper Learners for VC Classes in the Realizable Case")

  33. arXiv:2111.00298

    cs.CV cs.LG

    A fast accurate fine-grain object detection model based on YOLOv4 deep neural network

    Authors: Arunabha M. Roy, Rikhi Bose, Jayabrata Bhaduri

    Abstract: Early identification and prevention of various plant diseases in commercial farms and orchards is a key feature of precision agriculture technology. This paper presents a high-performance real-time fine-grain object detection framework that addresses several obstacles in plant disease detection that hinder the performance of traditional methods, such as dense distribution, irregular morphology, m…

    Submitted 30 October, 2021; originally announced November 2021.

    MSC Class: 68T01; 68T05; 68T07; 68T10; 68T40; 68T45; 68U10; ACM Class: I.4.9; I.5.2; I.5.4; I.2.1; I.2.9; I.2.m; J.7

  34. arXiv:2110.14804

    stat.ML cs.LG

    Minimax Optimal Quantile and Semi-Adversarial Regret via Root-Logarithmic Regularizers

    Authors: Jeffrey Negrea, Blair Bilodeau, Nicolò Campolongo, Francesco Orabona, Daniel M. Roy

    Abstract: Quantile (and, more generally, KL) regret bounds, such as those achieved by NormalHedge (Chaudhuri, Freund, and Hsu 2009) and its variants, relax the goal of competing against the best individual expert to only competing against a majority of experts on adversarial data. More recently, the semi-adversarial paradigm (Bilodeau, Negrea, and Roy 2020) provides an alternative relaxation of adversarial…

    Submitted 7 November, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

    Comments: 30 pages, 2 figures. Jeffrey Negrea and Blair Bilodeau are equal-contribution authors. Updated citations

    Journal ref: NeurIPS 2021

  35. arXiv:2110.13967

    cs.DC

    Evaluating Serverless Architecture for Big Data Enterprise Applications

    Authors: Aimer Bhat, Madhumonti Roy, Heeki Park

    Abstract: In this paper, we investigate serverless computing for performing large-scale data processing with cloud-native primitives.

    Submitted 26 October, 2021; originally announced October 2021.

    Comments: 8 pages

    Journal ref: BDCAT 2021

  36. arXiv:2107.10653

    cs.RO cs.AI

    Dialogue Object Search

    Authors: Monica Roy, Kaiyu Zheng, Jason Liu, Stefanie Tellex

    Abstract: We envision robots that can collaborate and communicate seamlessly with humans. It is necessary for such robots to decide both what to say and how to act, while interacting with humans. To this end, we introduce a new task, dialogue object search: A robot is tasked to search for a target object (e.g. fork) in a human environment (e.g., kitchen), while engaging in a "video call" with a remote human…

    Submitted 22 July, 2021; originally announced July 2021.

    Comments: 3 pages, 1 figure. Robotics: Science and Systems (RSS) 2021 Workshop on Robotics for People (R4P): Perspectives on Interaction, Learning and Safety. Extended Abstract

  37. arXiv:2106.04013

    stat.ML cs.LG

    The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization

    Authors: Mufan Bill Li, Mihai Nica, Daniel M. Roy

    Abstract: Theoretical results show that neural networks can be approximated by Gaussian processes in the infinite-width limit. However, for fully connected networks, it has been previously shown that for any fixed network width, $n$, the Gaussian approximation gets worse as the network depth, $d$, increases. Given that modern networks are deep, this raises the question of how well modern architectures, like…

    Submitted 27 October, 2021; v1 submitted 7 June, 2021; originally announced June 2021.

  38. arXiv:2104.13818   

    cs.LG math.OC stat.ML

    NUQSGD: Provably Communication-efficient Data-parallel SGD via Nonuniform Quantization

    Authors: Ali Ramezani-Kebrya, Fartash Faghri, Ilya Markov, Vitalii Aksenov, Dan Alistarh, Daniel M. Roy

    Abstract: As the size and complexity of models and datasets grow, so does the need for communication-efficient variants of stochastic gradient descent that can be deployed to perform parallel model training. One popular communication-compression method for data-parallel SGD is QSGD (Alistarh et al., 2017), which quantizes and encodes gradients to reduce communication costs. The baseline variant of QSGD prov…

    Submitted 1 May, 2021; v1 submitted 28 April, 2021; originally announced April 2021.

    Comments: This entry is redundant and was created in error. See arXiv:1908.06077 for the latest version

  39. arXiv:2104.03252

    cs.AI

    Leaving Goals on the Pitch: Evaluating Decision Making in Soccer

    Authors: Maaike Van Roy, Pieter Robberechts, Wen-Chi Yang, Luc De Raedt, Jesse Davis

    Abstract: Analysis of the popular expected goals (xG) metric in soccer has determined that a (slightly) smaller number of high-quality attempts will likely yield more goals than a slew of low-quality ones. This observation has driven a change in shooting behavior. Teams are passing up on shots from outside the penalty box, in the hopes of generating a better shot closer to goal later on. This paper evaluate…

    Submitted 16 February, 2023; v1 submitted 7 April, 2021; originally announced April 2021.

    Comments: Add missing funding

    Journal ref: 2021 MIT Sloan Sports Analytics Conference

  40. arXiv:2102.00931

    cs.LG stat.ML

    Information-Theoretic Generalization Bounds for Stochastic Gradient Descent

    Authors: Gergely Neu, Gintare Karolina Dziugaite, Mahdi Haghifam, Daniel M. Roy

    Abstract: We study the generalization properties of the popular stochastic optimization method known as stochastic gradient descent (SGD) for optimizing general non-convex loss functions. Our main contribution is providing upper bounds on the generalization error that depend on local statistics of the stochastic gradients evaluated along the path of iterates calculated by SGD. The key factors our bounds dep…

    Submitted 15 August, 2021; v1 submitted 1 February, 2021; originally announced February 2021.

    Comments: COLT 2021

  41. arXiv:2101.10589

    eess.IV cs.CV stat.AP

    Glioblastoma Multiforme Patient Survival Prediction

    Authors: Snehal Rajput, Rupal Agravat, Mohendra Roy, Mehul S Raval

    Abstract: Glioblastoma Multiforme is a very aggressive type of brain tumor. Due to spatial and temporal intra-tissue inhomogeneity, location and the extent of the cancer tissue, it is difficult to detect and dissect the tumor regions. In this paper, we propose survival prognosis models using four regressors operating on handcrafted image-based and radiomics features. We hypothesize that the radiomics shape…

    Submitted 26 January, 2021; originally announced January 2021.

    Comments: 10 pages, 9 figures

    Journal ref: 2021 International Conference on Medical Imaging and Computer-Aided Diagnosis (MICAD 2021)

  42. arXiv:2012.07976  [pdf, other]

    cs.LG stat.ML

    NeurIPS 2020 Competition: Predicting Generalization in Deep Learning

    Authors: Yiding Jiang, Pierre Foret, Scott Yak, Daniel M. Roy, Hossein Mobahi, Gintare Karolina Dziugaite, Samy Bengio, Suriya Gunasekar, Isabelle Guyon, Behnam Neyshabur

    Abstract: Understanding generalization in deep learning is arguably one of the most important questions in deep learning. Deep learning has been successfully adopted to a large number of problems ranging from pattern recognition to complex decision making, but many recent researchers have raised many concerns about deep learning, among which the most important is generalization. Despite numerous attempts, c…

    Submitted 14 December, 2020; originally announced December 2020.

    Comments: 20 pages, 2 figures. Accepted for NeurIPS 2020 Competitions Track. Lead organizer: Yiding Jiang

  43. arXiv:2011.02970  [pdf, other]

    cs.LG cs.IT

    On the Information Complexity of Proper Learners for VC Classes in the Realizable Case

    Authors: Mahdi Haghifam, Gintare Karolina Dziugaite, Shay Moran, Daniel M. Roy

    Abstract: We provide a negative resolution to a conjecture of Steinke and Zakynthinou (2020a), by showing that their bound on the conditional mutual information (CMI) of proper learners of Vapnik--Chervonenkis (VC) classes cannot be improved from $d \log n +2$ to $O(d)$, where $n$ is the number of i.i.d. training examples. In fact, we exhibit VC classes for which the CMI of any proper learner cannot be boun…

    Submitted 5 November, 2020; originally announced November 2020.

    Comments: 5 Pages

  44. arXiv:2010.15110  [pdf, other]

    cs.LG stat.ML

    Deep learning versus kernel learning: an empirical study of loss landscape geometry and the time evolution of the Neural Tangent Kernel

    Authors: Stanislav Fort, Gintare Karolina Dziugaite, Mansheej Paul, Sepideh Kharaghani, Daniel M. Roy, Surya Ganguli

    Abstract: In suitably initialized wide networks, small learning rates transform deep neural networks (DNNs) into neural tangent kernel (NTK) machines, whose training dynamics is well-approximated by a linear weight expansion of the network at initialization. Standard training, however, diverges from its linearization in ways that are poorly understood. We study the relationship between the training dynamics…

    Submitted 28 October, 2020; originally announced October 2020.

    Comments: 19 pages, 19 figures, In Advances in Neural Information Processing Systems 34 (NeurIPS 2020)

  45. arXiv:2010.13764  [pdf, other]

    cs.LG stat.ML

    Enforcing Interpretability and its Statistical Impacts: Trade-offs between Accuracy and Interpretability

    Authors: Gintare Karolina Dziugaite, Shai Ben-David, Daniel M. Roy

    Abstract: To date, there has been no formal study of the statistical cost of interpretability in machine learning. As such, the discourse around potential trade-offs is often informal and misconceptions abound. In this work, we aim to initiate a formal study of these trade-offs. A seemingly insurmountable roadblock is the lack of any agreed upon definition of interpretability. Instead, we propose a shift in…

    Submitted 28 October, 2020; v1 submitted 26 October, 2020; originally announced October 2020.

    Comments: 12 pages; minor edits

  46. arXiv:2010.11924  [pdf, other]

    cs.LG stat.ML

    In Search of Robust Measures of Generalization

    Authors: Gintare Karolina Dziugaite, Alexandre Drouin, Brady Neal, Nitarshan Rajkumar, Ethan Caballero, Linbo Wang, Ioannis Mitliagkas, Daniel M. Roy

    Abstract: One of the principal scientific challenges in deep learning is explaining generalization, i.e., why the particular way the community now trains networks to achieve small training error also leads to small error on held-out data from the same population. It is widely appreciated that some worst-case theories -- such as those based on the VC dimension of the class of predictors induced by modern neu…

    Submitted 20 January, 2021; v1 submitted 22 October, 2020; originally announced October 2020.

    Comments: 27 pages, 11 figures, 34th Conference on Neural Information Processing Systems (NeurIPS 2020), Vancouver, Canada

  47. arXiv:2009.08576  [pdf, other]

    cs.LG cs.CV cs.NE stat.ML

    Pruning Neural Networks at Initialization: Why are We Missing the Mark?

    Authors: Jonathan Frankle, Gintare Karolina Dziugaite, Daniel M. Roy, Michael Carbin

    Abstract: Recent work has explored the possibility of pruning neural networks at initialization. We assess proposals for doing so: SNIP (Lee et al., 2019), GraSP (Wang et al., 2020), SynFlow (Tanaka et al., 2020), and magnitude pruning. Although these methods surpass the trivial baseline of random pruning, they remain below the accuracy of magnitude pruning after training, and we endeavor to understand why.…

    Submitted 21 March, 2021; v1 submitted 17 September, 2020; originally announced September 2020.

    Comments: Published in ICLR 2021

  48. arXiv:2008.09442  [pdf, ps, other]

    eess.SP cs.AR cs.LG

    ADIC: Anomaly Detection Integrated Circuit in 65nm CMOS utilizing Approximate Computing

    Authors: Bapi Kar, Pradeep Kumar Gopalakrishnan, Sumon Kumar Bose, Mohendra Roy, Arindam Basu

    Abstract: In this paper, we present a low-power anomaly detection integrated circuit (ADIC) based on a one-class classifier (OCC) neural network. The ADIC achieves low-power operation through a combination of (a) careful choice of algorithm for online learning and (b) approximate computing techniques to lower average energy. In particular, online pseudoinverse update method (OPIUM) is used to train a random…

    Submitted 21 August, 2020; originally announced August 2020.

    Comments: 12

  49. arXiv:2007.06552  [pdf, other]

    stat.ML cs.LG

    Relaxing the I.I.D. Assumption: Adaptively Minimax Optimal Regret via Root-Entropic Regularization

    Authors: Blair Bilodeau, Jeffrey Negrea, Daniel M. Roy

    Abstract: We consider prediction with expert advice when data are generated from distributions varying arbitrarily within an unknown constraint set. This semi-adversarial setting includes (at the extremes) the classical i.i.d. setting, when the unknown constraint set is restricted to be a singleton, and the unconstrained adversarial setting, when the constraint set is the set of all distributions. The Hedge…

    Submitted 21 July, 2022; v1 submitted 13 July, 2020; originally announced July 2020.

    Comments: 71 pages, 3 figures. Blair Bilodeau and Jeffrey Negrea are equal-contribution authors; order was determined randomly

  50. arXiv:2007.01160  [pdf, ps, other]

    cs.LG stat.ML

    Tight Bounds on Minimax Regret under Logarithmic Loss via Self-Concordance

    Authors: Blair Bilodeau, Dylan J. Foster, Daniel M. Roy

    Abstract: We consider the classical problem of sequential probability assignment under logarithmic loss while competing against an arbitrary, potentially nonparametric class of experts. We obtain tight bounds on the minimax regret via a new approach that exploits the self-concordance property of the logarithmic loss. We show that for any expert class with (sequential) metric entropy $\mathcal{O}(γ^{-p})$ at…

    Submitted 3 August, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

    Comments: 25 pages

    Journal ref: Proceedings of the 37th International Conference on Machine Learning, ICML 2020