
Optimal Strategies in Ranked Choice Voting

Authors: Sanyukta Deshpande, Nikhil Garg, Sheldon Jacobson

Abstract: Ranked Choice Voting (RCV) and Single Transferable Voting (STV) are widely valued but are complex to understand due to intricate per-round vote transfers. Questions like determining how far a candidate is from winning or identifying effective election strategies are computationally challenging, as minor changes in voter rankings can lead to significant ripple effects - for example, lending support to a losing candidate can prevent their votes from transferring to a more competitive opponent. We study optimal strategies - persuading voters to change their ballots or adding new voters - both algorithmically and theoretically. Algorithmically, we develop efficient methods to reduce election instances while maintaining optimization accuracy, effectively circumventing the computational complexity barrier. Theoretically, we analyze the effectiveness of strategies under both perfect and imperfect polling information. Our algorithmic approach applies to the ranked-choice polling data on the US 2024 Republican Primary, finding, for example, that several candidates would have been optimally served by boosting another candidate instead of themselves.

Submitted 18 July, 2024; v1 submitted 18 July, 2024; originally announced July 2024.

arXiv:2407.10732

Gaussian process regression + deep neural network autoencoder for probabilistic surrogate modeling in nonlinear mechanics of solids

Authors: Saurabh Deshpande, Hussein Rappel, Mark Hobbs, Stéphane P. A. Bordas, Jakub Lengiewicz

Abstract: Many real-world applications demand accurate and fast predictions, as well as reliable uncertainty estimates. However, quantifying uncertainty on high-dimensional predictions is still a severely under-invested problem, especially when input-output relationships are non-linear. To handle this problem, the present work introduces an innovative approach that combines autoencoder deep neural networks with the probabilistic regression capabilities of Gaussian processes. The autoencoder provides a low-dimensional representation of the solution space, while the Gaussian process is a Bayesian method that provides a probabilistic mapping between the low-dimensional inputs and outputs. We validate the proposed framework for its application to surrogate modeling of non-linear finite element simulations. Our findings highlight that the proposed framework is computationally efficient as well as accurate in predicting non-linear deformations of solid bodies subjected to external forces, all the while providing insightful uncertainty assessments.

Submitted 15 July, 2024; originally announced July 2024.

arXiv:2406.14442

Graph Representation Learning Strategies for Omics Data: A Case Study on Parkinson's Disease

Authors: Elisa Gómez de Lope, Saurabh Deshpande, Ramón Viñas Torné, Pietro Liò, Enrico Glaab, Stéphane P. A. Bordas

Abstract: Omics data analysis is crucial for studying complex diseases, but its high dimensionality and heterogeneity challenge classical statistical and machine learning methods. Graph neural networks have emerged as promising alternatives, yet the optimal strategies for their design and optimization in real-world biomedical challenges remain unclear. This study evaluates various graph representation learning models for case-control classification using high-throughput biological data from Parkinson's disease and control samples. We compare topologies derived from sample similarity networks and molecular interaction networks, including protein-protein and metabolite-metabolite interactions (PPI, MMI). Graph Convolutional Networks (GCNs), Chebyshev spectral graph convolution (ChebyNet), and Graph Attention Network (GAT) are evaluated alongside advanced architectures like graph transformers, the graph U-net, and simpler models like multilayer perceptron (MLP). These models are systematically applied to transcriptomics and metabolomics data independently. Our comparative analysis highlights the benefits and limitations of various architectures in extracting patterns from omics data, paving the way for more accurate and interpretable models in biomedical research.

Submitted 20 June, 2024; originally announced June 2024.

Comments: Submitted to Machine Learning in Computational Biology 2024 as an extended abstract, 2 pages + 1 appendix

arXiv:2405.02664

MedPromptExtract (Medical Data Extraction Tool): Anonymization and Hi-fidelity Automated data extraction using NLP and prompt engineering

Authors: Roomani Srivastava, Suraj Prasad, Lipika Bhat, Sarvesh Deshpande, Barnali Das, Kshitij Jadhav

Abstract: A major roadblock in the seamless digitization of medical records remains the lack of interoperability of existing records. Extracting relevant medical information required for further treatment planning or even research is a time-consuming, labour-intensive task involving expenditure of valuable time of doctors. In this demo paper, we present MedPromptExtract, an automated tool using a combination of semi-supervised learning, large language models, natural language processing and prompt engineering to convert unstructured medical records to structured data which is amenable to further analysis.

Submitted 6 June, 2024; v1 submitted 4 May, 2024; originally announced May 2024.

Comments: 4 pages, 3 figures, pre-print submitted to CIKM 2024

arXiv:2403.05749

Characterizing Flow Complexity in Transportation Networks using Graph Homology

Authors: Shashank A Deshpande, Hamsa Balakrishnan

Abstract: Series-parallel network topologies generally exhibit simplified dynamical behavior and avoid high combinatorial complexity. A comprehensive analysis of how flow complexity emerges with a graph's deviation from series-parallel topology is therefore of fundamental interest. We introduce the notion of a robust $k$-path on a directed acyclic graph, with increasing values of the length $k$ reflecting increasing deviations. We propose a graph homology with robust $k$-paths as the bases of its chain spaces. In this framework, the topological simplicity of series-parallel graphs translates into a triviality of higher-order chain spaces. We discuss a correspondence between the space of order-three chains and sites within the network that are susceptible to the Braess paradox, a well-known phenomenon in transportation networks. In this manner, we illustrate the utility of the proposed graph homology in systematically studying the complexity of flow networks.

Submitted 8 March, 2024; originally announced March 2024.

Comments: 7 pages, 3 figures, letter

arXiv:2401.16914

Energy-conserving equivariant GNN for elasticity of lattice architected metamaterials

Authors: Ivan Grega, Ilyes Batatia, Gábor Csányi, Sri Karlapati, Vikram S. Deshpande

Abstract: Lattices are architected metamaterials whose properties strongly depend on their geometrical design. The analogy between lattices and graphs enables the use of graph neural networks (GNNs) as a faster surrogate model compared to traditional methods such as finite element modelling. In this work, we generate a big dataset of structure-property relationships for strut-based lattices. The dataset is made available to the community which can fuel the development of methods anchored in physical principles for the fitting of fourth-order tensors. In addition, we present a higher-order GNN model trained on this dataset. The key features of the model are (i) SE(3) equivariance, and (ii) consistency with the thermodynamic law of conservation of energy. We compare the model to non-equivariant models based on a number of error metrics and demonstrate its benefits in terms of predictive performance and reduced training requirements. Finally, we demonstrate an example application of the model to an architected material design task. The methods which we developed are applicable to fourth-order tensors beyond elasticity such as piezo-optical tensor etc.

Submitted 20 March, 2024; v1 submitted 30 January, 2024; originally announced January 2024.

Comments: International Conference on Learning Representations 2024

arXiv:2309.03812

AnthroNet: Conditional Generation of Humans via Anthropometrics

Authors: Francesco Picetti, Shrinath Deshpande, Jonathan Leban, Soroosh Shahtalebi, Jay Patel, Peifeng Jing, Chunpu Wang, Charles Metze III, Cameron Sun, Cera Laidlaw, James Warren, Kathy Huynh, River Page, Jonathan Hogins, Adam Crespi, Sujoy Ganguly, Salehe Erfanian Ebadi

Abstract: We present a novel human body model formulated by an extensive set of anthropocentric measurements, which is capable of generating a wide range of human body shapes and poses. The proposed model enables direct modeling of specific human identities through a deep generative architecture, which can produce humans in any arbitrary pose. It is the first of its kind to have been trained end-to-end using only synthetically generated data, which not only provides highly accurate human mesh representations but also allows for precise anthropometry of the body. Moreover, using a highly diverse animation library, we articulated our synthetic humans' body and hands to maximize the diversity of the learnable priors for model training. Our model was trained on a dataset of $100k$ procedurally-generated posed human meshes and their corresponding anthropometric measurements. Our synthetic data generator can be used to generate millions of unique human identities and poses for non-commercial academic research purposes.

Submitted 7 September, 2023; originally announced September 2023.

Comments: AnthroNet's Unity data generator source code is available at: https://unity-technologies.github.io/AnthroNet/

arXiv:2308.07414

Votemandering: Strategies and Fairness in Political Redistricting

Authors: Sanyukta Deshpande, Ian G Ludden, Sheldon H Jacobson

Abstract: Gerrymandering, the deliberate manipulation of electoral district boundaries for political advantage, is a persistent issue in U.S. redistricting cycles. This paper introduces and analyzes a new phenomenon, 'votemandering'- a strategic blend of gerrymandering and targeted political campaigning, devised to gain more seats by circumventing fairness measures. It leverages accurate demographic and socio-political data to influence voter decisions, bolstered by advancements in technology and data analytics, and executes better-informed redistricting. Votemandering is established as a Mixed Integer Program (MIP) that performs fairness-constrained gerrymandering over multiple election rounds, via unit-specific variables for campaigns. To combat votemandering, we present a computationally efficient heuristic for creating and testing district maps that more robustly preserve voter preferences. We analyze the influence of various redistricting constraints and parameters on votemandering efficacy. We explore the interconnectedness of gerrymandering, substantial campaign budgets, and strategic campaigning, illustrating their collective potential to generate biased electoral maps. A Wisconsin State Senate redistricting case study substantiates our findings on real data, demonstrating how major parties can secure additional seats through votemandering. Our findings underscore the practical implications of these manipulations, stressing the need for informed policy and regulation to safeguard democratic processes.

Submitted 15 August, 2023; v1 submitted 14 August, 2023; originally announced August 2023.

arXiv:2308.03897

Hardware Architecture for a Quantum Computer Trusted Execution Environment

Authors: Theodoros Trochatos, Chuanqi Xu, Sanjay Deshpande, Yao Lu, Yongshan Ding, Jakub Szefer

Abstract: The cloud-based environments in which today's and future quantum computers will operate raise concerns about the security and privacy of users' intellectual property. Quantum circuits submitted to cloud-based quantum computer providers represent sensitive or proprietary algorithms developed by users that need protection. Further, input data is hard-coded into the circuits, and leakage of the circuits can expose users' data. To help protect users' circuits and data from possibly malicious quantum computer cloud providers, this work presents the first hardware architecture for a trusted execution environment for quantum computers. To protect the user's circuits and data, the quantum computer control pulses are obfuscated with decoy control pulses. While digital data can be encrypted, analog control pulses cannot, and this paper proposes the novel decoy pulse approach to obfuscate the analog control pulses. The proposed decoy pulses can easily be added to the software by users. Meanwhile, the hardware components of the architecture proposed in this paper take care of eliminating, i.e. attenuating, the decoy pulses inside the superconducting quantum computer's dilution refrigerator before they reach the qubits. The hardware architecture also contains tamper-resistant features to protect the trusted hardware and users' information. The work leverages a new metric of variational distance to analyze the impact and scalability of hardware protection.
The variational distance of the circuits protected with our scheme, compared to unprotected circuits, is in the range of only $0.16$ to $0.26$. This work demonstrates that protection from possibly malicious cloud providers is feasible and all the hardware components needed for the proposed architecture are available today.

Submitted 7 August, 2023; originally announced August 2023.

arXiv:2306.00382

Calibrated and Conformal Propensity Scores for Causal Effect Estimation

Authors: Shachi Deshpande, Volodymyr Kuleshov

Abstract: Propensity scores are commonly used to estimate treatment effects from observational data. We argue that the probabilistic output of a learned propensity score model should be calibrated -- i.e., a predictive treatment probability of 90% should correspond to 90% of individuals being assigned the treatment group -- and we propose simple recalibration techniques to ensure this property. We prove that calibration is a necessary condition for unbiased treatment effect estimation when using popular inverse propensity weighted and doubly robust estimators. We derive error bounds on causal effect estimates that directly relate to the quality of uncertainties provided by the probabilistic propensity score model and show that calibration strictly improves this error bound while also avoiding extreme propensity weights. We demonstrate improved causal effect estimation with calibrated propensity scores in several tasks including high-dimensional image covariates and genome-wide association studies (GWASs). Calibrated propensity scores improve the speed of GWAS analysis by more than two-fold by enabling the use of simpler models that are faster to train.

Submitted 4 June, 2024; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 23 pages, 3 figures

ACM Class: I.2.m

arXiv:2305.05006

Synthesis of Annotated Colorectal Cancer Tissue Images from Gland Layout

Authors: Srijay Deshpande, Fayyaz Minhas, Nasir Rajpoot

Abstract: Generating realistic tissue images with annotations is a challenging task that is important in many computational histopathology applications. Synthetically generated images and annotations are valuable for training and evaluating algorithms in this domain. To address this, we propose an interactive framework generating pairs of realistic colorectal cancer histology images with corresponding glandular masks from glandular structure layouts. The framework accurately captures vital features like stroma, goblet cells, and glandular lumen. Users can control gland appearance by adjusting parameters such as the number of glands, their locations, and sizes. The generated images exhibit good Frechet Inception Distance (FID) scores compared to the state-of-the-art image-to-image translation model. Additionally, we demonstrate the utility of our synthetic annotations for evaluating gland segmentation algorithms. Furthermore, we present a methodology for constructing glandular masks using advanced deep generative models, such as latent diffusion models. These masks enable tissue image generation through a residual encoder-decoder network.

Submitted 4 April, 2024; v1 submitted 8 May, 2023; originally announced May 2023.

arXiv:2304.06122

Analyzing ChatGPT's Aptitude in an Introductory Computer Engineering Course

Authors: Sanjay Deshpande, Jakub Szefer

Abstract: ChatGPT has recently gathered attention from the general public and academia as a tool that is able to generate plausible and human-sounding text answers to various questions. One potential use, or abuse, of ChatGPT is in answering various questions or even generating whole essays and research papers in an academic or classroom setting. While recent works have explored the use of ChatGPT in the context of humanities, business school, or medical school, this work explores how ChatGPT performs in the context of an introductory computer engineering course. This work assesses ChatGPT's aptitude in answering quizzes, homework, exam, and laboratory questions in an introductory-level computer engineering course. This work finds that ChatGPT can do well on questions asking about generic concepts. However, predictably, as a text-only tool, it cannot handle questions with diagrams or figures, nor can it generate diagrams and figures. Further, also clearly, the tool cannot do hands-on lab experiments, breadboard assembly, etc., but can generate plausible answers to some laboratory manual questions. One of the key observations presented in this work is that the ChatGPT tool could not be used to pass all components of the course. Nevertheless, it does well on quizzes and short-answer questions. On the other hand, plausible, human-sounding answers could confuse students when generating incorrect but still plausible answers.

Submitted 14 April, 2023; v1 submitted 13 March, 2023; originally announced April 2023.

Comments: 5 pages

arXiv:2303.06274

CoNIC Challenge: Pushing the Frontiers of Nuclear Detection, Segmentation, Classification and Counting

Authors: Simon Graham, Quoc Dang Vu, Mostafa Jahanifar, Martin Weigert, Uwe Schmidt, Wenhua Zhang, Jun Zhang, Sen Yang, Jinxi Xiang, Xiyue Wang, Josef Lorenz Rumberger, Elias Baumann, Peter Hirsch, Lihao Liu, Chenyang Hong, Angelica I. Aviles-Rivero, Ayushi Jain, Heeyoung Ahn, Yiyu Hong, Hussam Azzuni, Min Xu, Mohammad Yaqub, Marie-Claire Blache, Benoît Piégu, Bertrand Vernay , et al. (64 additional authors not shown)

Abstract: Nuclear detection, segmentation and morphometric profiling are essential in helping us further understand the relationship between histology and patient outcome. To drive innovation in this area, we set up a community-wide challenge using the largest available dataset of its kind to assess nuclear segmentation and cellular composition. Our challenge, named CoNIC, stimulated the development of reproducible algorithms for cellular recognition with real-time result inspection on public leaderboards. We conducted an extensive post-challenge analysis based on the top-performing models using 1,658 whole-slide images of colon tissue. With around 700 million detected nuclei per model, associated features were used for dysplasia grading and survival analysis, where we demonstrated that the challenge's improvement over the previous state-of-the-art led to significant boosts in downstream performance. Our findings also suggest that eosinophils and neutrophils play an important role in the tumour microenvironment. We release challenge models and WSI-level results to foster the development of further methods for biomarker discovery.

Submitted 14 March, 2023; v1 submitted 10 March, 2023; originally announced March 2023.

arXiv:2302.12196

Calibrated Regression Against An Adversary Without Regret

Authors: Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Abstract: We are interested in probabilistic prediction in online settings in which data does not follow a probability distribution. Our work seeks to achieve two goals: (1) producing valid probabilities that accurately reflect model confidence; and (2) ensuring that traditional notions of performance (e.g., high accuracy) still hold. We introduce online algorithms guaranteed to achieve these goals on arbitrary streams of data points, including data chosen by an adversary. Specifically, our algorithms produce forecasts that are (1) calibrated -- i.e., an 80% confidence interval contains the true outcome 80% of the time -- and (2) have low regret relative to a user-specified baseline model. We implement a post-hoc recalibration strategy that provably achieves these goals in regression; previous algorithms applied to classification or achieved (1) but not (2). In the context of Bayesian optimization, an online model-based decision-making task in which the data distribution shifts over time, our method yields accelerated convergence to improved optima.

Submitted 4 June, 2024; v1 submitted 23 February, 2023; originally announced February 2023.

arXiv:2212.13780

SynCLay: Interactive Synthesis of Histology Images from Bespoke Cellular Layouts

Authors: Srijay Deshpande, Muhammad Dawood, Fayyaz Minhas, Nasir Rajpoot

Abstract: Automated synthesis of histology images has several potential applications in computational pathology. However, no existing method can generate realistic tissue images with a bespoke cellular layout or user-defined histology parameters. In this work, we propose a novel framework called SynCLay (Synthesis from Cellular Layouts) that can construct realistic and high-quality histology images from user-defined cellular layouts along with annotated cellular boundaries. Tissue image generation based on bespoke cellular layouts through the proposed framework allows users to generate different histological patterns from arbitrary topological arrangement of different types of cells. SynCLay-generated synthetic images can be helpful in studying the role of different types of cells present in the tumor microenvironment. Additionally, they can assist in balancing the distribution of cellular counts in tissue images for designing accurate cellular composition predictors by minimizing the effects of data imbalance. We train SynCLay in an adversarial manner and integrate a nuclear segmentation and classification model in its training to refine nuclear structures and generate nuclear masks in conjunction with synthetic images. During inference, we combine the model with another parametric model for generating colon images and associated cellular counts as annotations given the grade of differentiation and cell densities of different cells.
We assess the generated images quantitatively and report on feedback from trained pathologists who assigned realism scores to a set of images generated by the framework. The average realism score across all pathologists for synthetic images was as high as that for the real images. We also show that augmenting limited real data with the synthetic data generated by our framework can significantly boost prediction performance of the cellular composition prediction task.

Submitted 28 December, 2022; originally announced December 2022.

arXiv:2212.01386

doi 10.3389/fmats.2023.1128954

Convolution, aggregation and attention based deep neural networks for accelerating simulations in mechanics

Authors: Saurabh Deshpande, Raúl I. Sosa, Stéphane P. A. Bordas, Jakub Lengiewicz

Abstract: Deep learning surrogate models are being increasingly used in accelerating scientific simulations as a replacement for costly conventional numerical techniques. However, their use remains a significant challenge when dealing with real-world complex examples. In this work, we demonstrate three types of neural network architectures for efficient learning of highly non-linear deformations of solid bodies. The first two architectures are based on the recently proposed CNN U-NET and MAgNET (graph U-NET) frameworks which have shown promising performance for learning on mesh-based data. The third architecture is Perceiver IO, a very recent architecture that belongs to the family of attention-based neural networks--a class that has revolutionised diverse engineering fields and is still unexplored in computational mechanics. We study and compare the performance of all three networks on two benchmark examples, and show their capabilities to accurately predict the non-linear mechanical responses of soft bodies.

Submitted 24 March, 2023; v1 submitted 1 December, 2022; originally announced December 2022.

Journal ref: Front. Mater. 10:1128954

arXiv:2212.00219

Are you using test log-likelihood correctly?

Authors: Sameer K. Deshpande, Soumya Ghosh, Tin D. Nguyen, Tamara Broderick

Abstract: Test log-likelihood is commonly used to compare different models of the same data or different approximate inference algorithms for fitting the same probabilistic model. We present simple examples demonstrating how comparisons based on test log-likelihood can contradict comparisons according to other objectives. Specifically, our examples show that (i) approximate Bayesian inference algorithms that attain higher test log-likelihoods need not also yield more accurate posterior approximations and (ii) conclusions about forecast accuracy based on test log-likelihood comparisons may not agree with conclusions based on root mean squared error.

Submitted 18 January, 2024; v1 submitted 30 November, 2022; originally announced December 2022.

Comments: Presented at the ICBINB Workshop at NeurIPS 2022. This version accepted at TMLR, available at https://openreview.net/forum?id=n2YifD4Dxo

arXiv:2211.00713 [pdf, other]

doi 10.1016/j.engappai.2024.108055

MAgNET: A Graph U-Net Architecture for Mesh-Based Simulations

Authors: Saurabh Deshpande, Stéphane P. A. Bordas, Jakub Lengiewicz

Abstract: In many cutting-edge applications, high-fidelity computational models prove to be too slow for practical use and are therefore replaced by much faster surrogate models. Recently, deep learning techniques have increasingly been utilized to accelerate such predictions. To enable learning on large-dimensional and complex data, specific neural network architectures have been developed, including convolutional and graph neural networks. In this work, we present a novel encoder-decoder geometric deep learning framework called MAgNET, which extends the well-known convolutional neural networks to accommodate arbitrary graph-structured data. MAgNET consists of innovative Multichannel Aggregation (MAg) layers and graph pooling/unpooling layers, forming a graph U-Net architecture that is analogous to convolutional U-Nets. We demonstrate the predictive capabilities of MAgNET in surrogate modeling for non-linear finite element simulations in the mechanics of solids.

Submitted 2 April, 2024; v1 submitted 1 November, 2022; originally announced November 2022.

Journal ref: Engineering Applications of Artificial Intelligence, Volume 133, Part B, 2024, 108055

arXiv:2207.05016 [pdf, other]

Capacity Management in a Pandemic with Endogenous Patient Choices and Flows

Authors: Sanyukta Deshpande, Lavanya Marla, Alan Scheller-Wolf, Siddharth Prakash Singh

Abstract: Motivated by the experiences of a healthcare service provider during the COVID-19 pandemic, we aim to study the decisions of a provider that operates both an Emergency Department (ED) and a medical Clinic. Patients contact the provider through a phone call or may present directly at the ED: patients can be COVID (suspected/confirmed) or non-COVID, and have different severities. Depending on the severity, patients who contact the provider may be directed to the ED (to be seen in a few hours), be offered an appointment at the Clinic (to be seen in a few days), or be treated via phone or telemedicine, avoiding a visit to a facility. All patients make joining decisions by comparing their own risk perceptions with their anticipated benefits: they then choose to enter a facility only if it is beneficial enough. Also, after initial contact, their severities may evolve, which may change their decision. The hospital system's objective is to allocate service capacity across facilities so as to minimize costs from patient deaths or defections. We model the system using a fluid approximation over multiple periods, possibly with different demand profiles. While the feasible space for this problem can be extremely complex, it is amenable to decomposition into different sub-regions that can be analyzed individually, and the global optimal solution can be reached via provably parsimonious computational methods over a single period and over multiple periods with different demand rates. Our analytical and computational results indicate that endogeneity results in non-trivial and non-intuitive capacity allocations that do not always prioritize high severity patients, for both single and multi-period settings.

Submitted 11 July, 2022; originally announced July 2022.

arXiv:2205.02932 [pdf, other]

doi 10.1109/IGARSS46834.2022.9883890

Understanding Urban Water Consumption using Remotely Sensed Data

Authors: Shaswat Mohanty, Anirudh Vijay, Shailesh Deshpande

Abstract: Urban metabolism is an active field of research that deals with the estimation of emissions and resource consumption from urban regions. The analysis could be carried out through a manual survey or by the implementation of elegant machine learning algorithms. In this exploratory work, we estimate the water consumption by the buildings in the region captured by satellite imagery. To this end, we break our analysis into three parts: i) identification of building pixels, given a satellite image, followed by ii) identification of the building type (residential/non-residential) from the building pixels, and finally iii) using the building pixels along with their type to estimate the water consumption using the average per unit area consumption for different building types as obtained from municipal surveys.

Submitted 5 January, 2023; v1 submitted 3 May, 2022; originally announced May 2022.

Comments: 4 pages, 2 figures, IEEE Conference Proceedings (IGARSS 2022)

arXiv:2204.08491 [pdf, other]

Active Learning Helps Pretrained Models Learn the Intended Task

Authors: Alex Tamkin, Dat Nguyen, Salil Deshpande, Jesse Mu, Noah Goodman

Abstract: Models can fail in unpredictable ways during deployment due to task ambiguity, when multiple behaviors are consistent with the provided training data. An example is an object classifier trained on red squares and blue circles: when encountering blue squares, the intended behavior is undefined. We investigate whether pretrained models are better active learners, capable of disambiguating between the possible tasks a user may be trying to specify. Intriguingly, we find that better active learning is an emergent property of the pretraining process: pretrained models require up to 5 times fewer labels when using uncertainty-based active learning, while non-pretrained models see no or even negative benefit. We find these gains come from an ability to select examples with attributes that disambiguate the intended behavior, such as rare product categories or atypical backgrounds. These attributes are far more linearly separable in the representation spaces of pretrained models than in those of non-pretrained models, suggesting a possible mechanism for this behavior.

Submitted 18 April, 2022; originally announced April 2022.

arXiv:2203.09672 [pdf, other]

Deep Multi-Modal Structural Equations For Causal Effect Estimation With Unstructured Proxies

Authors: Shachi Deshpande, Kaiwen Wang, Dhruv Sreenivas, Zheng Li, Volodymyr Kuleshov

Abstract: Estimating the effect of intervention from observational data while accounting for confounding variables is a key task in causal inference. Oftentimes, the confounders are unobserved, but we have access to large amounts of additional unstructured data (images, text) that contain valuable proxy signal about the missing confounders. This paper argues that leveraging this unstructured data can greatly improve the accuracy of causal effect estimation. Specifically, we introduce deep multi-modal structural equations, a generative model for causal effect estimation in which confounders are latent variables and unstructured data are proxy variables. This model supports multiple multi-modal proxies (images, text) as well as missing data. We empirically demonstrate that our approach outperforms existing methods based on propensity scores and corrects for confounding using unstructured inputs on tasks in genomics and healthcare. Our methods can potentially support the use of large amounts of data that were previously not used in causal inference.

Submitted 11 December, 2022; v1 submitted 17 March, 2022; originally announced March 2022.

Comments: NeurIPS 2022 (accepted version)

arXiv:2203.02649 [pdf, other]

Towards an Antivirus for Quantum Computers

Authors: Sanjay Deshpande, Chuanqi Xu, Theodoros Trochatos, Yongshan Ding, Jakub Szefer

Abstract: Researchers are today exploring models for cloud-based usage of quantum computers where multi-tenancy can be used to share quantum computer hardware among multiple users. Multi-tenancy holds the promise of allowing better utilization of the quantum computer hardware, but also opens up the quantum computer to new types of security attacks. As this and other recent research shows, it is possible to perform a fault injection attack using crosstalk on quantum computers when victim and attacker circuits are instantiated as co-tenants on the same quantum computer. To ensure such attacks do not happen, this paper proposes that new techniques should be developed to help catch malicious circuits before they are loaded onto quantum computer hardware. Following ideas from classical computers, a compile-time technique can be designed to scan quantum computer programs for malicious or suspicious code patterns before they are compiled into quantum circuits that run on a quantum computer. This paper presents ongoing work which demonstrates how crosstalk can affect Grover's algorithm, and then presents suggestions of how quantum programs could be analyzed to catch circuits that generate large amounts of crosstalk with malicious intent.

Submitted 4 March, 2022; originally announced March 2022.

Comments: 4 pages, 5 figures, HOST 2022 author version

arXiv:2203.02510 [pdf, ps, other]

Cellular Segmentation and Composition in Routine Histology Images using Deep Learning

Authors: Muhammad Dawood, Raja Muhammad Saad Bashir, Srijay Deshpande, Manahil Raza, Adam Shephard

Abstract: Identification and quantification of nuclei in colorectal cancer haematoxylin & eosin (H&E) stained histology images is crucial to prognosis and patient management. In computational pathology these tasks are referred to as nuclear segmentation, classification and composition and are used to extract meaningful interpretable cytological and architectural features for downstream analysis. The CoNIC challenge poses the task of automated nuclei segmentation, classification and composition into six different types of nuclei from the largest publicly known nuclei dataset - Lizard. In this regard, we have developed pipelines for the prediction of nuclei segmentation using HoVer-Net and ALBRT for cellular composition. When evaluated on the preliminary test set, HoVer-Net achieved a PQ of 0.58, a PQ+ of 0.58 and finally a mPQ+ of 0.35. For the prediction of cellular composition with ALBRT on the preliminary test set, we achieved an overall $R^2$ score of 0.53, consisting of 0.84 for lymphocytes, 0.70 for epithelial cells, 0.70 for plasma and 0.060 for eosinophils.

Submitted 4 March, 2022; originally announced March 2022.

arXiv:2203.01183 [pdf]

doi 10.1109/CSCN53733.2021.9686150

Omnidirectional MediA Format (OMAF): Toolbox for Virtual Reality Services

Authors: Sachin Deshpande, Miska M. Hannuksela

Abstract: This paper provides an overview of the Omnidirectional Media Format (OMAF) standard, second edition, which has been recently finalized. OMAF specifies the media format for coding, storage, delivery, and rendering of omnidirectional media, including video, audio, images, and timed text. Additionally, OMAF supports multiple viewpoints corresponding to omnidirectional cameras and overlay images or video rendered over the omnidirectional background image or video. Many examples of usage scenarios for multiple viewpoints and overlays are described in the paper. OMAF provides a toolbox of features, which can be selectively used in virtual reality services. Consequently, the paper presents the interoperability points specified in the OMAF standard, which enable signaling which OMAF features are in use or required to be supported in implementations. Finally, the paper summarizes which OMAF interoperability points have been taken into use in virtual reality service specifications by the 3rd Generation Partnership Project (3GPP) and the Virtual Reality Industry Forum (VRIF).

Submitted 2 March, 2022; originally announced March 2022.

Comments: 7 pages, 1 figure. This document is the accepted version of the paper that has been published in 2021 IEEE Conference on Standards for Communications and Networking (CSCN)

Journal ref: 2021 IEEE Conference on Standards for Communications and Networking (CSCN), 2021, pp. 20-25

arXiv:2112.07184 [pdf, other]

Calibrated and Sharp Uncertainties in Deep Learning via Density Estimation

Authors: Volodymyr Kuleshov, Shachi Deshpande

Abstract: Accurate probabilistic predictions can be characterized by two properties -- calibration and sharpness. However, standard maximum likelihood training yields models that are poorly calibrated and thus inaccurate -- a 90% confidence interval typically does not contain the true outcome 90% of the time. This paper argues that calibration is important in practice and is easy to maintain by performing low-dimensional density estimation. We introduce a simple training procedure based on recalibration that yields calibrated models without sacrificing overall performance; unlike previous approaches, ours ensures the most general property of distribution calibration and applies to any model, including neural networks. We formally prove the correctness of our procedure assuming that we can estimate densities in low dimensions and we establish uniform convergence bounds. Our results yield empirical performance improvements on linear and deep Bayesian models and suggest that calibration should be increasingly leveraged across machine learning.

Submitted 19 September, 2022; v1 submitted 14 December, 2021; originally announced December 2021.

ACM Class: I.2; I.5

arXiv:2112.04620 [pdf, other]

Online Calibrated and Conformal Prediction Improves Bayesian Optimization

Authors: Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Abstract: Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration -- i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.

Submitted 25 June, 2024; v1 submitted 8 December, 2021; originally announced December 2021.

ACM Class: I.2; I.5

Journal ref: Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, May 2024; PMLR 238:1450-1458

arXiv:2111.01867 [pdf, other]

doi 10.1016/j.cma.2022.115307

Probabilistic Deep Learning for Real-Time Large Deformation Simulations

Authors: Saurabh Deshpande, Jakub Lengiewicz, Stéphane P. A. Bordas

Abstract: For many novel applications, such as patient-specific computer-aided surgery, conventional solution techniques of the underlying nonlinear problems are usually computationally too expensive and lack information about how certain we can be about their predictions. In the present work, we propose a highly efficient deep-learning surrogate framework that is able to accurately predict the response of bodies undergoing large deformations in real-time. The surrogate model has a convolutional neural network architecture, called U-Net, which is trained with force-displacement data obtained with the finite element method. We propose deterministic and probabilistic versions of the framework. The probabilistic framework utilizes the Variational Bayes Inference approach and is able to capture all the uncertainties present in the data as well as in the deep-learning model. Based on several benchmark examples, we show the predictive capabilities of the framework and discuss its possible limitations.

Submitted 4 July, 2022; v1 submitted 2 November, 2021; originally announced November 2021.

Journal ref: Computer Methods in Applied Mechanics and Engineering, 2022, Volume 398

arXiv:2109.09248 [pdf, other]

Wages and Utilities in a Closed Economy

Authors: Sanyukta Deshpande, Milind A. Sohoni

Abstract: The broad objective of this paper is to propose a mathematical model for the study of causes of wage inequality and relate it to choices of consumption, the technologies of production, and the composition of labor in an economy. The paper constructs a Simple Closed Model (SCM) for closed economies, in which the consumption and the production parts are clearly separated and yet coupled. The model is established as a specialization of the Arrow-Debreu model and its equilibria correspond directly with those of the general Arrow-Debreu model. The formulation allows us to identify the combinatorial data which link parameters of the economic system with its equilibria, in particular, the impact of consumer preferences on wages. The SCM also allows the formulation and explicit construction of the consumer choice game, where expressed utilities of various labor classes serve as strategies with total or relative wages as the pay-offs. We illustrate, through examples, the mathematical details of the consumer choice game. We show that consumer preferences, expressed through modified utility functions, do indeed percolate through the economy, and influence not only prices but also production and wages. Thus, consumer choice may serve as an effective tool for wage redistribution.

Submitted 17 August, 2023; v1 submitted 19 September, 2021; originally announced September 2021.

arXiv:2108.07031 [pdf, other]

doi 10.1109/HiPC56025.2022.00031

On the performance of GPU accelerated q-LSKUM based meshfree solvers in Fortran, C++, Python, and Julia

Authors: Nischay Ram Mamidi, Kumar Prasun, Dhruv Saxena, Anil Nemili, Bharatkumar Sharma, S. M. Deshpande

Abstract: This report presents a comprehensive analysis of the performance of GPU accelerated meshfree CFD solvers for two-dimensional compressible flows in Fortran, C++, Python, and Julia. The programming model CUDA is used to develop the GPU codes. The meshfree solver is based on the least squares kinetic upwind method with entropy variables (q-LSKUM). To assess the computational efficiency of the GPU solvers and to compare their relative performance, benchmark calculations are performed on seven levels of point distribution. To analyse the difference in their run-times, the computationally intensive kernel is profiled. Various performance metrics are investigated from the profiled data to determine the cause of observed variation in run-times. To address some of the performance related issues, various optimisation strategies are employed. The optimised GPU codes are compared with the naive codes, and conclusions are drawn from their performance.

Submitted 16 August, 2021; originally announced August 2021.

Comments: 42 pages, 3 figures

ACM Class: D.3.0; J.2

arXiv:2106.06510 [pdf, other]

Measuring the robustness of Gaussian processes to kernel choice

Authors: William T. Stephenson, Soumya Ghosh, Tin D. Nguyen, Mikhail Yurochkin, Sameer K. Deshpande, Tamara Broderick

Abstract: Gaussian processes (GPs) are used to make medical and scientific decisions, including in cardiac care and monitoring of atmospheric carbon dioxide levels. Notably, the choice of GP kernel is often somewhat arbitrary. In particular, uncountably many kernels typically align with qualitative prior knowledge (e.g., function smoothness or stationarity). But in practice, data analysts choose among a handful of convenient standard kernels (e.g., squared exponential). In the present work, we ask: Would decisions made with a GP differ under other, qualitatively interchangeable kernels? We show how to answer this question by solving a constrained optimization problem over a finite-dimensional space. We can then use standard optimizers to identify substantive changes in relevant decisions made with a GP. We demonstrate in both synthetic and real-world examples that decisions made with a GP can exhibit non-robustness to kernel choice, even when prior draws are qualitatively interchangeable to a user.

Submitted 12 March, 2022; v1 submitted 11 June, 2021; originally announced June 2021.

Comments: AISTATS 2022

arXiv:2008.07331 [pdf, other]

Interactive Visualization for Debugging RL

Authors: Shuby Deshpande, Benjamin Eysenbach, Jeff Schneider

Abstract: Visualization tools for supervised learning allow users to interpret, introspect, and gain an intuition for the successes and failures of their models. While reinforcement learning practitioners ask many of the same questions, existing tools are not applicable to the RL setting as these tools address challenges typically found in the supervised learning regime. In this work, we design and implement an interactive visualization tool for debugging and interpreting RL algorithms. Our system addresses many features missing from previous tools, such as (1) tools for supervised learning often are not interactive; (2) while debugging RL policies, researchers use state representations that are different from those seen by the agent; (3) a framework designed to make debugging RL policies more conducive. We provide an example workflow of how this system could be used, along with ideas for future extensions.

Submitted 18 August, 2020; v1 submitted 14 August, 2020; originally announced August 2020.

Comments: Builds on preliminary work presented at ICML 2020 (WHI) arXiv:2007.05577. An interactive demo of the system can be found at https://tinyurl.com/y5gv5t4m

arXiv:2008.04526 [pdf, other]

SAFRON: Stitching Across the Frontier for Generating Colorectal Cancer Histology Images

Authors: Srijay Deshpande, Fayyaz Minhas, Simon Graham, Nasir Rajpoot

Abstract: Synthetic images can be used for the development and evaluation of deep learning algorithms in the context of limited availability of data. In the field of computational pathology, where histology images are large in size and visual context is crucial, synthesis of large high resolution images via generative modeling is a challenging task. This is due to memory and computational constraints hindering the generation of large images. To address this challenge, we propose a novel SAFRON (Stitching Across the FRONtiers) framework to construct realistic, large high resolution tissue image tiles from ground truth annotations while preserving morphological features and with minimal boundary artifacts. We show that the proposed method can generate realistic image tiles of arbitrarily large size after training it on relatively small image patches. We demonstrate that our model can generate high quality images, both visually and in terms of the Frechet Inception Distance. Compared to other existing approaches, our framework is efficient in terms of the memory requirements for training and also in terms of the number of computations to construct a large high-resolution image. We also show that training on synthetic data generated by SAFRON can significantly boost the performance of a state-of-the-art algorithm for gland segmentation in colorectal cancer histology images. Sample high resolution images generated using SAFRON are available at the URL: https://warwick.ac.uk/TIALab/SAFRON

Submitted 26 March, 2021; v1 submitted 11 August, 2020; originally announced August 2020.

arXiv:2007.09186 [pdf, other]

AWS CORD-19 Search: A Neural Search Engine for COVID-19 Literature

Authors: Parminder Bhatia, Lan Liu, Kristjan Arumae, Nima Pourdamghani, Suyog Deshpande, Ben Snively, Mona Mona, Colby Wise, George Price, Shyam Ramaswamy, Xiaofei Ma, Ramesh Nallapati, Zhiheng Huang, Bing Xiang, Taha Kass-Hout

Abstract: Coronavirus disease (COVID-19) has been declared a pandemic by the WHO, with thousands of cases being reported each day. Numerous scientific articles are being published on the disease, raising the need for a service which can organize and query them in a reliable fashion. To support this cause, we present AWS CORD-19 Search (ACS), a public, COVID-19 specific, neural search engine that is powered by several machine learning systems to support natural language based searches. ACS, with capabilities such as document ranking, passage ranking, question answering and topic classification, provides a scalable solution to COVID-19 researchers and policy makers in their search and discovery for answers to high priority scientific questions. We present a quantitative evaluation and qualitative analysis of the system against other leading COVID-19 search platforms. ACS is top performing across these systems, yielding quality results which we detail with relevant examples in this work.

Submitted 7 October, 2020; v1 submitted 17 July, 2020; originally announced July 2020.

arXiv:2007.05577 [pdf, other]

Vizarel: A System to Help Better Understand RL Agents

Authors: Shuby Deshpande, Jeff Schneider

Abstract: Visualization tools for supervised learning have allowed users to interpret, introspect, and gain intuition for the successes and failures of their models. While reinforcement learning practitioners ask many of the same questions, existing tools are not applicable to the RL setting. In this work, we describe our initial attempt at constructing a prototype of these ideas, through identifying possible features that such a system should encapsulate. Our design is motivated by envisioning the system to be a platform on which to experiment with interpretable reinforcement learning.

Submitted 10 July, 2020; originally announced July 2020.

Comments: Accepted to ICML 2020 Workshop on Human Interpretability in Machine Learning (Spotlight)

arXiv:2007.02149 [pdf]

Human Assisted Artificial Intelligence Based Technique to Create Natural Features for OpenStreetMap

Authors: Piyush Yadav, Dipto Sarkar, Shailesh Deshpande, Edward Curry

Abstract: In this work, we propose an AI-based technique using freely available satellite images like Landsat and Sentinel to create natural features over OSM in congruence with human editors acting as initiators and validators. The method is based on the Interactive Machine Learning technique, where human inputs are coupled with the machine to solve complex problems more efficiently than a purely autonomous process. We use a bottom-up approach where a machine learning (ML) pipeline in loop with editors is used to extract classes using spectral signatures of images and later convert them to editable features to create natural features.

Submitted 8 July, 2020; v1 submitted 4 July, 2020; originally announced July 2020.

Comments: 3 pages, 2 Figures, Submitted to FOSS4G Europe 2020 Academic Track (Postponed to 2021)

arXiv:2007.00480 [pdf]

doi 10.1007/978-3-030-13453-2_6

Computational Model for Urban Growth Using Socioeconomic Latent Parameters

Authors: Piyush Yadav, Shamsuddin Ladha, Shailesh Deshpande, Edward Curry

Abstract: Land use land cover changes (LULCC) are generally modeled using multi-scale spatio-temporal variables. Recently, Markov Chain (MC) has been used to model LULCC. However, the model is derived from the proportion of LULCC observed over a given period and it does not account for temporal factors such as macro-economic, socio-economic, etc. In this paper, we present a richer model based on Hidden Markov Model (HMM), grounded in the common knowledge that economic, social and LULCC processes are tightly coupled. We propose an HMM where LULCC classes represent hidden states and temporal factors represent emissions that are conditioned on the hidden states. To our knowledge, HMM has not been used in LULCC models in the past. We further demonstrate its integration with other spatio-temporal models such as Logistic Regression. The integrated model is applied on the LULCC data of Pune district in the state of Maharashtra (India) to predict and visualize urban LULCC over the past 14 years. We observe that the HMM integrated model has improved prediction accuracy as compared to the corresponding MC integrated model.

Submitted 1 July, 2020; originally announced July 2020.

Comments: 12 pages

Journal ref: ECML PKDD 2018 Lecture Notes in Computer Science vol 11329 Springer Cham

arXiv:2006.13473 [pdf, other]

doi 10.1145/3394486.3403323

AutoKnow: Self-Driving Knowledge Collection for Products of Thousands of Types

Authors: Xin Luna Dong, Xiang He, Andrey Kan, Xian Li, Yan Liang, Jun Ma, Yifan Ethan Xu, Chenwei Zhang, Tong Zhao, Gabriel Blanco Saldana, Saurabh Deshpande, Alexandre Michetti Manduca, Jay Ren, Surender Pal Singh, Fan Xiao, Haw-Shiuan Chang, Giannis Karamanolakis, Yuning Mao, Yaqing Wang, Christos Faloutsos, Andrew McCallum, Jiawei Han

Abstract: Can one build a knowledge graph (KG) for all products in the world? Knowledge graphs have firmly established themselves as valuable sources of information for search and question answering, and it is natural to wonder if a KG can contain information about products offered at online retail sites. There have been several successful examples of generic KGs, but organizing information about products poses many additional challenges, including sparsity and noise of structured data for products, complexity of the domain with millions of product types and thousands of attributes, heterogeneity across a large number of categories, as well as a large and constantly growing number of products. We describe AutoKnow, our automatic (self-driving) system that addresses these challenges. The system includes a suite of novel techniques for taxonomy construction, product property identification, knowledge extraction, anomaly detection, and synonym discovery. AutoKnow is (a) automatic, requiring little human intervention, (b) multi-scalable, scalable in multiple dimensions (many domains, many products, and many attributes), and (c) integrative, exploiting rich customer behavior logs. AutoKnow has been operational in collecting product knowledge for over 11K product types.

Submitted 24 June, 2020; originally announced June 2020.

Comments: KDD 2020

arXiv:2006.12669 [pdf, other]

Approximate Cross-Validation for Structured Models

Authors: Soumya Ghosh, William T. Stephenson, Tin D. Nguyen, Sameer K. Deshpande, Tamara Broderick

Abstract: Many modern data analyses benefit from explicitly modeling dependence structure in data -- such as measurements across time or space, ordered words in a sentence, or genes in a genome. A gold standard evaluation technique is structured cross-validation (CV), which leaves out some data subset (such as data within a time interval or data in a geographic region) in each fold. But CV here can be prohibitively slow due to the need to re-run already-expensive learning algorithms many times. Previous work has shown approximate cross-validation (ACV) methods provide a fast and provably accurate alternative in the setting of empirical risk minimization. But this existing ACV work is restricted to simpler models by the assumptions that (i) data across CV folds are independent and (ii) an exact initial model fit is available. In structured data analyses, both these assumptions are often untrue. In the present work, we address (i) by extending ACV to CV schemes with dependence structure between the folds. To address (ii), we verify -- both theoretically and empirically -- that ACV quality deteriorates smoothly with noise in the initial fit. We demonstrate the accuracy and computational benefits of our proposed methods on a diverse set of real-world applications.

Submitted 1 December, 2020; v1 submitted 22 June, 2020; originally announced June 2020.

Comments: 25 pages, 8 figures. NeurIPS 2020 camera ready. v2 fixes typos and provides additional empirical results. Code: https://github.com/SoumyaTGhosh/structured-infinitesimal-jackknife

arXiv:1906.03479 [pdf, other]

Learning Radiative Transfer Models for Climate Change Applications in Imaging Spectroscopy

Authors: Shubhankar Deshpande, Brian D. Bue, David R. Thompson, Vijay Natraj, Mario Parente

Abstract: According to a recent investigation, an estimated 33-50% of the world's coral reefs have undergone degradation, believed to be a result of climate change. Greenhouse gases such as methane are a strong driver of climate change and the subsequent environmental impact. However, the exact relation climate change has to the environmental condition cannot be easily established. Remote sensing methods are increasingly being used to quantify and draw connections between rapidly changing climatic conditions and environmental impact. A crucial part of this analysis is processing spectroscopy data using radiative transfer models (RTMs), which is a computationally expensive process and limits their use with high volume imaging spectrometers. This work presents an algorithm that can efficiently emulate RTMs using neural networks, leading to a multifold speedup in processing time and yielding multiple downstream benefits.

Submitted 8 June, 2019; originally announced June 2019.

Comments: Accepted to International Conference on Machine Learning (ICML) 2019 Workshop: Climate Change: How Can AI Help?

arXiv:1902.06371 [pdf, other]

Achieving Throughput via Fine-Grained Path Planning in Small World DTNs

Authors: Dhrubojyoti Roy, Mukundan Sridharan, Satyajeet Deshpande, Anish Arora

Abstract: We explore the benefits of using fine-grained statistics in small world DTNs to achieve high throughput without the aid of external infrastructure. We first design an empirical node-pair inter-contacts model that predicts meetings within a time frame of suitable length, typically of the order of days, with a probability above some threshold, and can be readily computed with low overhead. This temporal knowledge enables effective time-dependent path planning that can respond to even per-packet deadline variabilities. We describe one such routing framework, REAPER (for Reliable, Efficient and Predictive Routing), that is fully distributed and self-stabilizing. Its key objective is to provide probabilistic bounds on path length (cost) and delay in a temporally fine-grained way, while exploiting the small world structure to entail only polylogarithmic storage and control overhead. A simulation-based evaluation confirms that REAPER achieves high throughput and energy efficiency across the spectrum of ultra-light to heavy network traffic, and substantially outperforms state-of-the-art single copy protocols as well as sociability-based protocols that rely on essentially coarse-grained metrics.

Submitted 17 February, 2019; originally announced February 2019.

Comments: arXiv admin note: text overlap with arXiv:1310.1162

arXiv:1902.05064 [pdf, other]

doi 10.1016/j.compbiomed.2018.12.014

PLIT: An alignment-free computational tool for identification of long non-coding RNAs in plant transcriptomic datasets

Authors: S. Deshpande, J. Shuttleworth, J. Yang, S. Taramonli, M. England

Abstract: Long non-coding RNAs (lncRNAs) are a class of non-coding RNAs which play a significant role in several biological processes. RNA-seq based transcriptome sequencing has been extensively used for identification of lncRNAs. However, accurate identification of lncRNAs in RNA-seq datasets is crucial for exploring their characteristic functions in the genome as most coding potential computation (CPC) tools fail to accurately identify them in transcriptomic data. Well-known CPC tools such as CPC2, lncScore, CPAT are primarily designed for prediction of lncRNAs based on the GENCODE, NONCODE and CANTATAdb databases. The prediction accuracy of these tools often drops when tested on transcriptomic datasets. This leads to higher false positive results and inaccuracy in the function annotation process. In this study, we present a novel tool, PLIT, for the identification of lncRNAs in plant RNA-seq datasets. PLIT implements a feature selection method based on L1 regularization and iterative Random Forests (iRF) classification for selection of optimal features. Based on sequence and codon-bias features, it classifies the RNA-seq derived FASTA sequences into coding or long non-coding transcripts. Using L1 regularization, 31 optimal features were obtained based on lncRNA and protein-coding transcripts from 8 plant species. The performance of the tool was evaluated on 7 plant RNA-seq datasets using 10-fold cross-validation. The analysis exhibited superior accuracy when evaluated against currently available state-of-the-art CPC tools.

Submitted 12 February, 2019; originally announced February 2019.

Comments: 36 pages. Author's accepted version (Green OA)

Journal ref: Computers in Biology and Medicine, 105, pp. 169-181, Elsevier, 2019

arXiv:1807.08820 [pdf, other]

RAIM: Recurrent Attentive and Intensive Model of Multimodal Patient Monitoring Data

Authors: Yanbo Xu, Siddharth Biswal, Shriprasad R Deshpande, Kevin O Maher, Jimeng Sun

Abstract: With the improvement of medical data capturing, vast amounts of continuous patient monitoring data, e.g., electrocardiogram (ECG), real-time vital signs and medications, become available for clinical decision support at intensive care units (ICUs). However, it becomes increasingly challenging to model such data, due to high density of the monitoring data, heterogeneous data types and the requirement for interpretable models. Integration of these high-density monitoring data with the discrete clinical events (including diagnosis, medications, labs) is challenging but potentially rewarding since richness and granularity in such multimodal data increase the possibilities for accurate detection of complex problems and predicting outcomes (e.g., length of stay and mortality). We propose Recurrent Attentive and Intensive Model (RAIM) for jointly analyzing continuous monitoring data and discrete clinical events. RAIM introduces an efficient attention mechanism for continuous monitoring data (e.g., ECG), which is guided by discrete clinical events (e.g., medication usage). We apply RAIM in predicting physiological decompensation and length of stay in critically ill patients at the ICU. With evaluations on the MIMIC-III Waveform Database Matched Subset, we obtain an AUC-ROC score of 90.18% for predicting decompensation and an accuracy of 86.82% for forecasting length of stay with our final model, which outperforms our six baseline models.

Submitted 23 July, 2018; originally announced July 2018.

arXiv:1310.1162

A Little Prediction Goes a Long Way: Routing in Semi-Deterministic Delay Tolerant Networks

Authors: Dhrubojyoti Roy, Mukundan Sridharan, Satyajeet Deshpande, Anish Arora

Abstract: Realizing delay-capacity in intermittently connected mobile networks remains a largely open question, with state-of-the-art routing schemes typically focusing either on delay or on capacity. We show the feasibility of routing with both high goodput and desired delay constraints, with REAPER (for Reliable, Efficient, and Predictive Routing), a fully distributed convergecast routing framework that jointly optimizes both path length and path delay. A key idea for efficient instantiation of REAPER is to exploit predictability of mobility patterns, in terms of a semi-deterministic model which appropriately captures several vehicular and human inter-contact patterns. Packets are thus routed using paths that are jointly optimal at their time of arrival, in contrast to extant DTN protocols which use time-average metrics for routing. REAPER is also self-stabilizing to changes in the mobility pattern. A simulation-based evaluation confirms that, across the spectrum of ultra-light to heavy traffic, REAPER achieves up to 135% and 200% higher throughput and up to 250% and 1666% higher energy efficiency than state-of-the-art single-copy protocols MEED-DVR and PROPHET, which optimize a single metric only, specifically, expected delay and path probability respectively.

Submitted 22 January, 2014; v1 submitted 4 October, 2013; originally announced October 2013.

Comments: This paper has been withdrawn by the authors. Withdrawn since document intended to be anonymous

Showing 1–44 of 44 results for author: Deshpande, S