Search | arXiv e-print repository

arXiv:2407.20685 [pdf, other]

CultureVo: The Serious Game of Utilizing Gen AI for Enhancing Cultural Intelligence

Authors: Ajita Agarwala, Anupam Purwar, Viswanadhasai Rao

Abstract: CultureVo, Inc. has developed the Integrated Culture Learning Suite (ICLS) to deliver foundational knowledge of world cultures through a combination of interactive lessons and gamified experiences. This paper explores how Generative AI powered by open source Large Langauge Models are utilized within the ICLS to enhance cultural intelligence. The suite employs Generative AI techniques to automate t… ▽ More CultureVo, Inc. has developed the Integrated Culture Learning Suite (ICLS) to deliver foundational knowledge of world cultures through a combination of interactive lessons and gamified experiences. This paper explores how Generative AI powered by open source Large Langauge Models are utilized within the ICLS to enhance cultural intelligence. The suite employs Generative AI techniques to automate the assessment of learner knowledge, analyze behavioral patterns, and manage interactions with non-player characters using real time learner assessment. Additionally, ICLS provides contextual hint and recommend course content by assessing learner proficiency, while Generative AI facilitates the automated creation and validation of educational content. △ Less

Submitted 1 August, 2024; v1 submitted 30 July, 2024; originally announced July 2024.

Comments: Fourth International Conference on AI-ML Systems, 8-11 October, 2024, Louisiana, USA

arXiv:2407.18919 [pdf]

Accelerating Drug Safety Assessment using Bidirectional-LSTM for SMILES Data

Authors: K. Venkateswara Rao, Kunjam Nageswara Rao, G. Sita Ratnam

Abstract: Computational methods are useful in accelerating the pace of drug discovery. Drug discovery carries several steps such as target identification and validation, lead discovery, and lead optimisation etc., In the phase of lead optimisation, the absorption, distribution, metabolism, excretion, and toxicity properties of lead compounds are assessed. To address the issue of predicting toxicity and solu… ▽ More Computational methods are useful in accelerating the pace of drug discovery. Drug discovery carries several steps such as target identification and validation, lead discovery, and lead optimisation etc., In the phase of lead optimisation, the absorption, distribution, metabolism, excretion, and toxicity properties of lead compounds are assessed. To address the issue of predicting toxicity and solubility in the lead compounds, represented in Simplified Molecular Input Line Entry System (SMILES) notation. Among the different approaches that work on SMILES data, the proposed model was built using a sequence-based approach. The proposed Bi-Directional Long Short Term Memory (BiLSTM) is a variant of Recurrent Neural Network (RNN) that processes input molecular sequences for the comprehensive examination of the structural features of molecules from both forward and backward directions. The proposed work aims to understand the sequential patterns encoded in the SMILES strings, which are then utilised for predicting the toxicity of the molecules. The proposed model on the ClinTox dataset surpasses previous approaches such as Trimnet and Pre-training Graph neural networks(GNN) by achieving a ROC accuracy of 0.96. BiLSTM outperforms the previous model on FreeSolv dataset with a low RMSE value of 1.22 in solubility prediction. △ Less

Submitted 8 July, 2024; originally announced July 2024.

Comments: 10 pages

arXiv:2407.11149 [pdf]

BMR and BWR: Two simple metaphor-free optimization algorithms for solving real-life non-convex constrained and unconstrained problems

Authors: Ravipudi Venkata Rao, Ravikumar shah

Abstract: This paper presents two simple yet powerful optimization algorithms named Best-Mean-Random (BMR) and Best-Worst-Randam (BWR) algorithms to handle both constrained and unconstrained optimization problems. These algorithms are free of metaphors and algorithm-specific parameters. The BMR algorithm is based on the best, mean, and random solutions of the population generated for solving a given problem… ▽ More This paper presents two simple yet powerful optimization algorithms named Best-Mean-Random (BMR) and Best-Worst-Randam (BWR) algorithms to handle both constrained and unconstrained optimization problems. These algorithms are free of metaphors and algorithm-specific parameters. The BMR algorithm is based on the best, mean, and random solutions of the population generated for solving a given problem; and the BWR algorithm is based on the best, worst, and random solutions. The performances of the proposed two algorithms are investigated by implementing them on 26 real-life non-convex constrained optimization problems given in the Congress on Evolutionary Computation (CEC) 2020 competition and comparisons are made with those of the other prominent optimization algorithms. Furthermore, computational experiments are conducted on 30 unconstrained standard benchmark optimization problems including 5 recently developed benchmark problems having distinct characteristics. The results proved the better competitiveness and superiority of the proposed simple algorithms. The optimization research community may gain an advantage by adapting these algorithms to solve various constrained and unconstrained real-life optimization problems across various scientific and engineering disciplines. △ Less

Submitted 15 July, 2024; originally announced July 2024.

Comments: 28 pages, 5 figures, original paper

ACM Class: C.1.3; I.2.6; I.5

arXiv:2406.19150 [pdf, other]

RAVEN: Multitask Retrieval Augmented Vision-Language Learning

Authors: Varun Nagaraj Rao, Siddharth Choudhary, Aditya Deshpande, Ravi Kumar Satzoda, Srikar Appalaraju

Abstract: The scaling of large language models to encode all the world's knowledge in model parameters is unsustainable and has exacerbated resource barriers. Retrieval-Augmented Generation (RAG) presents a potential solution, yet its application to vision-language models (VLMs) is under explored. Existing methods focus on models designed for single tasks. Furthermore, they're limited by the need for resour… ▽ More The scaling of large language models to encode all the world's knowledge in model parameters is unsustainable and has exacerbated resource barriers. Retrieval-Augmented Generation (RAG) presents a potential solution, yet its application to vision-language models (VLMs) is under explored. Existing methods focus on models designed for single tasks. Furthermore, they're limited by the need for resource intensive pre training, additional parameter requirements, unaddressed modality prioritization and lack of clear benefit over non-retrieval baselines. This paper introduces RAVEN, a multitask retrieval augmented VLM framework that enhances base VLMs through efficient, task specific fine-tuning. By integrating retrieval augmented samples without the need for additional retrieval-specific parameters, we show that the model acquires retrieval properties that are effective across multiple tasks. Our results and extensive ablations across retrieved modalities for the image captioning and VQA tasks indicate significant performance improvements compared to non retrieved baselines +1 CIDEr on MSCOCO, +4 CIDEr on NoCaps and nearly a +3\% accuracy on specific VQA question types. This underscores the efficacy of applying RAG approaches to VLMs, marking a stride toward more efficient and accessible multimodal learning. △ Less

Submitted 27 June, 2024; originally announced June 2024.

arXiv:2406.10768 [pdf, other]

Rideshare Transparency: Translating Gig Worker Insights on AI Platform Design to Policy

Authors: Varun Nagaraj Rao, Samantha Dalal, Eesha Agarwal, Dana Calacci, Andrés Monroy-Hernández

Abstract: Rideshare platforms exert significant control over workers through algorithmic systems that can result in financial, emotional, and physical harm. What steps can platforms, designers, and practitioners take to mitigate these negative impacts and meet worker needs? In this paper, through a novel mixed methods study combining a LLM-based analysis of over 1 million comments posted to online platform… ▽ More Rideshare platforms exert significant control over workers through algorithmic systems that can result in financial, emotional, and physical harm. What steps can platforms, designers, and practitioners take to mitigate these negative impacts and meet worker needs? In this paper, through a novel mixed methods study combining a LLM-based analysis of over 1 million comments posted to online platform worker communities with semi-structured interviews of workers, we thickly characterize transparency-related harms, mitigation strategies, and worker needs while validating and contextualizing our findings within the broader worker community. Our findings expose a transparency gap between existing platform designs and the information drivers need, particularly concerning promotions, fares, routes, and task allocation. Our analysis suggests that rideshare workers need key pieces of information, which we refer to as indicators, to make informed work decisions. These indicators include details about rides, driver statistics, algorithmic implementation details, and platform policy information. We argue that instead of relying on platforms to include such information in their designs, new regulations that require platforms to publish public transparency reports may be a more effective solution to improve worker well-being. We offer recommendations for implementing such a policy. △ Less

Submitted 19 June, 2024; v1 submitted 15 June, 2024; originally announced June 2024.

arXiv:2405.18322 [pdf, other]

SCE-MAE: Selective Correspondence Enhancement with Masked Autoencoder for Self-Supervised Landmark Estimation

Authors: Kejia Yin, Varshanth R. Rao, Ruowei Jiang, Xudong Liu, Parham Aarabi, David B. Lindell

Abstract: Self-supervised landmark estimation is a challenging task that demands the formation of locally distinct feature representations to identify sparse facial landmarks in the absence of annotated data. To tackle this task, existing state-of-the-art (SOTA) methods (1) extract coarse features from backbones that are trained with instance-level self-supervised learning (SSL) paradigms, which neglect the… ▽ More Self-supervised landmark estimation is a challenging task that demands the formation of locally distinct feature representations to identify sparse facial landmarks in the absence of annotated data. To tackle this task, existing state-of-the-art (SOTA) methods (1) extract coarse features from backbones that are trained with instance-level self-supervised learning (SSL) paradigms, which neglect the dense prediction nature of the task, (2) aggregate them into memory-intensive hypercolumn formations, and (3) supervise lightweight projector networks to naively establish full local correspondences among all pairs of spatial features. In this paper, we introduce SCE-MAE, a framework that (1) leverages the MAE, a region-level SSL method that naturally better suits the landmark prediction task, (2) operates on the vanilla feature map instead of on expensive hypercolumns, and (3) employs a Correspondence Approximation and Refinement Block (CARB) that utilizes a simple density peak clustering algorithm and our proposed Locality-Constrained Repellence Loss to directly hone only select local correspondences. We demonstrate through extensive experiments that SCE-MAE is highly effective and robust, outperforming existing SOTA methods by large margins of approximately 20%-44% on the landmark matching and approximately 9%-15% on the landmark detection tasks. △ Less

Submitted 28 May, 2024; originally announced May 2024.

Comments: Accepted at CVPR 2024

arXiv:2405.05345 [pdf, other]

QuaLLM: An LLM-based Framework to Extract Quantitative Insights from Online Forums

Authors: Varun Nagaraj Rao, Eesha Agarwal, Samantha Dalal, Dan Calacci, Andrés Monroy-Hernández

Abstract: Online discussion forums provide crucial data to understand the concerns of a wide range of real-world communities. However, the typical qualitative and quantitative methods used to analyze those data, such as thematic analysis and topic modeling, are infeasible to scale or require significant human effort to translate outputs to human readable forms. This study introduces QuaLLM, a novel LLM-base… ▽ More Online discussion forums provide crucial data to understand the concerns of a wide range of real-world communities. However, the typical qualitative and quantitative methods used to analyze those data, such as thematic analysis and topic modeling, are infeasible to scale or require significant human effort to translate outputs to human readable forms. This study introduces QuaLLM, a novel LLM-based framework to analyze and extract quantitative insights from text data on online forums. The framework consists of a novel prompting methodology and evaluation strategy. We applied this framework to analyze over one million comments from two Reddit's rideshare worker communities, marking the largest study of its type. We uncover significant worker concerns regarding AI and algorithmic platform decisions, responding to regulatory calls about worker insights. In short, our work sets a new precedent for AI-assisted quantitative data analysis to surface concerns from online forums. △ Less

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted to CHI LLM as Research Tools Workshop (2024)

arXiv:2404.10274 [pdf]

Sparse Attention Regression Network Based Soil Fertility Prediction With Ummaso

Authors: R V Raghavendra Rao, U Srinivasulu Reddy

Abstract: The challenge of imbalanced soil nutrient datasets significantly hampers accurate predictions of soil fertility. To tackle this, a new method is suggested in this research, combining Uniform Manifold Approximation and Projection (UMAP) with Least Absolute Shrinkage and Selection Operator (LASSO). The main aim is to counter the impact of uneven data distribution and improve soil fertility models' p… ▽ More The challenge of imbalanced soil nutrient datasets significantly hampers accurate predictions of soil fertility. To tackle this, a new method is suggested in this research, combining Uniform Manifold Approximation and Projection (UMAP) with Least Absolute Shrinkage and Selection Operator (LASSO). The main aim is to counter the impact of uneven data distribution and improve soil fertility models' predictive precision. The model introduced uses Sparse Attention Regression, effectively incorporating pertinent features from the imbalanced dataset. UMAP is utilized initially to reduce data complexity, unveiling hidden structures and important patterns. Following this, LASSO is applied to refine features and enhance the model's interpretability. The experimental outcomes highlight the effectiveness of the UMAP and LASSO hybrid approach. The proposed model achieves outstanding performance metrics, reaching a predictive accuracy of 98%, demonstrating its capability in accurate soil fertility predictions. Additionally, it showcases a Precision of 91.25%, indicating its adeptness in identifying fertile soil instances accurately. The Recall metric stands at 90.90%, emphasizing the model's ability to capture true positive cases effectively. △ Less

Submitted 16 April, 2024; originally announced April 2024.

arXiv:2403.05530 [pdf, other]

Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context

Authors: Gemini Team, Petko Georgiev, Ving Ian Lei, Ryan Burnell, Libin Bai, Anmol Gulati, Garrett Tanzer, Damien Vincent, Zhufeng Pan, Shibo Wang, Soroosh Mariooryad, Yifan Ding, Xinyang Geng, Fred Alcober, Roy Frostig, Mark Omernick, Lexi Walker, Cosmin Paduraru, Christina Sorokin, Andrea Tacchetti, Colin Gaffney, Samira Daruki, Olcan Sercinoglu, Zach Gleicher, Juliette Love , et al. (1110 additional authors not shown)

Abstract: In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February… ▽ More In this report, we introduce the Gemini 1.5 family of models, representing the next generation of highly compute-efficient multimodal models capable of recalling and reasoning over fine-grained information from millions of tokens of context, including multiple long documents and hours of video and audio. The family includes two new models: (1) an updated Gemini 1.5 Pro, which exceeds the February version on the great majority of capabilities and benchmarks; (2) Gemini 1.5 Flash, a more lightweight variant designed for efficiency with minimal regression in quality. Gemini 1.5 models achieve near-perfect recall on long-context retrieval tasks across modalities, improve the state-of-the-art in long-document QA, long-video QA and long-context ASR, and match or surpass Gemini 1.0 Ultra's state-of-the-art performance across a broad set of benchmarks. Studying the limits of Gemini 1.5's long-context ability, we find continued improvement in next-token prediction and near-perfect retrieval (>99%) up to at least 10M tokens, a generational leap over existing models such as Claude 3.0 (200k) and GPT-4 Turbo (128k). Finally, we highlight real-world use cases, such as Gemini 1.5 collaborating with professionals on completing their tasks achieving 26 to 75% time savings across 10 different job categories, as well as surprising new capabilities of large language models at the frontier; when given a grammar manual for Kalamang, a language with fewer than 200 speakers worldwide, the model learns to translate English to Kalamang at a similar level to a person who learned from the same content. △ Less

Submitted 8 August, 2024; v1 submitted 8 March, 2024; originally announced March 2024.

arXiv:2401.05683 [pdf, other]

Deep Learning Meets Mechanism Design: Key Results and Some Novel Applications

Authors: V. Udaya Sankar, Vishisht Srihari Rao, Y. Narahari

Abstract: Mechanism design is essentially reverse engineering of games and involves inducing a game among strategic agents in a way that the induced game satisfies a set of desired properties in an equilibrium of the game. Desirable properties for a mechanism include incentive compatibility, individual rationality, welfare maximisation, revenue maximisation (or cost minimisation), fairness of allocation, et… ▽ More Mechanism design is essentially reverse engineering of games and involves inducing a game among strategic agents in a way that the induced game satisfies a set of desired properties in an equilibrium of the game. Desirable properties for a mechanism include incentive compatibility, individual rationality, welfare maximisation, revenue maximisation (or cost minimisation), fairness of allocation, etc. It is known from mechanism design theory that only certain strict subsets of these properties can be simultaneously satisfied exactly by any given mechanism. Often, the mechanisms required by real-world applications may need a subset of these properties that are theoretically impossible to be simultaneously satisfied. In such cases, a prominent recent approach is to use a deep learning based approach to learn a mechanism that approximately satisfies the required properties by minimizing a suitably defined loss function. In this paper, we present, from relevant literature, technical details of using a deep learning approach for mechanism design and provide an overview of key results in this topic. We demonstrate the power of this approach for three illustrative case studies: (a) efficient energy management in a vehicular network (b) resource allocation in a mobile network (c) designing a volume discount procurement auction for agricultural inputs. Section 6 concludes the paper. △ Less

Submitted 11 January, 2024; originally announced January 2024.

arXiv:2312.11805 [pdf, other]

Gemini: A Family of Highly Capable Multimodal Models

Authors: Gemini Team, Rohan Anil, Sebastian Borgeaud, Jean-Baptiste Alayrac, Jiahui Yu, Radu Soricut, Johan Schalkwyk, Andrew M. Dai, Anja Hauth, Katie Millican, David Silver, Melvin Johnson, Ioannis Antonoglou, Julian Schrittwieser, Amelia Glaese, Jilin Chen, Emily Pitler, Timothy Lillicrap, Angeliki Lazaridou, Orhan Firat, James Molloy, Michael Isard, Paul R. Barham, Tom Hennigan, Benjamin Lee , et al. (1325 additional authors not shown)

Abstract: This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultr… ▽ More This report introduces a new family of multimodal models, Gemini, that exhibit remarkable capabilities across image, audio, video, and text understanding. The Gemini family consists of Ultra, Pro, and Nano sizes, suitable for applications ranging from complex reasoning tasks to on-device memory-constrained use-cases. Evaluation on a broad range of benchmarks shows that our most-capable Gemini Ultra model advances the state of the art in 30 of 32 of these benchmarks - notably being the first model to achieve human-expert performance on the well-studied exam benchmark MMLU, and improving the state of the art in every one of the 20 multimodal benchmarks we examined. We believe that the new capabilities of the Gemini family in cross-modal reasoning and language understanding will enable a wide variety of use cases. We discuss our approach toward post-training and deploying Gemini models responsibly to users through services including Gemini, Gemini Advanced, Google AI Studio, and Cloud Vertex AI. △ Less

Submitted 17 June, 2024; v1 submitted 18 December, 2023; originally announced December 2023.

arXiv:2312.08267 [pdf, other]

TABSurfer: a Hybrid Deep Learning Architecture for Subcortical Segmentation

Authors: Aaron Cao, Vishwanatha M. Rao, Kejia Liu, Xinru Liu, Andrew F. Laine, Jia Guo

Abstract: Subcortical segmentation remains challenging despite its important applications in quantitative structural analysis of brain MRI scans. The most accurate method, manual segmentation, is highly labor intensive, so automated tools like FreeSurfer have been adopted to handle this task. However, these traditional pipelines are slow and inefficient for processing large datasets. In this study, we propo… ▽ More Subcortical segmentation remains challenging despite its important applications in quantitative structural analysis of brain MRI scans. The most accurate method, manual segmentation, is highly labor intensive, so automated tools like FreeSurfer have been adopted to handle this task. However, these traditional pipelines are slow and inefficient for processing large datasets. In this study, we propose TABSurfer, a novel 3D patch-based CNN-Transformer hybrid deep learning model designed for superior subcortical segmentation compared to existing state-of-the-art tools. To evaluate, we first demonstrate TABSurfer's consistent performance across various T1w MRI datasets with significantly shorter processing times compared to FreeSurfer. Then, we validate against manual segmentations, where TABSurfer outperforms FreeSurfer based on the manual ground truth. In each test, we also establish TABSurfer's advantage over a leading deep learning benchmark, FastSurferVINN. Together, these studies highlight TABSurfer's utility as a powerful tool for fully automated subcortical segmentation with high fidelity. △ Less

Submitted 13 December, 2023; originally announced December 2023.

Comments: 5 pages, 3 figures, 2 tables

arXiv:2311.17705 [pdf, other]

Q-PAC: Automated Detection of Quantum Bug-Fix Patterns

Authors: Pranav K. Nayak, Krishn V. Kher, M. Bharat Chandra, M. V. Panduranga Rao, Lei Zhang

Abstract: Context: Bug-fix pattern detection has been investigated in the past in the context of classical software. However, while quantum software is developing rapidly, the literature still lacks automated methods and tools to identify, analyze, and detect bug-fix patterns. To the best of our knowledge, our work previously published in SEKE'23 was the first to leverage classical techniques to detect bug-… ▽ More Context: Bug-fix pattern detection has been investigated in the past in the context of classical software. However, while quantum software is developing rapidly, the literature still lacks automated methods and tools to identify, analyze, and detect bug-fix patterns. To the best of our knowledge, our work previously published in SEKE'23 was the first to leverage classical techniques to detect bug-fix patterns in quantum code. Objective: To extend our previous effort, we present a research agenda (Q-Repair), including a series of testing and debugging methodologies, to improve the quality of quantum software. The ultimate goal is to utilize machine learning techniques to automatically predict fix patterns for existing quantum bugs. Method: As part of the first stage of the agenda, we extend our initial study and propose a more comprehensive automated framework, called Q-PAC, for detecting bug-fix patterns in IBM Qiskit quantum code. In the framework, we develop seven bug-fix pattern detectors using abstract syntax trees, syntactic filters, and semantic checks. Results: To demonstrate our method, we run Q-PAC on a variety of quantum bug-fix patterns using both real-world and handcrafted examples of bugs and fixes. The experimental results show that Q-PAC can effectively identify bug-fix patterns in IBM Qiskit. Conclusion: We hope our initial study on quantum bug-fix detection can bring awareness of quantum software engineering to both researchers and practitioners. Thus, we also publish Q-PAC as an open-source software on GitHub. We would like to encourage other researchers to work on research directions (such as Q-Repair) to improve the quality of the quantum programming. △ Less

Submitted 29 November, 2023; originally announced November 2023.

Comments: 16 pages, 2 figures

arXiv:2311.11645 [pdf, other]

doi 10.1145/3632775.3661959

Exploding AI Power Use: an Opportunity to Rethink Grid Planning and Management

Authors: Liuzixuan Lin, Rajini Wijayawardana, Varsha Rao, Hai Nguyen, Wedan Emmanuel Gnibga, Andrew A. Chien

Abstract: The unprecedented rapid growth of computing demand for AI is projected to increase global annual datacenter (DC) growth from 7.2% to 11.3%. We project the 5-year AI DC demand for several power grids and assess whether they will allow desired AI growth (resource adequacy). If not, several "desperate measures" -- grid policies that enable more load growth and maintain grid reliability by sacrificing… ▽ More The unprecedented rapid growth of computing demand for AI is projected to increase global annual datacenter (DC) growth from 7.2% to 11.3%. We project the 5-year AI DC demand for several power grids and assess whether they will allow desired AI growth (resource adequacy). If not, several "desperate measures" -- grid policies that enable more load growth and maintain grid reliability by sacrificing new DC reliability are considered. We find that two DC hotspots -- EirGrid (Ireland) and Dominion (US) -- will have difficulty accommodating new DCs needed by the AI growth. In EirGrid, relaxing new DC reliability guarantees increases the power available to 1.6x--4.1x while maintaining 99.6% actual power availability for the new DCs, sufficient for the 5-year AI demand. In Dominion, relaxing reliability guarantees increases available DC capacity similarly (1.5x--4.6x) but not enough for the 5-year AI demand. New DCs only receive 89% power availability. Study of other US power grids -- SPP, CAISO, ERCOT -- shows that sufficient capacity exists for the projected AI load growth. Our results suggest the need to rethink adequacy assessment and also grid planning and management. New research opportunities include coordinated planning, reliability models that incorporate load flexibility, and adaptive load abstractions. △ Less

Submitted 30 April, 2024; v1 submitted 20 November, 2023; originally announced November 2023.

Comments: Accepted by ACM e-Energy '24: the 15th ACM International Conference on Future and Sustainable Energy Systems

arXiv:2310.05972 [pdf, other]

Normality of I-V Measurements Using ML

Authors: Anees Al-Najjar, Nageswara S. V. Rao, Craig A. Bridges, Sheng Dai

Abstract: Electrochemistry ecosystems are promising for accelerating the design and discovery of electrochemical systems for energy storage and conversion, by automating significant parts of workflows that combine synthesis and characterization experiments with computations. They require the integration of flow controllers, solvent containers, pumps, fraction collectors, and potentiostats, all connected to… ▽ More Electrochemistry ecosystems are promising for accelerating the design and discovery of electrochemical systems for energy storage and conversion, by automating significant parts of workflows that combine synthesis and characterization experiments with computations. They require the integration of flow controllers, solvent containers, pumps, fraction collectors, and potentiostats, all connected to an electrochemical cell. These are specialized instruments with custom software that is not originally designed for network integration. We developed network and software solutions for electrochemical workflows that adapt system and instrument settings in real-time for multiple rounds of experiments. We demonstrate this automated workflow by remotely operating the instruments and collecting their measurements to generate a voltammogram (I-V profile) of an electrolyte solution in an electrochemical cell. These measurements are made available at the remote computing system and used for subsequent analysis. In this paper, we focus on a novel, analytically validated machine learning (ML) method for an electrochemistry ecosystem to ensure that I-V measurements are consistent with the normal experimental conditions, and to detect abnormal conditions, such as disconnected electrodes or low cell content volume. △ Less

Submitted 28 September, 2023; originally announced October 2023.

Comments: published at eScience 2023

Journal ref: in 2023 IEEE 19th International Conference on e-Science (e-Science), Limassol, Cyprus, 2023 pp. 1-2

arXiv:2309.17147 [pdf, other]

Using Large Language Models for Qualitative Analysis can Introduce Serious Bias

Authors: Julian Ashwin, Aditya Chhabra, Vijayendra Rao

Abstract: Large Language Models (LLMs) are quickly becoming ubiquitous, but the implications for social science research are not yet well understood. This paper asks whether LLMs can help us analyse large-N qualitative data from open-ended interviews, with an application to transcripts of interviews with Rohingya refugees in Cox's Bazaar, Bangladesh. We find that a great deal of caution is needed in using L… ▽ More Large Language Models (LLMs) are quickly becoming ubiquitous, but the implications for social science research are not yet well understood. This paper asks whether LLMs can help us analyse large-N qualitative data from open-ended interviews, with an application to transcripts of interviews with Rohingya refugees in Cox's Bazaar, Bangladesh. We find that a great deal of caution is needed in using LLMs to annotate text as there is a risk of introducing biases that can lead to misleading inferences. We here mean bias in the technical sense, that the errors that LLMs make in annotating interview transcripts are not random with respect to the characteristics of the interview subjects. Training simpler supervised models on high-quality human annotations with flexible coding leads to less measurement error and bias than LLM annotations. Therefore, given that some high quality annotations are necessary in order to asses whether an LLM introduces bias, we argue that it is probably preferable to train a bespoke model on these annotations than it is to use an LLM for annotation. △ Less

Submitted 5 October, 2023; v1 submitted 29 September, 2023; originally announced September 2023.

arXiv:2307.06883 [pdf, other]

Cyber Framework for Steering and Measurements Collection Over Instrument-Computing Ecosystems

Authors: Anees Al-Najjar, Nageswara S. V. Rao, Ramanan Sankaran, Helia Zandi, Debangshu Mukherjee, Maxim Ziatdinov, Craig Bridges

Abstract: We propose a framework to develop cyber solutions to support the remote steering of science instruments and measurements collection over instrument-computing ecosystems. It is based on provisioning separate data and control connections at the network level, and developing software modules consisting of Python wrappers for instrument commands and Pyro server-client codes that make them available ac… ▽ More We propose a framework to develop cyber solutions to support the remote steering of science instruments and measurements collection over instrument-computing ecosystems. It is based on provisioning separate data and control connections at the network level, and developing software modules consisting of Python wrappers for instrument commands and Pyro server-client codes that make them available across the ecosystem network. We demonstrate automated measurement transfers and remote steering operations in a microscopy use case for materials research over an ecosystem of Nion microscopes and computing platforms connected over site networks. The proposed framework is currently under further refinement and being adopted to science workflows with automated remote experiments steering for autonomous chemistry laboratories and smart energy grid simulations. △ Less

Submitted 12 July, 2023; originally announced July 2023.

Comments: Paper accepted for presentation at IEEE SMARTCOMP 2023

arXiv:2306.07527 [pdf, other]

doi 10.1145/3593013.3594115

Discrimination through Image Selection by Job Advertisers on Facebook

Authors: Varun Nagaraj Rao, Aleksandra Korolova

Abstract: Targeted advertising platforms are widely used by job advertisers to reach potential employees; thus issues of discrimination due to targeting that have surfaced have received widespread attention. Advertisers could misuse targeting tools to exclude people based on gender, race, location and other protected attributes from seeing their job ads. In response to legal actions, Facebook disabled the a… ▽ More Targeted advertising platforms are widely used by job advertisers to reach potential employees; thus issues of discrimination due to targeting that have surfaced have received widespread attention. Advertisers could misuse targeting tools to exclude people based on gender, race, location and other protected attributes from seeing their job ads. In response to legal actions, Facebook disabled the ability for explicit targeting based on many attributes for some ad categories, including employment. Although this is a step in the right direction, prior work has shown that discrimination can take place not just due to the explicit targeting tools of the platforms, but also due to the impact of the biased ad delivery algorithm. Thus, one must look at the potential for discrimination more broadly, and not merely through the lens of the explicit targeting tools. In this work, we propose and investigate the prevalence of a new means for discrimination in job advertising, that combines both targeting and delivery -- through the disproportionate representation or exclusion of people of certain demographics in job ad images. We use the Facebook Ad Library to demonstrate the prevalence of this practice through: (1) evidence of advertisers running many campaigns using ad images of people of only one perceived gender, (2) systematic analysis for gender representation in all current ad campaigns for truck drivers and nurses, (3) longitudinal analysis of ad campaign image use by gender and race for select advertisers. After establishing that the discrimination resulting from a selective choice of people in job ad images, combined with algorithmic amplification of skews by the ad delivery algorithm, is of immediate concern, we discuss approaches and challenges for addressing it. △ Less

Submitted 12 June, 2023; originally announced June 2023.

Comments: Published in FAccT 2023

arXiv:2306.04733 [pdf, other]

doi 10.1103/PhysRevX.13.041054

Epidemic spreading in group-structured populations

Authors: Siddharth Patwardhan, Varun K. Rao, Santo Fortunato, Filippo Radicchi

Abstract: Individuals involved in common group activities/settings -- e.g., college students that are enrolled in the same class and/or live in the same dorm -- are exposed to recurrent contacts of physical proximity. These contacts are known to mediate the spread of an infectious disease, however, it is not obvious how the properties of the spreading process are determined by the structure of and the inter… ▽ More Individuals involved in common group activities/settings -- e.g., college students that are enrolled in the same class and/or live in the same dorm -- are exposed to recurrent contacts of physical proximity. These contacts are known to mediate the spread of an infectious disease, however, it is not obvious how the properties of the spreading process are determined by the structure of and the interrelation among the group settings that are at the root of those recurrent interactions. Here, we show that reshaping the organization of groups within a population can be used as an effective strategy to decrease the severity of an epidemic. Specifically, we show that when group structures are sufficiently correlated -- e.g., the likelihood for two students living in the same dorm to attend the same class is sufficiently high -- outbreaks are longer but milder than for uncorrelated group structures. Also, we show that the effectiveness of interventions for disease containment increases as the correlation among group structures increases. We demonstrate the practical relevance of our findings by taking advantage of data about housing and attendance of students at the Indiana University campus in Bloomington. By appropriately optimizing the assignment of students to dorms based on their enrollment, we are able to observe a two- to five-fold reduction in the severity of simulated epidemic processes. △ Less

Submitted 20 December, 2023; v1 submitted 7 June, 2023; originally announced June 2023.

Comments: 10 pages, 4 figures + Supplemental Material

Journal ref: Phys. Rev. X 13, 041054 (2023)

arXiv:2305.19444 [pdf]

Pixelated Interactions: Exploring Pixel Art for Graphical Primitives on a Tactile Display

Authors: Tigmanshu Bhatnagar, Vikas Upadhyay, Anchal Sharma, P V Madhusudhan Rao, Mark Miodownik, Nicolai Marquardt, Catherine Holloway

Abstract: Two-dimensional pin array tactile displays enable access to tactile graphics that are important for the education of students with visual impairments. Due to their prohibitive cost, limited access, and limited research within HCI, the rules to design graphical primitives on these low-resolution tactile displays are unclear. In this paper, eight tactile readers with visual impairments qualitatively… ▽ More Two-dimensional pin array tactile displays enable access to tactile graphics that are important for the education of students with visual impairments. Due to their prohibitive cost, limited access, and limited research within HCI, the rules to design graphical primitives on these low-resolution tactile displays are unclear. In this paper, eight tactile readers with visual impairments qualitatively evaluate the implementation of Pixel Art to create tactile graphical primitives on a pin array display. Every pin of the pin array is assumed to be a pixel on a pixel grid. Our findings suggest that Pixel Art tactile graphics on a pin array are clear and comprehensible to tactile readers, positively confirming its use to design basic tactile shapes and line segments. The guidelines provide a framework to create tactile media which implies that the guidelines can be used to downsize basic shapes for refreshable pin-array displays. △ Less

Submitted 30 May, 2023; originally announced May 2023.

Comments: 25 pages, 10 figures. To appear in DIS'23 Designing Interactive Systems Conference, July 10 to 14, 2023, Pittsburgh, PA, USA

arXiv:2303.02043 [pdf, other]

An Integrated Real-time UAV Trajectory Optimization with Potential Field Approach for Dynamic Collision Avoidance

Authors: D. M. K. K. Venkateswara Rao, Hamed Habibi, Jose Luis Sanchez-Lopez, Holger Voos

Abstract: This paper presents an integrated approach that combines trajectory optimization and Artificial Potential Field (APF) method for real-time optimal Unmanned Aerial Vehicle (UAV) trajectory planning and dynamic collision avoidance. A minimum-time trajectory optimization problem is formulated with initial and final positions as boundary conditions and collision avoidance as constraints. It is transcr… ▽ More This paper presents an integrated approach that combines trajectory optimization and Artificial Potential Field (APF) method for real-time optimal Unmanned Aerial Vehicle (UAV) trajectory planning and dynamic collision avoidance. A minimum-time trajectory optimization problem is formulated with initial and final positions as boundary conditions and collision avoidance as constraints. It is transcribed into a nonlinear programming problem using Chebyshev pseudospectral method. The state and control histories are approximated by using Lagrange polynomials and the collocation points are used to satisfy constraints. A novel sigmoid-type collision avoidance constraint is proposed to overcome the drawbacks of Lagrange polynomial approximation in pseudospectral methods that only guarantees inequality constraint satisfaction only at nodal points. Automatic differentiation of cost function and constraints is used to quickly determine their gradient and Jacobian, respectively. An APF method is used to update the optimal control inputs for guaranteeing collision avoidance. The trajectory optimization and APF method are implemented in a closed-loop fashion continuously, but in parallel at moderate and high frequencies, respectively. The initial guess for the optimization is provided based on the previous solution. The proposed approach is tested and validated through indoor experiments. △ Less

Submitted 3 March, 2023; originally announced March 2023.

arXiv:2211.07092 [pdf, ps, other]

Offline Estimation of Controlled Markov Chains: Minimaxity and Sample Complexity

Authors: Imon Banerjee, Harsha Honnappa, Vinayak Rao

Abstract: In this work, we study a natural nonparametric estimator of the transition probability matrices of a finite controlled Markov chain. We consider an offline setting with a fixed dataset, collected using a so-called logging policy. We develop sample complexity bounds for the estimator and establish conditions for minimaxity. Our statistical bounds depend on the logging policy through its mixing prop… ▽ More In this work, we study a natural nonparametric estimator of the transition probability matrices of a finite controlled Markov chain. We consider an offline setting with a fixed dataset, collected using a so-called logging policy. We develop sample complexity bounds for the estimator and establish conditions for minimaxity. Our statistical bounds depend on the logging policy through its mixing properties. We show that achieving a particular statistical risk bound involves a subtle and interesting trade-off between the strength of the mixing properties and the number of samples. We demonstrate the validity of our results under various examples, such as ergodic Markov chains, weakly ergodic inhomogeneous Markov chains, and controlled Markov chains with non-stationary Markov, episodic, and greedy controls. Lastly, we use these sample complexity bounds to establish concomitant ones for offline evaluation of stationary Markov control policies. △ Less

Submitted 26 January, 2024; v1 submitted 13 November, 2022; originally announced November 2022.

Comments: 71 pages, 23 main

arXiv:2210.09791 [pdf, other]

Enabling Autonomous Electron Microscopy for Networked Computation and Steering

Authors: Anees Al-Najjar, Nageswara S. V. Rao, Ramanan Sankaran, Maxim Ziatdinov, Debangshu Mukherjee, Olga Ovchinnikova, Kevin Roccapriore, Andrew R. Lupini, Sergei V. Kalinin

Abstract: Advanced electron microscopy workflows require an ecosystem of microscope instruments and computing systems possibly located at different sites to conduct remotely steered and automated experiments. Current workflow executions involve manual operations for steering and measurement tasks, which are typically performed from control workstations co-located with microscopes; consequently, their operat… ▽ More Advanced electron microscopy workflows require an ecosystem of microscope instruments and computing systems possibly located at different sites to conduct remotely steered and automated experiments. Current workflow executions involve manual operations for steering and measurement tasks, which are typically performed from control workstations co-located with microscopes; consequently, their operational tempo and effectiveness are limited. We propose an approach based on separate data and control channels for such an ecosystem of Scanning Transmission Electron Microscopes (STEM) and computing systems, for which no general solutions presently exist, unlike the neutron and light source instruments. We demonstrate automated measurement transfers and remote steering of Nion STEM physical instruments over site networks. We propose a Virtual Infrastructure Twin (VIT) of this ecosystem, which is used to develop and test our steering software modules without requiring access to the physical instrument infrastructure. Additionally, we develop a VIT for a multiple laboratory scenario, which illustrates the applicability of this approach to ecosystems connected over wide-area networks, for the development and testing of software modules and their later field deployment. △ Less

Submitted 18 October, 2022; originally announced October 2022.

Comments: 11 pages, 16 figures, accepted at IEEE eScience 2022 conference

arXiv:2207.13310 [pdf, other]

doi 10.1109/PESGM48719.2022.9916982

Learning the Evolution of Correlated Stochastic Power System Dynamics

Authors: Tyler E. Maltba, Vishwas Rao, Daniel Adrian Maldonado

Abstract: A machine learning technique is proposed for quantifying uncertainty in power system dynamics with spatiotemporally correlated stochastic forcing. We learn one-dimensional linear partial differential equations for the probability density functions of real-valued quantities of interest. The method is suitable for high-dimensional systems and helps to alleviate the curse of dimensionality. A machine learning technique is proposed for quantifying uncertainty in power system dynamics with spatiotemporally correlated stochastic forcing. We learn one-dimensional linear partial differential equations for the probability density functions of real-valued quantities of interest. The method is suitable for high-dimensional systems and helps to alleviate the curse of dimensionality. △ Less

Submitted 27 July, 2022; originally announced July 2022.

Comments: 5 pages, 2 figures, Accepted to 2022 IEEE PES GM

Journal ref: IEEE Power & Energy Society General Meeting (2022) 01-05

arXiv:2206.12980 [pdf]

Detecting Schizophrenia with 3D Structural Brain MRI Using Deep Learning

Authors: Junhao Zhang, Vishwanatha M. Rao, Ye Tian, Yanting Yang, Nicolas Acosta, Zihan Wan, Pin-Yu Lee, Chloe Zhang, Lawrence S. Kegeles, Scott A. Small, Jia Guo

Abstract: Schizophrenia is a chronic neuropsychiatric disorder that causes distinct structural alterations within the brain. We hypothesize that deep learning applied to a structural neuroimaging dataset could detect disease-related alteration and improve classification and diagnostic accuracy. We tested this hypothesis using a single, widely available, and conventional T1-weighted MRI scan, from which we e… ▽ More Schizophrenia is a chronic neuropsychiatric disorder that causes distinct structural alterations within the brain. We hypothesize that deep learning applied to a structural neuroimaging dataset could detect disease-related alteration and improve classification and diagnostic accuracy. We tested this hypothesis using a single, widely available, and conventional T1-weighted MRI scan, from which we extracted the 3D whole-brain structure using standard post-processing methods. A deep learning model was then developed, optimized, and evaluated on three open datasets with T1-weighted MRI scans of patients with schizophrenia. Our proposed model outperformed the benchmark model, which was also trained with structural MR images using a 3D CNN architecture. Our model is capable of almost perfectly (area under the ROC curve = 0.987) distinguishing schizophrenia patients from healthy controls on unseen structural MRI scans. Regional analysis localized subcortical regions and ventricles as the most predictive brain regions. Subcortical structures serve a pivotal role in cognitive, affective, and social functions in humans, and structural abnormalities of these regions have been associated with schizophrenia. Our finding corroborates that schizophrenia is associated with widespread alterations in subcortical brain structure and the subcortical structural information provides prominent features in diagnostic classification. Together, these results further demonstrate the potential of deep learning to improve schizophrenia diagnosis and identify its structural neuroimaging signatures from a single, standard T1-weighted brain MRI. △ Less

Submitted 7 July, 2022; v1 submitted 26 June, 2022; originally announced June 2022.

Comments: 13 pages, 6 figures

arXiv:2202.11820 [pdf]

Nowcasting the Financial Time Series with Streaming Data Analytics under Apache Spark

Authors: Mohammad Arafat Ali Khan, Chandra Bhushan, Vadlamani Ravi, Vangala Sarveswara Rao, Shiva Shankar Orsu

Abstract: This paper proposes nowcasting of high-frequency financial datasets in real-time with a 5-minute interval using the streaming analytics feature of Apache Spark. The proposed 2 stage method consists of modelling chaos in the first stage and then using a sliding window approach for training with machine learning algorithms namely Lasso Regression, Ridge Regression, Generalised Linear Model, Gradient… ▽ More This paper proposes nowcasting of high-frequency financial datasets in real-time with a 5-minute interval using the streaming analytics feature of Apache Spark. The proposed 2 stage method consists of modelling chaos in the first stage and then using a sliding window approach for training with machine learning algorithms namely Lasso Regression, Ridge Regression, Generalised Linear Model, Gradient Boosting Tree and Random Forest available in the MLLib of Apache Spark in the second stage. For testing the effectiveness of the proposed methodology, 3 different datasets, of which two are stock markets namely National Stock Exchange & Bombay Stock Exchange, and finally One Bitcoin-INR conversion dataset. For evaluating the proposed methodology, we used metrics such as Symmetric Mean Absolute Percentage Error, Directional Symmetry, and Theil U Coefficient. We tested the significance of each pair of models using the Diebold Mariano (DM) test. △ Less

Submitted 23 February, 2022; originally announced February 2022.

Comments: 26 pages; 7 Tables and 11 Figures

MSC Class: 37M10; 62M10; 91B84 ACM Class: I.2.11; J.4

arXiv:2201.08741 [pdf]

doi 10.3389/fnimg.2022.1023481

Improving Across-Dataset Brain Tissue Segmentation Using Transformer

Authors: Vishwanatha M. Rao, Zihan Wan, Soroush Arabshahi, David J. Ma, Pin-Yu Lee, Ye Tian, Xuzhe Zhang, Andrew F. Laine, Jia Guo

Abstract: Brain tissue segmentation has demonstrated great utility in quantifying MRI data through Voxel-Based Morphometry and highlighting subtle structural changes associated with various conditions within the brain. However, manual segmentation is highly labor-intensive, and automated approaches have struggled due to properties inherent to MRI acquisition, leaving a great need for an effective segmentati… ▽ More Brain tissue segmentation has demonstrated great utility in quantifying MRI data through Voxel-Based Morphometry and highlighting subtle structural changes associated with various conditions within the brain. However, manual segmentation is highly labor-intensive, and automated approaches have struggled due to properties inherent to MRI acquisition, leaving a great need for an effective segmentation tool. Despite the recent success of deep convolutional neural networks (CNNs) for brain tissue segmentation, many such solutions do not generalize well to new datasets, which is critical for a reliable solution. Transformers have demonstrated success in natural image segmentation and have recently been applied to 3D medical image segmentation tasks due to their ability to capture long-distance relationships in the input where the local receptive fields of CNNs struggle. This study introduces a novel CNN-Transformer hybrid architecture designed for brain tissue segmentation. We validate our model's performance across four multi-site T1w MRI datasets, covering different vendors, field strengths, scan parameters, time points, and neuropsychiatric conditions. In all situations, our model achieved the greatest generality and reliability. Out method is inherently robust and can serve as a valuable tool for brain-related T1w MRI studies. The code for the TABS network is available at: https://github.com/raovish6/TABS. △ Less

Submitted 31 January, 2023; v1 submitted 21 January, 2022; originally announced January 2022.

ACM Class: I.4.6

arXiv:2112.11547 [pdf, other]

Decompose the Sounds and Pixels, Recompose the Events

Authors: Varshanth R. Rao, Md Ibrahim Khalil, Haoda Li, Peng Dai, Juwei Lu

Abstract: In this paper, we propose a framework centering around a novel architecture called the Event Decomposition Recomposition Network (EDRNet) to tackle the Audio-Visual Event (AVE) localization problem in the supervised and weakly supervised settings. AVEs in the real world exhibit common unravelling patterns (termed as Event Progress Checkpoints (EPC)), which humans can perceive through the cooperati… ▽ More In this paper, we propose a framework centering around a novel architecture called the Event Decomposition Recomposition Network (EDRNet) to tackle the Audio-Visual Event (AVE) localization problem in the supervised and weakly supervised settings. AVEs in the real world exhibit common unravelling patterns (termed as Event Progress Checkpoints (EPC)), which humans can perceive through the cooperation of their auditory and visual senses. Unlike earlier methods which attempt to recognize entire event sequences, the EDRNet models EPCs and inter-EPC relationships using stacked temporal convolutions. Based on the postulation that EPC representations are theoretically consistent for an event category, we introduce the State Machine Based Video Fusion, a novel augmentation technique that blends source videos using different EPC template sequences. Additionally, we design a new loss function called the Land-Shore-Sea loss to compactify continuous foreground and background representations. Lastly, to alleviate the issue of confusing events during weak supervision, we propose a prediction stabilization method called Bag to Instance Label Correction. Experiments on the AVE dataset show that our collective framework outperforms the state-of-the-art by a sizable margin. △ Less

Submitted 21 December, 2021; originally announced December 2021.

Comments: Accepted at AAAI 2022

arXiv:2112.09752 [pdf, other]

Set Twister for Single-hop Node Classification

Authors: Yangze Zhou, Vinayak Rao, Bruno Ribeiro

Abstract: Node classification is a central task in relational learning, with the current state-of-the-art hinging on two key principles: (i) predictions are permutation-invariant to the ordering of a node's neighbors, and (ii) predictions are a function of the node's $r$-hop neighborhood topology and attributes, $r \geq 2$. Both graph neural networks and collective inference methods (e.g., belief propagatio… ▽ More Node classification is a central task in relational learning, with the current state-of-the-art hinging on two key principles: (i) predictions are permutation-invariant to the ordering of a node's neighbors, and (ii) predictions are a function of the node's $r$-hop neighborhood topology and attributes, $r \geq 2$. Both graph neural networks and collective inference methods (e.g., belief propagation) rely on information from up to $r$-hops away. In this work, we study if the use of more powerful permutation-invariant functions can sometimes avoid the need for classifiers to collect information beyond $1$-hop. Towards this, we introduce a new architecture, the Set Twister, which generalizes DeepSets (Zaheer et al., 2017), a simple and widely-used permutation-invariant representation. Set Twister theoretically increases expressiveness of DeepSets, allowing it to capture higher-order dependencies, while keeping its simplicity and low computational cost. Empirically, we see accuracy improvements of Set Twister over DeepSets as well as a variety of graph neural networks and collective inference schemes in several tasks, while showcasing its implementation simplicity and computational efficiency. △ Less

Submitted 17 December, 2021; originally announced December 2021.

Comments: Accepted for presentation at the 2nd GCLR workshop in conjunction with AAAI 2022

arXiv:2111.07552 [pdf, other]

Dynamic Placement of Rapidly Deployable Mobile Sensor Robots Using Machine Learning and Expected Value of Information

Authors: Alice Agogino, Hae Young Jang, Vivek Rao, Ritik Batra, Felicity Liao, Rohan Sood, Irving Fang, R. Lily Hu, Emerson Shoichet-Bartus, John Matranga

Abstract: Although the Industrial Internet of Things has increased the number of sensors permanently installed in industrial plants, there will be gaps in coverage due to broken sensors or sparse density in very large plants, such as in the petrochemical industry. Modern emergency response operations are beginning to use Small Unmanned Aerial Systems (sUAS) that have the ability to drop sensor robots to pre… ▽ More Although the Industrial Internet of Things has increased the number of sensors permanently installed in industrial plants, there will be gaps in coverage due to broken sensors or sparse density in very large plants, such as in the petrochemical industry. Modern emergency response operations are beginning to use Small Unmanned Aerial Systems (sUAS) that have the ability to drop sensor robots to precise locations. sUAS can provide longer-term persistent monitoring that aerial drones are unable to provide. Despite the relatively low cost of these assets, the choice of which robotic sensing systems to deploy to which part of an industrial process in a complex plant environment during emergency response remains challenging. This paper describes a framework for optimizing the deployment of emergency sensors as a preliminary step towards realizing the responsiveness of robots in disaster circumstances. AI techniques (Long short-term memory, 1-dimensional convolutional neural network, logistic regression, and random forest) identify regions where sensors would be most valued without requiring humans to enter the potentially dangerous area. In the case study described, the cost function for optimization considers costs of false-positive and false-negative errors. Decisions on mitigation include implementing repairs or shutting down the plant. The Expected Value of Information (EVI) is used to identify the most valuable type and location of physical sensors to be deployed to increase the decision-analytic value of a sensor network. This method is applied to a case study using the Tennessee Eastman process data set of a chemical plant, and we discuss implications of our findings for operation, distribution, and decision-making of sensors in plant emergency and resilience scenarios. △ Less

Submitted 15 November, 2021; originally announced November 2021.

Comments: 14 pages, 11 figures, IMECE2021

arXiv:2111.03808 [pdf]

Contextual Unsupervised Outlier Detection in Sequences

Authors: Mohamed A. Zahran, Leonardo Teixeira, Vinayak Rao, Bruno Ribeiro

Abstract: This work proposes an unsupervised learning framework for trajectory (sequence) outlier detection that combines ranking tests with user sequence models. The overall framework identifies sequence outliers at a desired false positive rate (FPR), in an otherwise parameter-free manner. We evaluate our methodology on a collection of real and simulated datasets based on user actions at the websites last… ▽ More This work proposes an unsupervised learning framework for trajectory (sequence) outlier detection that combines ranking tests with user sequence models. The overall framework identifies sequence outliers at a desired false positive rate (FPR), in an otherwise parameter-free manner. We evaluate our methodology on a collection of real and simulated datasets based on user actions at the websites last.fm and msnbc.com, where we know ground truth, and demonstrate improved accuracy over existing approaches. We also apply our approach to a large real-world dataset of Pinterest and Facebook users, where we find that users tend to re-share Pinterest posts of Facebook friends significantly more than other types of users, pointing to a potential influence of Facebook friendship on sharing behavior on Pinterest. △ Less

Submitted 6 November, 2021; originally announced November 2021.

Comments: 11 pages

arXiv:2111.01634 [pdf, other]

Towards Enabling High-Five Over WiFi

Authors: Vineet Gokhale, Mohamad Eid, Kees Kroep, R. Venkatesha Prasad, Vijay Rao

Abstract: The next frontier for immersive applications is enabling sentience over the Internet. Tactile Internet (TI) envisages transporting skills by providing Ultra-Low Latency (ULL) communications for transporting touch senses. In this work, we focus our study on the first/last mile communication, where the future generation WiFi-7 is pitched as the front-runner for ULL applications. We discuss a few can… ▽ More The next frontier for immersive applications is enabling sentience over the Internet. Tactile Internet (TI) envisages transporting skills by providing Ultra-Low Latency (ULL) communications for transporting touch senses. In this work, we focus our study on the first/last mile communication, where the future generation WiFi-7 is pitched as the front-runner for ULL applications. We discuss a few candidate features of WiFi-7 and highlight its major pitfalls with respect to ULL communication. Further, through a specific implementation of WiFi-7 (vanilla WiFi-7) in our custom simulator, we demonstrate the impact of one of the pitfalls - standard practice of using jitter buffer in conjunction with frame aggregation - on TI communication. To circumvent this, we propose Non-Buffered Scheme (NoBuS) - a simple MAC layer enhancement for enabling TI applications on WiFi-7. NoBuS trades off packet loss for latency enabling swift synchronization between the master and controlled domains. Our findings reveal that employing NoBuS yields a significant improvement in RMSE of TI signals. Further, we show that the worst-case WiFi latency with NoBuS is 3.72 ms - an order of magnitude lower than vanilla WiFi-7 even under highly congested network conditions. △ Less

Submitted 2 November, 2021; originally announced November 2021.

arXiv:2110.11489 [pdf, ps, other]

Supporting Massive DLRM Inference Through Software Defined Memory

Authors: Ehsan K. Ardestani, Changkyu Kim, Seung Jae Lee, Luoshang Pan, Valmiki Rampersad, Jens Axboe, Banit Agrawal, Fuxun Yu, Ansha Yu, Trung Le, Hector Yuen, Shishir Juluri, Akshat Nanda, Manoj Wodekar, Dheevatsa Mudigere, Krishnakumar Nair, Maxim Naumov, Chris Peterson, Mikhail Smelyanskiy, Vijay Rao

Abstract: Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model size soon to be in terabytes range, leveraging Storage ClassMemory (SCM) for inference enables lower power consumption and cost. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents differen… ▽ More Deep Learning Recommendation Models (DLRM) are widespread, account for a considerable data center footprint, and grow by more than 1.5x per year. With model size soon to be in terabytes range, leveraging Storage ClassMemory (SCM) for inference enables lower power consumption and cost. This paper evaluates the major challenges in extending the memory hierarchy to SCM for DLRM, and presents different techniques to improve performance through a Software Defined Memory. We show how underlying technologies such as Nand Flash and 3DXP differentiate, and relate to real world scenarios, enabling from 5% to 29% power savings. △ Less

Submitted 8 November, 2021; v1 submitted 21 October, 2021; originally announced October 2021.

Comments: 14 pages, 5 figures

arXiv:2108.10971 [pdf]

An Effective Pixel-Wise Approach for Skin Colour Segmentation Using Pixel Neighbourhood Technique

Authors: Tejas Dastane, Varun Rao, Kartik Shenoy, Devendra Vyavaharkar

Abstract: This paper presents a novel technique for skin colour segmentation that overcomes the limitations faced by existing techniques such as Colour Range Thresholding. Skin colour segmentation is affected by the varied skin colours and surrounding lighting conditions, leading to poorskin segmentation for many techniques. We propose a new two stage Pixel Neighbourhood technique that classifies any pixel… ▽ More This paper presents a novel technique for skin colour segmentation that overcomes the limitations faced by existing techniques such as Colour Range Thresholding. Skin colour segmentation is affected by the varied skin colours and surrounding lighting conditions, leading to poorskin segmentation for many techniques. We propose a new two stage Pixel Neighbourhood technique that classifies any pixel as skin or non-skin based on its neighbourhood pixels. The first step calculates the probability of each pixel being skin by passing HSV values of the pixel to a Deep Neural Network model. In the next step, it calculates the likeliness of pixel being skin using these probabilities of neighbouring pixels. This technique performs skin colour segmentation better than the existing techniques. △ Less

Submitted 24 August, 2021; originally announced August 2021.

Comments: 5 pages

Journal ref: International Journal on Recent and Innovation Trends in Computing and Communication 2018, Volume: 6, Issue: 3, pp. 182-186

arXiv:2108.10970 [pdf]

Real-time Indian Sign Language (ISL) Recognition

Authors: Kartik Shenoy, Tejas Dastane, Varun Rao, Devendra Vyavaharkar

Abstract: This paper presents a system which can recognise hand poses & gestures from the Indian Sign Language (ISL) in real-time using grid-based features. This system attempts to bridge the communication gap between the hearing and speech impaired and the rest of the society. The existing solutions either provide relatively low accuracy or do not work in real-time. This system provides good results on bot… ▽ More This paper presents a system which can recognise hand poses & gestures from the Indian Sign Language (ISL) in real-time using grid-based features. This system attempts to bridge the communication gap between the hearing and speech impaired and the rest of the society. The existing solutions either provide relatively low accuracy or do not work in real-time. This system provides good results on both the parameters. It can identify 33 hand poses and some gestures from the ISL. Sign Language is captured from a smartphone camera and its frames are transmitted to a remote server for processing. The use of any external hardware (such as gloves or the Microsoft Kinect sensor) is avoided, making it user-friendly. Techniques such as Face detection, Object stabilisation and Skin Colour Segmentation are used for hand detection and tracking. The image is further subjected to a Grid-based Feature Extraction technique which represents the hand's pose in the form of a Feature Vector. Hand poses are then classified using the k-Nearest Neighbours algorithm. On the other hand, for gesture classification, the motion and intermediate hand poses observation sequences are fed to Hidden Markov Model chains corresponding to the 12 pre-selected gestures defined in ISL. Using this methodology, the system is able to achieve an accuracy of 99.7% for static hand poses, and an accuracy of 97.23% for gesture recognition. △ Less

Submitted 24 August, 2021; originally announced August 2021.

Comments: 9 pages

Journal ref: 9th International Conference on Communication and Network Technology 2018

arXiv:2108.10168 [pdf]

CGEMs: A Metric Model for Automatic Code Generation using GPT-3

Authors: Aishwarya Narasimhan, Krishna Prasad Agara Venkatesha Rao, Veena M B

Abstract: Today, AI technology is showing its strengths in almost every industry and walks of life. From text generation, text summarization, chatbots, NLP is being used widely. One such paradigm is automatic code generation. An AI could be generating anything; hence the output space is unconstrained. A self-driving car is driven for 100 million miles to validate its safety, but tests cannot be written to m… ▽ More Today, AI technology is showing its strengths in almost every industry and walks of life. From text generation, text summarization, chatbots, NLP is being used widely. One such paradigm is automatic code generation. An AI could be generating anything; hence the output space is unconstrained. A self-driving car is driven for 100 million miles to validate its safety, but tests cannot be written to monitor and cover an unconstrained space. One of the solutions to validate AI-generated content is to constrain the problem and convert it from abstract to realistic, and this can be accomplished by either validating the unconstrained algorithm using theoretical proofs or by using Monte-Carlo simulation methods. In this case, we use the latter approach to test/validate a statistically significant number of samples. This hypothesis of validating the AI-generated code is the main motive of this work and to know if AI-generated code is reliable, a metric model CGEMs is proposed. This is an extremely challenging task as programs can have different logic with different naming conventions, but the metrics must capture the structure and logic of the program. This is similar to the importance grammar carries in AI-based text generation, Q&A, translations, etc. The various metrics that are garnered in this work to support the evaluation of generated code are as follows: Compilation, NL description to logic conversion, number of edits needed, some of the commonly used static-code metrics and NLP metrics. These metrics are applied to 80 codes generated using OpenAI's GPT-3. Post which a Neural network is designed for binary classification (acceptable/not acceptable quality of the generated code). The inputs to this network are the values of the features obtained from the metrics. The model achieves a classification accuracy of 76.92% and an F1 score of 55.56%. XAI is augmented for model interpretability. △ Less

Submitted 23 August, 2021; originally announced August 2021.

Comments: 11 pages, 6 figures, 2 tables

arXiv:2108.00965 [pdf, other]

Privacy-Aware Rejection Sampling

Authors: Jordan Awan, Vinayak Rao

Abstract: Differential privacy (DP) offers strong theoretical privacy guarantees, but implementations of DP mechanisms may be vulnerable to side-channel attacks, such as timing attacks. When sampling methods such as MCMC or rejection sampling are used to implement a mechanism, the runtime can leak private information. We characterize the additional privacy cost due to the runtime of a rejection sampler in t… ▽ More Differential privacy (DP) offers strong theoretical privacy guarantees, but implementations of DP mechanisms may be vulnerable to side-channel attacks, such as timing attacks. When sampling methods such as MCMC or rejection sampling are used to implement a mechanism, the runtime can leak private information. We characterize the additional privacy cost due to the runtime of a rejection sampler in terms of both $(ε,δ)$-DP as well as $f$-DP. We also show that unless the acceptance probability is constant across databases, the runtime of a rejection sampler does not satisfy $ε$-DP for any $ε$. We show that there is a similar breakdown in privacy with adaptive rejection samplers. We propose three modifications to the rejection sampling algorithm, with varying assumptions, to protect against timing attacks by making the runtime independent of the data. The modification with the weakest assumptions is an approximate sampler, introducing a small increase in the privacy cost, whereas the other modifications give perfect samplers. We also use our techniques to develop an adaptive rejection sampler for log-Hölder densities, which also has data-independent runtime. We give several examples of DP mechanisms that fit the assumptions of our methods and can thus be implemented using our samplers. △ Less

Submitted 29 September, 2022; v1 submitted 2 August, 2021; originally announced August 2021.

Comments: 25 pages + references, 4 figures

arXiv:2107.04140 [pdf, other]

First-Generation Inference Accelerator Deployment at Facebook

Authors: Michael Anderson, Benny Chen, Stephen Chen, Summer Deng, Jordan Fix, Michael Gschwind, Aravind Kalaiah, Changkyu Kim, Jaewon Lee, Jason Liang, Haixin Liu, Yinghai Lu, Jack Montgomery, Arun Moorthy, Satish Nadathur, Sam Naghshineh, Avinash Nayak, Jongsoo Park, Chris Petersen, Martin Schatz, Narayanan Sundaram, Bangsheng Tang, Peter Tang, Amy Yang, Jiecao Yu , et al. (90 additional authors not shown)

Abstract: In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the in… ▽ More In this paper, we provide a deep dive into the deployment of inference accelerators at Facebook. Many of our ML workloads have unique characteristics, such as sparse memory accesses, large model sizes, as well as high compute, memory and network bandwidth requirements. We co-designed a high-performance, energy-efficient inference accelerator platform based on these requirements. We describe the inference accelerator platform ecosystem we developed and deployed at Facebook: both hardware, through Open Compute Platform (OCP), and software framework and tooling, through Pytorch/Caffe2/Glow. A characteristic of this ecosystem from the start is its openness to enable a variety of AI accelerators from different vendors. This platform, with six low-power accelerator cards alongside a single-socket host CPU, allows us to serve models of high complexity that cannot be easily or efficiently run on CPUs. We describe various performance optimizations, at both platform and accelerator level, which enables this platform to serve production traffic at Facebook. We also share deployment challenges, lessons learned during performance optimization, as well as provide guidance for future inference hardware co-design. △ Less

Submitted 4 August, 2021; v1 submitted 8 July, 2021; originally announced July 2021.

arXiv:2106.12199 [pdf, other]

doi 10.1137/21M1430005

Bayesian Joint Chance Constrained Optimization: Approximations and Statistical Consistency

Authors: Prateek Jaiswal, Harsha Honnappa, Vinayak A. Rao

Abstract: This paper considers data-driven chance-constrained stochastic optimization problems in a Bayesian framework. Bayesian posteriors afford a principled mechanism to incorporate data and prior knowledge into stochastic optimization problems. However, the computation of Bayesian posteriors is typically an intractable problem, and has spawned a large literature on approximate Bayesian computation. Here… ▽ More This paper considers data-driven chance-constrained stochastic optimization problems in a Bayesian framework. Bayesian posteriors afford a principled mechanism to incorporate data and prior knowledge into stochastic optimization problems. However, the computation of Bayesian posteriors is typically an intractable problem, and has spawned a large literature on approximate Bayesian computation. Here, in the context of chance-constrained optimization, we focus on the question of statistical consistency (in an appropriate sense) of the optimal value, computed using an approximate posterior distribution. To this end, we rigorously prove a frequentist consistency result demonstrating the convergence of the optimal value to the optimal value of a fixed, parameterized constrained optimization problem. We augment this by also establishing a probabilistic rate of convergence of the optimal value. We also prove the convex feasibility of the approximate Bayesian stochastic optimization problem. Finally, we demonstrate the utility of our approach on an optimal staffing problem for an M/M/c queueing model. △ Less

Submitted 30 September, 2022; v1 submitted 23 June, 2021; originally announced June 2021.

arXiv:2106.02357 [pdf, ps, other]

Adiabatic Quantum Feature Selection for Sparse Linear Regression

Authors: Surya Sai Teja Desu, P. K. Srijith, M. V. Panduranga Rao, Naveen Sivadasan

Abstract: Linear regression is a popular machine learning approach to learn and predict real valued outputs or dependent variables from independent variables or features. In many real world problems, its beneficial to perform sparse linear regression to identify important features helpful in predicting the dependent variable. It not only helps in getting interpretable results but also avoids overfitting whe… ▽ More Linear regression is a popular machine learning approach to learn and predict real valued outputs or dependent variables from independent variables or features. In many real world problems, its beneficial to perform sparse linear regression to identify important features helpful in predicting the dependent variable. It not only helps in getting interpretable results but also avoids overfitting when the number of features is large, and the amount of data is small. The most natural way to achieve this is by using `best subset selection' which penalizes non-zero model parameters by adding $\ell_0$ norm over parameters to the least squares loss. However, this makes the objective function non-convex and intractable even for a small number of features. This paper aims to address the intractability of sparse linear regression with $\ell_0$ norm using adiabatic quantum computing, a quantum computing paradigm that is particularly useful for solving optimization problems faster. We formulate the $\ell_0$ optimization problem as a Quadratic Unconstrained Binary Optimization (QUBO) problem and solve it using the D-Wave adiabatic quantum computer. We study and compare the quality of QUBO solution on synthetic and real world datasets. The results demonstrate the effectiveness of the proposed adiabatic quantum computing approach in finding the optimal solution. The QUBO solution matches the optimal solution for a wide range of sparsity penalty values across the datasets. △ Less

Submitted 4 June, 2021; originally announced June 2021.

Comments: 8 pages, 2 tables

arXiv:2105.02626 [pdf, other]

A First Look: Towards Explainable TextVQA Models via Visual and Textual Explanations

Authors: Varun Nagaraj Rao, Xingjian Zhen, Karen Hovsepian, Mingwei Shen

Abstract: Explainable deep learning models are advantageous in many situations. Prior work mostly provide unimodal explanations through post-hoc approaches not part of the original system design. Explanation mechanisms also ignore useful textual information present in images. In this paper, we propose MTXNet, an end-to-end trainable multimodal architecture to generate multimodal explanations, which focuses… ▽ More Explainable deep learning models are advantageous in many situations. Prior work mostly provide unimodal explanations through post-hoc approaches not part of the original system design. Explanation mechanisms also ignore useful textual information present in images. In this paper, we propose MTXNet, an end-to-end trainable multimodal architecture to generate multimodal explanations, which focuses on the text in the image. We curate a novel dataset TextVQA-X, containing ground truth visual and multi-reference textual explanations that can be leveraged during both training and evaluation. We then quantitatively show that training with multimodal explanations complements model performance and surpasses unimodal baselines by up to 7% in CIDEr scores and 2% in IoU. More importantly, we demonstrate that the multimodal explanations are consistent with human interpretations, help justify the models' decision, and provide useful insights to help diagnose an incorrect prediction. Finally, we describe a real-world e-commerce application for using the generated multimodal explanations. △ Less

Submitted 28 April, 2021; originally announced May 2021.

Comments: This paper is done when Xingjian was an intern in Amazon PARS group, summer 2020. This paper is accepted by NAACL-MAI-Workshop, 2021

arXiv:2104.14538 [pdf, other]

Distributed Multigrid Neural Solvers on Megavoxel Domains

Authors: Aditya Balu, Sergio Botelho, Biswajit Khara, Vinay Rao, Chinmay Hegde, Soumik Sarkar, Santi Adavani, Adarsh Krishnamurthy, Baskar Ganapathysubramanian

Abstract: We consider the distributed training of large-scale neural networks that serve as PDE solvers producing full field outputs. We specifically consider neural solvers for the generalized 3D Poisson equation over megavoxel domains. A scalable framework is presented that integrates two distinct advances. First, we accelerate training a large model via a method analogous to the multigrid technique used… ▽ More We consider the distributed training of large-scale neural networks that serve as PDE solvers producing full field outputs. We specifically consider neural solvers for the generalized 3D Poisson equation over megavoxel domains. A scalable framework is presented that integrates two distinct advances. First, we accelerate training a large model via a method analogous to the multigrid technique used in numerical linear algebra. Here, the network is trained using a hierarchy of increasing resolution inputs in sequence, analogous to the 'V', 'W', 'F', and 'Half-V' cycles used in multigrid approaches. In conjunction with the multi-grid approach, we implement a distributed deep learning framework which significantly reduces the time to solve. We show the scalability of this approach on both GPU (Azure VMs on Cloud) and CPU clusters (PSC Bridges2). This approach is deployed to train a generalized 3D Poisson solver that scales well to predict output full-field solutions up to the resolution of 512x512x512 for a high dimensional family of inputs. △ Less

Submitted 29 April, 2021; originally announced April 2021.

arXiv:2104.05158 [pdf, other]

Software-Hardware Co-design for Fast and Scalable Training of Deep Learning Recommendation Models

Authors: Dheevatsa Mudigere, Yuchen Hao, Jianyu Huang, Zhihao Jia, Andrew Tulloch, Srinivas Sridharan, Xing Liu, Mustafa Ozdal, Jade Nie, Jongsoo Park, Liang Luo, Jie Amy Yang, Leon Gao, Dmytro Ivchenko, Aarti Basant, Yuxi Hu, Jiyan Yang, Ehsan K. Ardestani, Xiaodong Wang, Rakesh Komuravelli, Ching-Hsiang Chu, Serhat Yilmaz, Huayu Li, Jiyuan Qian, Zhuobo Feng , et al. (28 additional authors not shown)

Abstract: Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pa… ▽ More Deep learning recommendation models (DLRMs) are used across many business-critical services at Facebook and are the single largest AI application in terms of infrastructure demand in its data-centers. In this paper we discuss the SW/HW co-designed solution for high-performance distributed training of large-scale DLRMs. We introduce a high-performance scalable software stack based on PyTorch and pair it with the new evolution of Zion platform, namely ZionEX. We demonstrate the capability to train very large DLRMs with up to 12 Trillion parameters and show that we can attain 40X speedup in terms of time to solution over previous systems. We achieve this by (i) designing the ZionEX platform with dedicated scale-out network, provisioned with high bandwidth, optimal topology and efficient transport (ii) implementing an optimized PyTorch-based training stack supporting both model and data parallelism (iii) developing sharding algorithms capable of hierarchical partitioning of the embedding tables along row, column dimensions and load balancing them across multiple workers; (iv) adding high-performance core operators while retaining flexibility to support optimizers with fully deterministic updates (v) leveraging reduced precision communications, multi-level memory hierarchy (HBM+DDR+SSD) and pipelining. Furthermore, we develop and briefly comment on distributed data ingestion and other supporting services that are required for the robust and efficient end-to-end training in production environments. △ Less

Submitted 26 February, 2023; v1 submitted 11 April, 2021; originally announced April 2021.

arXiv:2103.04838 [pdf]

doi 10.23919/IWLPC52010.2020.9375903

Machine-learning based methodologies for 3d x-ray measurement, characterization and optimization for buried structures in advanced ic packages

Authors: Ramanpreet S Pahwa, Soon Wee Ho, Ren Qin, Richard Chang, Oo Zaw Min, Wang Jie, Vempati Srinivasa Rao, Tin Lay Nwe, Yanjing Yang, Jens Timo Neumann, Ramani Pichumani, Thomas Gregorich

Abstract: For over 40 years lithographic silicon scaling has driven circuit integration and performance improvement in the semiconductor industry. As silicon scaling slows down, the industry is increasingly dependent on IC package technologies to contribute to further circuit integration and performance improvements. This is a paradigm shift and requires the IC package industry to reduce the size and increa… ▽ More For over 40 years lithographic silicon scaling has driven circuit integration and performance improvement in the semiconductor industry. As silicon scaling slows down, the industry is increasingly dependent on IC package technologies to contribute to further circuit integration and performance improvements. This is a paradigm shift and requires the IC package industry to reduce the size and increase the density of internal interconnects on a scale which has never been done before. Traditional package characterization and process optimization relies on destructive techniques such as physical cross-sections and delayering to extract data from internal package features. These destructive techniques are not practical with today's advanced packages. In this paper we will demonstrate how data acquired non-destructively with a 3D X-ray microscope can be enhanced and optimized using machine learning, and can then be used to measure, characterize and optimize the design and production of buried interconnects in advanced IC packages. Test vehicles replicating 2.5D and HBM construction were designed and fabricated, and digital data was extracted from these test vehicles using 3D X-ray and machine learning techniques. The extracted digital data was used to characterize and optimize the design and production of the interconnects and demonstrates a superior alternative to destructive physical analysis. We report an mAP of 0.96 for 3D object detection, a dice score of 0.92 for 3D segmentation, and an average of 2.1um error for 3D metrology on the test dataset. This paper is the first part of a multi-part report. △ Less

Submitted 19 May, 2021; v1 submitted 8 March, 2021; originally announced March 2021.

Comments: 7 pages, 9 figures

Journal ref: International Wafer-Level Packaging Conference (IWLPC) 2020

arXiv:2102.09681 [pdf, other]

WebRED: Effective Pretraining And Finetuning For Relation Extraction On The Web

Authors: Robert Ormandi, Mohammad Saleh, Erin Winter, Vinay Rao

Abstract: Relation extraction is used to populate knowledge bases that are important to many applications. Prior datasets used to train relation extraction models either suffer from noisy labels due to distant supervision, are limited to certain domains or are too small to train high-capacity models. This constrains downstream applications of relation extraction. We therefore introduce: WebRED (Web Relation… ▽ More Relation extraction is used to populate knowledge bases that are important to many applications. Prior datasets used to train relation extraction models either suffer from noisy labels due to distant supervision, are limited to certain domains or are too small to train high-capacity models. This constrains downstream applications of relation extraction. We therefore introduce: WebRED (Web Relation Extraction Dataset), a strongly-supervised human annotated dataset for extracting relationships from a variety of text found on the World Wide Web, consisting of ~110K examples. We also describe the methods we used to collect ~200M examples as pre-training data for this task. We show that combining pre-training on a large weakly supervised dataset with fine-tuning on a small strongly-supervised dataset leads to better relation extraction performance. We provide baselines for this new dataset and present a case for the importance of human annotation in improving the performance of relation extraction from text found on the web. △ Less

Submitted 18 February, 2021; originally announced February 2021.

arXiv:2101.02184 [pdf, other]

VFSIE -- Development and Testing Framework for Federated Science Instruments

Authors: Anees Al-Najjar, Nageswara S. V. Rao, Neena Imam, Thomas Naughton, Seth Hitefield, Lawrence Sorrillo, James Kohl, Wael Elwasif, Jean-Christophe Bilheux, Hassina Bilheux, Swen Boehm, Jason Kincl

Abstract: Recent developments in softwarization of networked infrastructures combined with containerization of computing workflows promise unprecedented compute anywhere and everywhere capabilities for federations of edge and remote computing systems and science instruments. The development and testing of software stacks that implement these capabilities over physical production federations, however, is not… ▽ More Recent developments in softwarization of networked infrastructures combined with containerization of computing workflows promise unprecedented compute anywhere and everywhere capabilities for federations of edge and remote computing systems and science instruments. The development and testing of software stacks that implement these capabilities over physical production federations, however, is not very practical nor cost-effective. In response, we develop a digital twin of the physical infrastructure, called the Virtual Federated Science Instrument Environment (VFSIE). This framework emulates the federation using containers and hosts connected over an emulated network, and supports the development and testing of federation stacks and workflows. We illustrate its use in a case study involving Jupiter Notebook computations and instrument control. △ Less

Submitted 2 February, 2021; v1 submitted 6 January, 2021; originally announced January 2021.

Comments: Earlier Version of VFSIE framework for emulating science workflows at a single site

arXiv:2012.11171 [pdf, other]

A Comprehensive Survey of Machine Learning Based Localization with Wireless Signals

Authors: Daoud Burghal, Ashwin T. Ravi, Varun Rao, Abdullah A. Alghafis, Andreas F. Molisch

Abstract: The last few decades have witnessed a growing interest in location-based services. Using localization systems based on Radio Frequency (RF) signals has proven its efficacy for both indoor and outdoor applications. However, challenges remain with respect to both complexity and accuracy of such systems. Machine Learning (ML) is one of the most promising methods for mitigating these problems, as ML (… ▽ More The last few decades have witnessed a growing interest in location-based services. Using localization systems based on Radio Frequency (RF) signals has proven its efficacy for both indoor and outdoor applications. However, challenges remain with respect to both complexity and accuracy of such systems. Machine Learning (ML) is one of the most promising methods for mitigating these problems, as ML (especially deep learning) offers powerful practical data-driven tools that can be integrated into localization systems. In this paper, we provide a comprehensive survey of ML-based localization solutions that use RF signals. The survey spans different aspects, ranging from the system architectures, to the input features, the ML methods, and the datasets. A main point of the paper is the interaction between the domain knowledge arising from the physics of localization systems, and the various ML approaches. Besides the ML methods, the utilized input features play a major role in shaping the localization solution; we present a detailed discussion of the different features and what could influence them, be it the underlying wireless technology or standards or the preprocessing techniques. A detailed discussion is dedicated to the different ML methods that have been applied to localization problems, discussing the underlying problem and the solution structure. Furthermore, we summarize the different ways the datasets were acquired, and then list the publicly available ones. Overall, the survey categorizes and partly summarizes insights from almost 400 papers in this field. This survey is self-contained, as we provide a concise review of the main ML and wireless propagation concepts, which shall help the researchers in either field navigate through the surveyed solutions, and suggested open problems. △ Less

Submitted 21 December, 2020; originally announced December 2020.

arXiv:2010.10687 [pdf, other]

Is Batch Norm unique? An empirical investigation and prescription to emulate the best properties of common normalizers without batch dependence

Authors: Vinay Rao, Jascha Sohl-Dickstein

Abstract: We perform an extensive empirical study of the statistical properties of Batch Norm and other common normalizers. This includes an examination of the correlation between representations of minibatches, gradient norms, and Hessian spectra both at initialization and over the course of training. Through this analysis, we identify several statistical properties which appear linked to Batch Norm's supe… ▽ More We perform an extensive empirical study of the statistical properties of Batch Norm and other common normalizers. This includes an examination of the correlation between representations of minibatches, gradient norms, and Hessian spectra both at initialization and over the course of training. Through this analysis, we identify several statistical properties which appear linked to Batch Norm's superior performance. We propose two simple normalizers, PreLayerNorm and RegNorm, which better match these desirable properties without involving operations along the batch dimension. We show that PreLayerNorm and RegNorm achieve much of the performance of Batch Norm without requiring batch dependence, that they reliably outperform LayerNorm, and that they can be applied in situations where Batch Norm is ineffective. △ Less

Submitted 20 October, 2020; originally announced October 2020.

arXiv:2010.01385 [pdf, other]

Limitations of Sums of Bounded-Read Formulas

Authors: Purnata Ghosal, B. V. Raghavendra Rao

Abstract: Proving super polynomial size lower bounds for various classes of arithmetic circuits computing explicit polynomials is a very important and challenging task in algebraic complexity theory. We study representation of polynomials as sums of weaker models such as read once formulas (ROFs) and read once oblivious algebraic branching programs (ROABPs). We prove: (1) An exponential separation between… ▽ More Proving super polynomial size lower bounds for various classes of arithmetic circuits computing explicit polynomials is a very important and challenging task in algebraic complexity theory. We study representation of polynomials as sums of weaker models such as read once formulas (ROFs) and read once oblivious algebraic branching programs (ROABPs). We prove: (1) An exponential separation between sum of ROFs and read-$k$ formulas for some constant $k$. (2) A sub-exponential separation between sum of ROABPs and syntactic multilinear ABPs. Our results are based on analysis of the partial derivative matrix under different distributions. These results highlight richness of bounded read restrictions in arithmetic formulas and ABPs. Finally, we consider a generalization of multilinear ROABPs known as strict-interval ABPs defined in [Ramya-Rao, MFCS2019]. We show that strict-interval ABPs are equivalent to ROABPs upto a polynomial size blow up. In contrast, we show that interval formulas are different from ROFs and also admit depth reduction which is not known in the case of strict-interval ABPs. △ Less

Submitted 3 October, 2020; originally announced October 2020.

Comments: 20 pages, 3 figures

arXiv:2007.01189 [pdf, other]

Sequential Domain Adaptation through Elastic Weight Consolidation for Sentiment Analysis

Authors: Avinash Madasu, Vijjini Anvesh Rao

Abstract: Elastic Weight Consolidation (EWC) is a technique used in overcoming catastrophic forgetting between successive tasks trained on a neural network. We use this phenomenon of information sharing between tasks for domain adaptation. Training data for tasks such as sentiment analysis (SA) may not be fairly represented across multiple domains. Domain Adaptation (DA) aims to build algorithms that levera… ▽ More Elastic Weight Consolidation (EWC) is a technique used in overcoming catastrophic forgetting between successive tasks trained on a neural network. We use this phenomenon of information sharing between tasks for domain adaptation. Training data for tasks such as sentiment analysis (SA) may not be fairly represented across multiple domains. Domain Adaptation (DA) aims to build algorithms that leverage information from source domains to facilitate performance on an unseen target domain. We propose a model-independent framework - Sequential Domain Adaptation (SDA). SDA draws on EWC for training on successive source domains to move towards a general domain solution, thereby solving the problem of domain adaptation. We test SDA on convolutional, recurrent, and attention-based architectures. Our experiments show that the proposed framework enables simple architectures such as CNNs to outperform complex state-of-the-art models in domain adaptation of SA. In addition, we observe that the effectiveness of a harder first Anti-Curriculum ordering of source domains leads to maximum performance. △ Less

Submitted 19 July, 2020; v1 submitted 2 July, 2020; originally announced July 2020.

Comments: Accepted at 25th International Conference on Pattern Recognition, January 2021, Milan, Italy

Showing 1–50 of 123 results for author: Rao, V