Search | arXiv e-print repository

Exploring the 6G Potentials: Immersive, Hyper Reliable, and Low-Latency Communication

Authors: Afsoon Alidadi Shamsabadi, Animesh Yadav, Yasser Gadallah, Halim Yanikomeroglu

Abstract: The transition towards the sixth-generation (6G) wireless telecommunications networks introduces significant challenges for researchers and industry stakeholders. The 6G technology aims to enhance existing usage scenarios, particularly supporting innovative applications requiring stringent performance metrics. Among the key performance indicators (KPIs) for 6G, immersive throughput, hyper-reliability, and hyper-low latency must be achieved simultaneously in some critical applications to achieve the application requirements. However, this is challenging due to the conflicting nature of these KPIs. This article proposes a new service class of 6G as immersive, hyper reliable, and low-latency communication (IHRLLC), and introduces a potential network architecture to achieve the associated KPIs. Specifically, technologies such as ultra-massive multiple-input multiple-output (umMIMO)-aided terahertz (THz) communications, and reconfigurable intelligent surfaces (RIS) are viewed as the key enablers for achieving immersive data rate and hyper reliability. Given the computational complexity involved in employing these technologies, and the challenges encountered in designing real-time algorithms for efficient resource allocation and management strategies as well as dynamic beamforming and tracking techniques, we also propose the involvement of other potential enabling technologies such as non-terrestrial networks (NTN), learn-to-optimize (L2O) and generative-AI (GenAI) technologies, quantum computing, and network digital twin (NDT) for limiting the latency.

Submitted 9 July, 2024; originally announced July 2024.

arXiv:2406.13608 [pdf, other]

Wiretapped Commitment over Binary Channels

Authors: Anuj Kumar Yadav, Manideep Mamindlapally, Amitalok J. Budkuley

Abstract: We propose the problem of wiretapped commitment, where two parties, say committer Alice and receiver Bob, engage in a commitment protocol using a noisy channel as a resource, in the presence of an eavesdropper, say Eve. Noisy versions of Alice's transmission over the wiretap channel are received at both Bob and Eve. We seek to determine the maximum commitment throughput in the presence of an eavesdropper, i.e., wiretapped commitment capacity, where in addition to the standard security requirements for two-party commitment, one seeks to ensure that Eve doesn't learn about the commit string. A key interest in this work is to explore the effect of collusion (or lack of it) between the eavesdropper Eve and either Alice or Bob. Toward the same, we present results on the wiretapped commitment capacity under the so-called 1-private regime (when Alice or Bob cannot collude with Eve) and the 2-private regime (when Alice or Bob may possibly collude with Eve).

Submitted 19 June, 2024; originally announced June 2024.

Comments: 13 Pages, 1 figure

arXiv:2406.06613 [pdf, other]

GameBench: Evaluating Strategic Reasoning Abilities of LLM Agents

Authors: Anthony Costarelli, Mat Allen, Roman Hauksson, Grace Sodunke, Suhas Hariharan, Carlson Cheng, Wenjie Li, Arjun Yadav

Abstract: Large language models have demonstrated remarkable few-shot performance on many natural language understanding tasks. Despite several demonstrations of using large language models in complex, strategic scenarios, a comprehensive framework for evaluating agents' performance across the various types of reasoning found in games is lacking. To address this gap, we introduce GameBench, a cross-domain benchmark for evaluating strategic reasoning abilities of LLM agents. We focus on 9 different game environments, where each covers at least one axis of key reasoning skill identified in strategy games, and select games for which strategy explanations are unlikely to form a significant portion of models' pretraining corpuses. Our evaluations use GPT-3 and GPT-4 in their base form along with two scaffolding frameworks designed to enhance strategic reasoning ability: Chain-of-Thought (CoT) prompting and Reasoning Via Planning (RAP). Our results show that none of the tested models match human performance, and at worst GPT-4 performs worse than random action. CoT and RAP both improve scores, but not to levels comparable to human performance.

Submitted 6 June, 2024; originally announced June 2024.

arXiv:2405.18671 [pdf, other]

Watermarking Counterfactual Explanations

Authors: Hangzhi Guo, Amulya Yadav

Abstract: The field of Explainable Artificial Intelligence (XAI) focuses on techniques for providing explanations to end-users about the decision-making processes that underlie modern-day machine learning (ML) models. Within the vast universe of XAI techniques, counterfactual (CF) explanations are often preferred by end-users as they help explain the predictions of ML models by providing an easy-to-understand & actionable recourse (or contrastive) case to individual end-users who are adversely impacted by predicted outcomes. However, recent studies have shown significant security concerns with using CF explanations in real-world applications; in particular, malicious adversaries can exploit CF explanations to perform query-efficient model extraction attacks on proprietary ML models. In this paper, we propose a model-agnostic watermarking framework (for adding watermarks to CF explanations) that can be leveraged to detect unauthorized model extraction attacks (which rely on the watermarked CF explanations). Our novel framework solves a bi-level optimization problem to embed an indistinguishable watermark into the generated CF explanation such that any future model extraction attacks that rely on these watermarked CF explanations can be detected using a null hypothesis significance testing (NHST) scheme, while ensuring that these embedded watermarks do not compromise the quality of the generated CF explanations.
We evaluate this framework's performance across a diverse set of real-world datasets, CF explanation methods, and model extraction techniques, and show that our watermarking detection system can be used to accurately identify extracted ML models that are trained using the watermarked CF explanations. Our work paves the way for the secure adoption of CF explanations in real-world applications.

Submitted 28 May, 2024; originally announced May 2024.

arXiv:2405.12929 [pdf, other]

Code-mixed Sentiment and Hate-speech Prediction

Authors: Anjali Yadav, Tanya Garg, Matej Klemen, Matej Ulcar, Basant Agarwal, Marko Robnik Sikonja

Abstract: Code-mixed discourse combines multiple languages in a single text. It is commonly used in informal discourse in countries with several official languages, but also in many other countries in combination with English or neighboring languages. As large language models have recently dominated most natural language processing tasks, we investigated their performance in code-mixed settings for relevant tasks. We first created four new bilingual pre-trained masked language models for English-Hindi and English-Slovene languages, specifically aimed to support informal language. Then we performed an evaluation of monolingual, bilingual, few-lingual, and massively multilingual models on several languages, using two tasks that frequently contain code-mixed text, in particular, sentiment analysis and offensive language detection in social media texts. The results show that the most successful classifiers are fine-tuned bilingual models and multilingual models, specialized for social media texts, followed by non-specialized massively multilingual and monolingual models, while huge generative models are not competitive. For our affective problems, the models mostly perform slightly better on code-mixed data compared to non-code-mixed data.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2405.06676 [pdf, other]

EDA Corpus: A Large Language Model Dataset for Enhanced Interaction with OpenROAD

Authors: Bing-Yue Wu, Utsav Sharma, Sai Rahul Dhanvi Kankipati, Ajay Yadav, Bintu Kappil George, Sai Ritish Guntupalli, Austin Rovinski, Vidya A. Chhabria

Abstract: Large language models (LLMs) serve as powerful tools for design, providing capabilities for both task automation and design assistance. Recent advancements have shown tremendous potential for facilitating LLM integration into the chip design process; however, many of these works rely on data that are not publicly available and/or not permissively licensed for use in LLM training and distribution. In this paper, we present a solution aimed at bridging this gap by introducing an open-source dataset tailored for OpenROAD, a widely adopted open-source EDA toolchain. The dataset features over 1000 data points and is structured in two formats: (i) a pairwise set comprised of question prompts with prose answers, and (ii) a pairwise set comprised of code prompts and their corresponding OpenROAD scripts. By providing this dataset, we aim to facilitate LLM-focused research within the EDA domain. The dataset is available at https://github.com/OpenROAD-Assistant/EDA-Corpus.

Submitted 4 May, 2024; originally announced May 2024.

Comments: Under review at Workshop on LLM-Aided Design (LAD'24)

arXiv:2404.10989 [pdf, other]

FairSSD: Understanding Bias in Synthetic Speech Detectors

Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Davide Salvi, Paolo Bestagini, Edward J. Delp

Abstract: Methods that can generate synthetic speech that is perceptually indistinguishable from speech recorded by a human speaker are easily available. Several incidents report misuse of synthetic speech generated from these methods to commit fraud. To counter such misuse, many methods have been proposed to detect synthetic speech. Some of these detectors are more interpretable, can generalize to detect synthetic speech in the wild and are robust to noise. However, limited work has been done on understanding bias in these detectors. In this work, we examine bias in existing synthetic speech detectors to determine if they will unfairly target a particular gender, age and accent group. We also inspect whether these detectors will have a higher misclassification rate for bona fide speech from speech-impaired speakers than for fluent speakers. Extensive experiments on 6 existing synthetic speech detectors using more than 0.9 million speech signals demonstrate that most detectors are gender, age and accent biased, and future work is needed to ensure fairness. To support future research, we release our evaluation dataset, models used in our study and source code at https://gitlab.com/viper-purdue/fairssd.

Submitted 16 April, 2024; originally announced April 2024.

Comments: Accepted at CVPR 2024 (WMF)

arXiv:2403.18819 [pdf, other]

Benchmarking Object Detectors with COCO: A New Path Forward

Authors: Shweta Singh, Aayan Yadav, Jitesh Jain, Humphrey Shi, Justin Johnson, Karan Desai

Abstract: The Common Objects in Context (COCO) dataset has been instrumental in benchmarking object detectors over the past decade. Like every dataset, COCO contains subtle errors and imperfections stemming from its annotation procedure. With the advent of high-performing models, we ask whether these errors of COCO are hindering its utility in reliably benchmarking further progress. In search for an answer, we inspect thousands of masks from COCO (2017 version) and uncover different types of errors such as imprecise mask boundaries, non-exhaustively annotated instances, and mislabeled masks. Due to the prevalence of COCO, we choose to correct these errors to maintain continuity with prior research. We develop COCO-ReM (Refined Masks), a cleaner set of annotations with visibly better mask quality than COCO-2017. We evaluate fifty object detectors and find that models that predict visually sharper masks score higher on COCO-ReM, affirming that they were being incorrectly penalized due to errors in COCO-2017. Moreover, our models trained using COCO-ReM converge faster and score higher than their larger variants trained using COCO-2017, highlighting the importance of data quality in improving object detectors. With these findings, we advocate using COCO-ReM for future object detection research. Our dataset is available at https://cocorem.xyz

Submitted 27 March, 2024; originally announced March 2024.

Comments: Technical report. Dataset website: https://cocorem.xyz and code: https://github.com/kdexd/coco-rem

arXiv:2403.18667 [pdf, other]

Improving Content Recommendation: Knowledge Graph-Based Semantic Contrastive Learning for Diversity and Cold-Start Users

Authors: Yejin Kim, Scott Rome, Kevin Foley, Mayur Nankani, Rimon Melamed, Javier Morales, Abhay Yadav, Maria Peifer, Sardar Hamidian, H. Howie Huang

Abstract: Addressing the challenges related to data sparsity, cold-start problems, and diversity in recommendation systems is both crucial and demanding. Many current solutions leverage knowledge graphs to tackle these issues by combining both item-based and user-item collaborative signals. A common trend in these approaches focuses on improving ranking performance at the cost of escalating model complexity, reducing diversity, and complicating the task. It is essential to provide recommendations that are both personalized and diverse, rather than solely relying on achieving high rank-based performance, such as Click-through Rate, Recall, etc. In this paper, we propose a hybrid multi-task learning approach, training on user-item and item-item interactions. We apply item-based contrastive learning on descriptive text, sampling positive and negative pairs based on item metadata. Our approach allows the model to better understand the relationships between entities within the knowledge graph by utilizing semantic information from text. It leads to more accurate, relevant, and diverse user recommendations and a benefit that extends even to cold-start users who have few interactions with items. We perform extensive experiments on two widely used datasets to validate the effectiveness of our approach. Our findings demonstrate that jointly training user-item interactions and item-based signals using synopsis text is highly effective.
Furthermore, our results provide evidence that item-based contrastive learning enhances the quality of entity embeddings, as indicated by metrics such as uniformity and alignment.

Submitted 27 March, 2024; originally announced March 2024.

Comments: Accepted at LREC-COLING 2024

arXiv:2402.14205 [pdf, other]

Compression Robust Synthetic Speech Detection Using Patched Spectrogram Transformer

Authors: Amit Kumar Singh Yadav, Ziyue Xiang, Kratika Bhagtani, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

Abstract: Many deep learning synthetic speech generation tools are readily available. The use of synthetic speech has caused financial fraud, impersonation of people, and misinformation to spread. For this reason, forensic methods that can detect synthetic speech have been proposed. Existing methods often overfit on one dataset and their performance reduces substantially in practical scenarios such as detecting synthetic speech shared on social platforms. In this paper, we propose the Patched Spectrogram Synthetic Speech Detection Transformer (PS3DT), a synthetic speech detector that converts a time domain speech signal to a mel-spectrogram and processes it in patches using a transformer neural network. We evaluate the detection performance of PS3DT on the ASVspoof2019 dataset. Our experiments show that PS3DT performs well on the ASVspoof2019 dataset compared to other approaches using spectrograms for synthetic speech detection. We also investigate the generalization performance of PS3DT on the In-the-Wild dataset. PS3DT generalizes better than several existing methods on detecting synthetic speech from an out-of-distribution dataset. We also evaluate the robustness of PS3DT in detecting telephone quality synthetic speech and synthetic speech shared on social platforms (compressed speech). PS3DT is robust to compression and can detect telephone quality synthetic speech better than several existing methods.

Submitted 21 February, 2024; originally announced February 2024.

Comments: Accepted as long oral paper at ICMLA 2023

arXiv:2402.11997 [pdf, other]

Remember This Event That Year? Assessing Temporal Information and Reasoning in Large Language Models

Authors: Himanshu Beniwal, Dishant Patel, Kowsik Nandagopan D, Hritik Ladia, Ankit Yadav, Mayank Singh

Abstract: Large Language Models (LLMs) are increasingly ubiquitous, yet their ability to retain and reason about temporal information remains limited, hindering their application in real-world scenarios where understanding the sequential nature of events is crucial. Our study experiments with 12 state-of-the-art models (ranging from 2B to 70B+ parameters) on a novel numerical-temporal dataset, TempUN, spanning from 10,000 BCE to 2100 CE, to uncover significant temporal retention and comprehension limitations. We propose six metrics to assess three learning paradigms to enhance temporal knowledge acquisition. Our findings reveal that open-source models exhibit knowledge gaps more frequently, suggesting a trade-off between limited knowledge and incorrect responses. Additionally, various fine-tuning approaches significantly improved performance, reducing incorrect outputs and impacting the identification of 'information not available' in the generations. The associated dataset and code are available at https://github.com/lingoiitgn/TempUN.

Submitted 5 July, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

arXiv:2401.14065 [pdf]

doi 10.1016/j.esr.2022.100864

Novel application of Relief Algorithm in cascaded artificial neural network to predict wind speed for wind power resource assessment in India

Authors: Hasmat Malik, Amit Kumar Yadav, Fausto Pedro García Márquez, Jesús María Pinar-Pérez

Abstract: Wind power cannot be scheduled because the underlying meteorological variables are stochastic. Hence, the energy business and the control of wind power generation require prediction of wind speed (WS) from a few seconds to different time steps in advance. To deal with prediction shortcomings, various WS prediction methods have been used. Predictive data mining offers a variety of methods for WS prediction, among which the artificial neural network (ANN) is one of the reliable and accurate methods. The results of this study show that ANN gives better accuracy than the conventional model. The accuracy of WS prediction models is found to depend on the input parameters and the type of architecture and algorithm used, so the selection of the most relevant input parameters is an important research area in the WS prediction field. The objective of the paper is twofold. First, an extensive review of ANN for wind power and WS prediction is carried out. Second, feature selection using the Relief Algorithm (RA) in WS prediction is discussed and analyzed for different Indian sites. RA identifies atmospheric pressure, solar radiation and relative humidity as the relevant input variables. Based on these variables, a cascade ANN model is developed and its prediction accuracy is evaluated. The root mean square error (RMSE) between predicted and measured WS is found to be 1.44 m/s for training and 1.49 m/s for testing.
The developed cascade ANN model can be used to predict wind speed for sites in India where no WS measuring instruments are installed.

Submitted 25 January, 2024; originally announced January 2024.

Comments: Malik, H., Yadav, A. K., Márquez, F. P. G., & Pinar-Pérez, J. M. (2022). Novel application of Relief Algorithm in cascaded artificial neural network to predict wind speed for wind power resource assessment in India. Energy Strategy Reviews, 41, 100864

Journal ref: Energy Strategy Reviews 2022. Vol 41, 100864
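For readers unfamiliar with the metric, the RMSE reported in the abstract above (1.44 m/s training, 1.49 m/s testing) is the square root of the mean squared difference between predicted and measured wind speed. The following is a minimal sketch of that standard definition only; the function name and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(predicted, measured):
    """Root mean square error between two equal-length sequences."""
    if len(predicted) != len(measured) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each pointwise error, average, then take the square root.
    squared_errors = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical wind speeds in m/s (illustrative values, not the paper's data).
predicted_ws = [5.0, 6.0, 7.0, 8.0]
measured_ws = [4.0, 6.0, 9.0, 8.0]
print(round(rmse(predicted_ws, measured_ws), 3))
```

Because the errors are squared before averaging, a single large miss weighs more than several small ones, and the result is in the same unit as the wind speed (m/s).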

arXiv:2401.06999 [pdf]

Datasets, Clues and State-of-the-Arts for Multimedia Forensics: An Extensive Review

Authors: Ankit Yadav, Dinesh Kumar Vishwakarma

Abstract: With the large chunks of social media data being created daily and the parallel rise of realistic multimedia tampering methods, detecting and localising tampering in images and videos has become essential. This survey focusses on approaches for tampering detection in multimedia data using deep learning models. Specifically, it presents a detailed analysis of benchmark datasets for malicious manipulation detection that are publicly available. It also offers a comprehensive list of tampering clues and commonly used deep learning architectures. Next, it discusses the current state-of-the-art tampering detection methods, categorizing them into meaningful types such as deepfake detection methods, splice tampering detection methods, copy-move tampering detection methods, etc. and discussing their strengths and weaknesses. Top results achieved on benchmark datasets, comparison of deep learning approaches against traditional methods and critical insights from the recent tampering detection methods are also discussed. Lastly, the research gaps, future direction and conclusion are discussed to provide an in-depth understanding of the tampering detection research arena.

Submitted 13 January, 2024; originally announced January 2024.

arXiv:2401.06998 [pdf]

Towards Effective Image Forensics via A Novel Computationally Efficient Framework and A New Image Splice Dataset

Authors: Ankit Yadav, Dinesh Kumar Vishwakarma

Abstract: Splice detection models are the need of the hour since splice manipulations can be used to mislead, spread rumors and create disharmony in society. However, there is a severe lack of image splicing datasets, which restricts the capabilities of deep learning models to extract discriminative features without overfitting. This manuscript presents two-fold contributions toward splice detection. Firstly, a novel splice detection dataset is proposed having two variants. The two variants include spliced samples generated from code and through manual editing. Spliced images in both variants have corresponding binary masks to aid localization approaches. Secondly, a novel Spatio-Compression Lightweight Splice Detection Framework is proposed for accurate splice detection with minimum computational cost. The proposed dual-branch framework extracts discriminative spatial features from a lightweight spatial branch. It uses original resolution compression data to extract double compression artifacts from the second branch, thereby making it 'information preserving.' Several CNNs are tested in combination with the proposed framework on a composite dataset of images from the proposed dataset and the CASIA v2.0 dataset. The best model accuracy of 0.9382 is achieved and compared with similar state-of-the-art methods, demonstrating the superiority of the proposed framework.

Submitted 13 January, 2024; originally announced January 2024.

arXiv:2401.06995 [pdf]

A Visually Attentive Splice Localization Network with Multi-Domain Feature Extractor and Multi-Receptive Field Upsampler

Authors: Ankit Yadav, Dinesh Kumar Vishwakarma

Abstract: Image splice manipulation presents a severe challenge in today's society. With easy access to image manipulation tools, it is easier than ever to modify images that can mislead individuals, organizations or society. In this work, a novel, "Visually Attentive Splice Localization Network with Multi-Domain Feature Extractor and Multi-Receptive Field Upsampler" has been proposed. It contains a unique "visually attentive multi-domain feature extractor" (VA-MDFE) that extracts attentional features from the RGB, edge and depth domains. Next, a "visually attentive downsampler" (VA-DS) is responsible for fusing and downsampling the multi-domain features. Finally, a novel "visually attentive multi-receptive field upsampler" (VA-MRFU) module employs multiple receptive field-based convolutions to upsample attentional features by focussing on different information scales. Experimental results conducted on the public benchmark dataset CASIA v2.0 prove the potency of the proposed model. It comfortably beats the existing state-of-the-arts by achieving an IoU score of 0.851, pixel F1 score of 0.9195 and pixel AUC score of 0.8989.

Submitted 13 January, 2024; originally announced January 2024.
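The IoU and pixel F1 scores quoted in the abstract above are overlap measures between a predicted splice mask and the ground-truth mask. The sketch below shows the standard pixel-level definitions on flattened binary masks; it is an illustration under that assumption, not the paper's evaluation code, and the function name and sample masks are hypothetical.

```python
def mask_scores(pred, truth):
    """Return (IoU, F1) for two equal-length binary masks (1 = spliced pixel)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of pixels")
    # Count pixel-level true positives, false positives and false negatives.
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    union = tp + fp + fn
    # Convention assumed here: two empty masks count as a perfect match.
    iou = tp / union if union else 1.0
    f1 = 2 * tp / (2 * tp + fp + fn) if union else 1.0
    return iou, f1


# Hypothetical six-pixel masks (illustrative only).
pred_mask = [1, 1, 0, 0, 1, 0]
true_mask = [1, 0, 0, 1, 1, 0]
iou, f1 = mask_scores(pred_mask, true_mask)
print(iou, round(f1, 4))
```

With these definitions F1 is never lower than IoU for the same mask pair, which is consistent with the abstract reporting a higher pixel F1 (0.9195) than IoU (0.851).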

arXiv:2401.05308 [pdf, ps, other]

Strategic Client Selection to Address Non-IIDness in HAPS-enabled FL Networks

Authors: Amin Farajzadeh, Animesh Yadav, Halim Yanikomeroglu

Abstract: The deployment of federated learning (FL) within vertical heterogeneous networks, such as those enabled by high-altitude platform station (HAPS), offers the opportunity to engage a wide array of clients, each endowed with distinct communication and computational capabilities. This diversity not only enhances the training accuracy of FL models but also hastens their convergence. Yet, applying FL in these expansive networks presents notable challenges, particularly the significant non-IIDness in client data distributions. Such data heterogeneity often results in slower convergence rates and reduced effectiveness in model training performance. Our study introduces a client selection strategy tailored to address this issue, leveraging user network traffic behaviour. This strategy involves the prediction and classification of clients based on their network usage patterns while prioritizing user privacy. By strategically selecting clients whose data exhibit similar patterns for participation in FL training, our approach fosters a more uniform and representative data distribution across the network. Our simulations demonstrate that this targeted client selection methodology significantly reduces the training loss of FL models in HAPS networks, thereby effectively tackling a crucial challenge in implementing large-scale FL systems.

Submitted 10 January, 2024; originally announced January 2024.

Comments: Submitted to IEEE for possible publication

arXiv:2401.03855 [pdf, other]

PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLMs

Authors: Ankit Yadav, Himanshu Beniwal, Mayank Singh

Abstract: Driven by the surge in code generation using large language models (LLMs), numerous benchmarks have emerged to evaluate these LLMs' capabilities. We conducted a large-scale human evaluation of HumanEval and MBPP, two popular benchmarks for Python code generation, analyzing their diversity and difficulty. Our findings unveil a critical bias towards a limited set of programming concepts, neglecting most of the other concepts entirely. Furthermore, we uncover a worrying prevalence of easy tasks, potentially inflating model performance estimations. To address these limitations, we propose a novel benchmark, PythonSaga, featuring 185 hand-crafted prompts on a balanced representation of 38 programming concepts across diverse difficulty levels. The robustness of our benchmark is demonstrated by the poor performance of existing Code-LLMs.

Submitted 4 July, 2024; v1 submitted 8 January, 2024; originally announced January 2024.

arXiv:2312.01523 [pdf, other]

SymNoise: Advancing Language Model Fine-tuning with Symmetric Noise

Authors: Abhay Kumar Yadav, Arjun Singh

Abstract: In this paper, we introduce a novel fine-tuning technique for language models, which involves incorporating symmetric noise into the embedding process. This method aims to enhance the model's function by more stringently regulating its local curvature, demonstrating superior performance over the current method, NEFTune. When fine-tuning the LLaMA-2-7B model using Alpaca, standard techniques yield a 29.79% score on AlpacaEval. However, our approach, SymNoise, increases this score significantly to 69.04%, using symmetric noisy embeddings. This is a 6.7% improvement over the state-of-the-art method, NEFTune (64.69%). Furthermore, when tested on various models and stronger baseline instruction datasets, such as Evol-Instruct, ShareGPT, and OpenPlatypus, SymNoise consistently outperforms NEFTune. The current literature, including NEFTune, has underscored the importance of more in-depth research into the application of noise-based strategies in the fine-tuning of language models. Our approach, SymNoise, is another significant step in this direction, showing notable improvement over the existing state-of-the-art method.

Submitted 8 December, 2023; v1 submitted 3 December, 2023; originally announced December 2023.

arXiv:2311.16965 [pdf]

Natural Language Processing Through Transfer Learning: A Case Study on Sentiment Analysis

Authors: Aman Yadav, Abhishek Vichare

Abstract: Artificial intelligence and machine learning have significantly bolstered the technological world. This paper explores the potential of transfer learning in natural language processing, focusing mainly on sentiment analysis. Models trained on big data can also be used where data are scarce. The claim is that, compared to training models from scratch, transfer learning using pre-trained BERT models can increase sentiment classification accuracy. The study adopts a sophisticated experimental design that uses the IMDb dataset of sentimentally labelled movie reviews. Pre-processing includes tokenization and encoding of text data, making it suitable for NLP models. The dataset is used on a BERT-based model, whose performance is measured using accuracy. The reported accuracy is 100 per cent. Although this complete accuracy could appear impressive, it might be the result of overfitting or a lack of generalization. Further analysis is required to ensure the model's ability to handle diverse and unseen data. The findings underscore the effectiveness of transfer learning in NLP, showcasing its potential to excel in sentiment analysis tasks. However, the research calls for a cautious interpretation of perfect accuracy and emphasizes the need for additional measures to validate the model's generalization.

Submitted 28 November, 2023; originally announced November 2023.

Comments: 12 pages, 1 table, 4 figures

arXiv:2311.04345 [pdf, other]

A Taxonomy of Rater Disagreements: Surveying Challenges & Opportunities from the Perspective of Annotating Online Toxicity

Authors: Wenbo Zhang, Hangzhi Guo, Ian D Kivlichan, Vinodkumar Prabhakaran, Davis Yadav, Amulya Yadav

Abstract: Toxicity is an increasingly common and severe issue in online spaces. Consequently, a rich line of machine learning research over the past decade has focused on computationally detecting and mitigating online toxicity. These efforts crucially rely on human-annotated datasets that identify toxic content of various kinds in social media texts. However, such annotations historically yield low inter-rater agreement, which was often dealt with by taking the majority vote or other such approaches to arrive at a single ground truth label. Recent research has pointed out the importance of accounting for the subjective nature of this task when building and utilizing these datasets, and this has triggered work on analyzing and better understanding rater disagreements, and how they could be effectively incorporated into the machine learning developmental pipeline. While these efforts are filling an important gap, there is a lack of a broader framework about the root causes of rater disagreement, and therefore, we situate this work within that broader landscape. In this survey paper, we analyze a broad set of literature on the reasons behind rater disagreements focusing on online toxicity, and propose a detailed taxonomy for the same. Further, we summarize and discuss the potential solutions targeting each reason for disagreement. We also discuss several open issues, which could promote the future development of online toxicity research.

Submitted 7 November, 2023; originally announced November 2023.

Comments: 21 pages, 2 figures

arXiv:2306.17427 [pdf]

Modeling and parametric optimization of 3D tendon-sheath actuator system for upper limb soft exosuit

Authors: Amit Yadav, Nitesh Kumar, Shaurya Surana, Aravind Ramasamy, Abhishek Rudra Pal, Sushma Santapuri, Lalan Kumar, Suriya Prakash Muthukrishnan, Shubhendu Bhasin, Sitikantha Roy

Abstract: This paper presents an analysis of parametric characterization of a motor-driven tendon-sheath actuator system for use in upper limb augmentation for applications such as rehabilitation, therapy, and industrial automation. The double tendon-sheath system, which uses two sets of cables (agonist and antagonist side) guided through a sheath, is considered to produce smooth and natural-looking movements of the arm. The exoskeleton is equipped with a single motor capable of controlling both the flexion and extension motions. One of the key challenges in the implementation of a double tendon-sheath system is the possibility of slack in the tendon, which can impact the overall performance of the system. To address this issue, a robust mathematical model is developed and a comprehensive parametric study is carried out to determine the most effective strategies for overcoming the problem of slack and improving the transmission. The study suggests that incorporating a series spring into the system's tendon leads to a universally applicable design, eliminating the need for individual customization. The results also show that the slack in the tendon can be effectively controlled by changing the pretension, the spring constant, and the size and geometry of the spool mounted on the motor axle.

Submitted 10 September, 2023; v1 submitted 30 June, 2023; originally announced June 2023.

arXiv:2305.07118 [pdf, other]

Commitment over Gaussian Unfair Noisy Channels

Authors: Amitalok J. Budkuley, Pranav Joshi, Manideep Mamindlapally, Anuj Kumar Yadav

Abstract: Commitment is a key primitive which resides at the heart of several cryptographic protocols. Noisy channels can help realize information-theoretically secure commitment schemes; however, their imprecise statistical characterization can severely impair such schemes, especially their security guarantees. Keeping our focus on channel unreliability in this work, we study commitment over unreliable continuous alphabet channels called the Gaussian unfair noisy channels or Gaussian UNCs. We present the first results on the optimal throughput or commitment capacity of Gaussian UNCs. It is known that classical Gaussian channels have infinite commitment capacity, even under finite transmit power constraints. For unreliable Gaussian UNCs, we prove the surprising result that their commitment capacity may be finite, and in some cases, zero. When commitment is possible, we present achievable rate lower bounds by constructing positive-throughput protocols under a given input power constraint and (two-sided) channel elasticity at committer Alice and receiver Bob. Our achievability results establish an interesting fact, namely that Gaussian UNCs with zero elasticity have infinite commitment capacity, which brings a completely new perspective to why classic Gaussian channels, i.e., Gaussian UNCs with zero elasticity, have infinite capacity. Finally, we precisely characterize the positive commitment capacity threshold for a Gaussian UNC in terms of the channel elasticity, when the transmit power tends to infinity.

Submitted 11 May, 2023; originally announced May 2023.

Comments: The paper follows alphabetical author order. AKY, MM, and PJ have equally contributed to this work

arXiv:2305.05745 [pdf, other]

Information Spectrum Converse for Minimum Entropy Couplings and Functional Representations

Authors: Yanina Y. Shkel, Anuj Kumar Yadav

Abstract: Given two jointly distributed random variables $(X,Y)$, a functional representation of $X$ is a random variable $Z$ independent of $Y$, and a deterministic function $g(\cdot, \cdot)$ such that $X=g(Y,Z)$. The problem of finding a minimum entropy functional representation is known to be equivalent to the problem of finding a minimum entropy coupling where, given a collection of probability distributions $P_1, \dots, P_m$, the goal is to find a coupling $X_1, \dots, X_m$ ($X_i \sim P_i$) with the smallest entropy $H_\alpha(X_1, \dots, X_m)$. This paper presents a new information spectrum converse, and applies it to obtain direct lower bounds on minimum entropy in both problems. The new results improve on all known lower bounds, including previous lower bounds based on the concept of majorization. In particular, the presented proofs leverage both the information spectrum and the majorization perspectives on minimum entropy couplings and functional representations.

Submitted 9 May, 2023; originally announced May 2023.

Comments: 2023 IEEE International Symposium on Information Theory (ISIT)

arXiv:2305.05463 [pdf, ps, other]

Multi-Tier Hierarchical Federated Learning-assisted NTN for Intelligent IoT Services

Authors: Amin Farajzadeh, Animesh Yadav, Halim Yanikomeroglu

Abstract: In the ever-expanding landscape of the IoT, managing the intricate network of interconnected devices presents a fundamental challenge. This leads us to ask: "What if we invite the IoT devices to collaboratively participate in real-time network management and IoT data-handling decisions?" This inquiry forms the foundation of our innovative approach, addressing the burgeoning complexities in IoT through the integration of NTN architecture, in particular, VHetNet, and an MT-HFL framework. VHetNets transcend traditional network paradigms by harmonizing terrestrial and non-terrestrial elements, thus ensuring expansive connectivity and resilience, especially crucial in areas with limited terrestrial infrastructure. The incorporation of MT-HFL further revolutionizes this architecture, distributing intelligent data processing across a multi-tiered network spectrum, from edge devices on the ground to aerial platforms and satellites above. This study explores MT-HFL's role in fostering a decentralized, collaborative learning environment, enabling IoT devices to not only contribute but also make informed decisions in network management. This methodology adeptly handles the challenges posed by the non-IID nature of IoT data and efficiently curtails communication overheads prevalent in extensive IoT networks. Significantly, MT-HFL enhances data privacy, a paramount aspect in IoT ecosystems, by facilitating local data processing and limiting the sharing of model updates instead of raw data.
By evaluating a case-study, our findings demonstrate that the synergistic integration of MT-HFL within VHetNets creates an intelligent network architecture that is robust, scalable, and dynamically adaptive to the ever-changing demands of IoT environments. This setup ensures efficient data handling, advanced privacy and security measures, and responsive adaptability to fluctuating network conditions.

Submitted 11 December, 2023; v1 submitted 9 May, 2023; originally announced May 2023.

Comments: Submitted to IEEE for possible publication

arXiv:2304.03323 [pdf, other]

DSVAE: Interpretable Disentangled Representation for Synthetic Speech Detection

Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp

Abstract: Tools to generate high-quality synthetic speech signals that are perceptually indistinguishable from speech recorded from human speakers are easily available. Several approaches have been proposed for detecting synthetic speech. Many of these approaches use deep learning methods as a black box without providing reasoning for the decisions they make. This limits the interpretability of these approaches. In this paper, we propose the Disentangled Spectrogram Variational Auto Encoder (DSVAE), a variational autoencoder trained in two stages that processes spectrograms of speech using disentangled representation learning to generate interpretable representations of a speech signal for detecting synthetic speech. DSVAE also creates an activation map to highlight the spectrogram regions that discriminate synthetic and bona fide human speech signals. We evaluated the representations obtained from DSVAE using the ASVspoof2019 dataset. Our experimental results show high accuracy (>98%) on detecting synthetic speech from 6 known and 10 out of 11 unknown speech synthesizers. We also visualize the representation obtained from DSVAE for 17 different speech synthesizers and verify that they are indeed interpretable and discriminate bona fide and synthetic speech from each of the synthesizers.

Submitted 28 July, 2023; v1 submitted 6 April, 2023; originally announced April 2023.

arXiv:2304.00913 [pdf, other]

LAHM: Large Annotated Dataset for Multi-Domain and Multilingual Hate Speech Identification

Authors: Ankit Yadav, Shubham Chandel, Sushant Chatufale, Anil Bandhakavi

Abstract: Current research on hate speech analysis is typically oriented towards monolingual and single classification tasks. In this paper, we present a new multilingual hate speech analysis dataset for English, Hindi, Arabic, French, German and Spanish languages for multiple domains across hate speech - Abuse, Racism, Sexism, Religious Hate and Extremism. To the best of our knowledge, this paper is the first to address the problem of identifying various types of hate speech in these five wide domains in these six languages. In this work, we describe how we created the dataset, created annotations at high level and low level for different domains and how we use it to test the current state-of-the-art multilingual and multitask learning approaches. We evaluate our dataset in various monolingual, cross-lingual and machine translation classification settings and compare it against open source English datasets that we aggregated and merged for this task. Then we discuss how this approach can be used to create large scale hate-speech datasets and how to leverage our annotations in order to improve hate speech detection and classification in general.

Submitted 3 April, 2023; originally announced April 2023.

arXiv:2303.01054 [pdf]

Deep Learning based Segmentation of Optical Coherence Tomographic Images of Human Saphenous Varicose Vein

Authors: Maryam Viqar, Violeta Madjarova, Amit Kumar Yadav, Desislava Pashkuleva, Alexander S. Machikhin

Abstract: A deep-learning-based segmentation model is proposed for Optical Coherence Tomography images of human varicose veins. It is based on the U-Net model, employs atrous convolution with residual blocks, and gives an accuracy of 0.9932.

Submitted 2 March, 2023; originally announced March 2023.

arXiv:2302.00163 [pdf, ps, other]

FLSTRA: Federated Learning in Stratosphere

Authors: Amin Farajzadeh, Animesh Yadav, Omid Abbasi, Wael Jaafar, Halim Yanikomeroglu

Abstract: We propose a federated learning (FL) in stratosphere (FLSTRA) system, where a high altitude platform station (HAPS) facilitates a large number of terrestrial clients to collaboratively learn a global model without sharing the training data. FLSTRA overcomes the challenges faced by FL in terrestrial networks, such as slow convergence and high communication delay due to limited client participation and multi-hop communications. HAPS leverages its altitude and size to allow the participation of more clients with line-of-sight (LOS) links and the placement of a powerful server. However, handling many clients at once introduces computing and transmission delays. Thus, we aim to obtain a delay-accuracy trade-off for FLSTRA. Specifically, we first develop a joint client selection and resource allocation algorithm for uplink and downlink to minimize the FL delay subject to the energy and quality-of-service (QoS) constraints. Second, we propose a communication and computation resource-aware (CCRA-FL) algorithm to achieve the target FL accuracy while deriving an upper bound for its convergence rate. The formulated problem is non-convex; thus, we propose an iterative algorithm to solve it. Simulation results demonstrate the effectiveness of the proposed FLSTRA system, compared to terrestrial benchmarks, in terms of FL delay and accuracy.

Submitted 9 June, 2023; v1 submitted 31 January, 2023; originally announced February 2023.

Comments: Accepted to IEEE Transactions on Wireless Communications

arXiv:2301.08863 [pdf, other]

HAPS for 6G Networks: Potential Use Cases, Open Challenges, and Possible Solutions

Authors: Omid Abbasi, Animesh Yadav, Halim Yanikomeroglu, Ngoc Dung Dao, Gamini Senarath, Peiying Zhu

Abstract: High altitude platform station (HAPS), which is deployed in the stratosphere at an altitude of 20-50 kilometres, has attracted much attention in recent years due to its large footprint, line-of-sight links, and fixed position relative to the Earth. Compared with existing network infrastructure, HAPS has a much larger coverage area than terrestrial base stations and is much closer than satellites to the ground users. Besides small-cells and macro-cells, a HAPS can offer one mega-cell, which can complement legacy networks in 6G and beyond wireless systems. This paper explores potential use cases and discusses relevant open challenges of integrating HAPS into legacy networks, while also suggesting some solutions to these challenges. The cumulative density functions of spectral efficiency of the integrated network and cell-edge users are studied and compared with those of the terrestrial network. The results show that the capacity gains achieved by the integrated network are beneficial to cell-edge users. Furthermore, the advantages of a HAPS for backhauling aerial base stations are demonstrated by the simulation results.

Submitted 11 April, 2023; v1 submitted 20 January, 2023; originally announced January 2023.

arXiv:2210.08532 [pdf, other]

AskYourDB: An end-to-end system for querying and visualizing relational databases using natural language

Authors: Manu Joseph, Harsh Raj, Anubhav Yadav, Aaryamann Sharma

Abstract: Querying databases for the right information is a time-consuming and error-prone task and often requires experienced professionals for the job. Furthermore, the user needs to have some prior knowledge about the database. There have been various efforts to develop an intelligence which can help business users to query databases directly. There have been some successes, but very little in terms of testing and deploying those for real-world users. In this paper, we propose a semantic parsing approach to address the challenge of converting complex natural language into SQL and build a product out of it. For this purpose, we modified state-of-the-art models with various pre- and post-processing steps, which make up a significant part of the work when a model is deployed in production. To make the product serviceable to businesses, we added an automatic visualization framework over the queried results.

Submitted 16 October, 2022; originally announced October 2022.

Comments: 9 pages

arXiv:2206.07331 [pdf]

ETMA: Efficient Transformer Based Multilevel Attention framework for Multimodal Fake News Detection

Authors: Ashima Yadav, Shivani Gaba, Haneef Khan, Ishan Budhiraja, Akansha Singh, Krishan Kant Singh

Abstract: In this new digital era, social media has created a severe impact on the lives of people. In recent times, fake news content on social media has become one of the major challenging problems for society. The dissemination of fabricated and false news articles includes multimodal data in the form of text and images. The previous methods have mainly focused on unimodal analysis. Moreover, for multimodal analysis, researchers fail to keep the unique characteristics corresponding to each modality. This paper aims to overcome these limitations by proposing an Efficient Transformer based Multilevel Attention (ETMA) framework for multimodal fake news detection, which comprises the following components: visual attention-based encoder, textual attention-based encoder, and joint attention-based learning. Each component utilizes the different forms of attention mechanism and uniquely deals with multimodal data to detect fraudulent content. The efficacy of the proposed network is validated by conducting several experiments on four real-world fake news datasets: Twitter, Jruvika Fake News Dataset, Pontes Fake News Dataset, and Risdal Fake News Dataset using multiple evaluation metrics. The results show that the proposed method outperforms the baseline methods on all four datasets. Further, the computation time of the model is also lower than the state-of-the-art methods.

Submitted 13 March, 2023; v1 submitted 15 June, 2022; originally announced June 2022.

Comments: Accepted for publication in IEEE Transactions on Computational Social Systems

arXiv:2206.07005 [pdf, other]

Beyond-Cell Communications via HAPS-RIS

Authors: Safwan Alfattani, Animesh Yadav, Halim Yanikomeroglu, Abbas Yongacoglu

Abstract: The ever-increasing number of users and new services in urban regions can lead terrestrial base stations (BSs) to become overloaded and, consequently, some users to go unserved. Compounding this, users in urban areas can face severe shadowing and blockages, which means that some users do not receive a desired quality of service (QoS). Motivated by the energy and cost benefits of reconfigurable intelligent surfaces (RIS) and the advantages of high altitude platform stations (HAPS), including their wide footprint and strong line-of-sight (LoS) links, we propose a solution to service the stranded users using the RIS-aided HAPS. More specifically, we propose to service the stranded users by a dedicated control station (CS) via a HAPS equipped with RIS (HAPS-RIS). Through this approach, users are not restricted from being serviced by the cell they belong to; hence, we refer to this approach as beyond-cell communication. As we demonstrate in this paper, beyond-cell communication works in tandem with legacy terrestrial networks to support uncovered or unserved users. Optimal transmit power and RIS unit assignment strategies for the users based on different network objectives are introduced. Numerical results demonstrate the benefits of the proposed beyond-cell communication approach. Moreover, the results provide insights into the different optimization objectives and their interplay with minimum QoS and network resources, such as transmit power and the number of reflectors.

Submitted 3 October, 2022; v1 submitted 29 May, 2022; originally announced June 2022.

Comments: 9 pages, 5 figures, to be presented in IEEE Globecom Workshop 2022

arXiv:2206.00700 [pdf, other]

doi 10.1145/3583780.3615040

RoCourseNet: Distributionally Robust Training of a Prediction Aware Recourse Model

Authors: Hangzhi Guo, Feiran Jia, Jinghui Chen, Anna Squicciarini, Amulya Yadav

Abstract: Counterfactual (CF) explanations for machine learning (ML) models are preferred by end-users, as they explain the predictions of ML models by providing a recourse (or contrastive) case to individuals who are adversely impacted by predicted outcomes. Existing CF explanation methods generate recourses under the assumption that the underlying target ML model remains stationary over time. However, due to commonly occurring distributional shifts in training data, ML models constantly get updated in practice, which might render previously generated recourses invalid and diminish end-users' trust in our algorithmic framework. To address this problem, we propose RoCourseNet, a training framework that jointly optimizes predictions and recourses that are robust to future data shifts. This work contains four key contributions: (1) We formulate the robust recourse generation problem as a tri-level optimization problem which consists of two sub-problems: (i) a bi-level problem that finds the worst-case adversarial shift in the training data, and (ii) an outer minimization problem to generate robust recourses against this worst-case shift. (2) We leverage adversarial training to solve this tri-level optimization problem by: (i) proposing a novel virtual data shift (VDS) algorithm to find worst-case shifted ML models via explicitly considering the worst-case data shift in the training dataset, and (ii) a block-wise coordinate descent procedure to optimize for prediction and corresponding robust recourses.
(3) We evaluate RoCourseNet's performance on three real-world datasets, and show that RoCourseNet consistently achieves more than 96% robust validity and outperforms state-of-the-art baselines by at least 10% in generating robust CF explanations. (4) Finally, we generalize the RoCourseNet framework to accommodate any parametric post-hoc methods for improving robust validity.

Submitted 18 August, 2023; v1 submitted 1 June, 2022; originally announced June 2022.

arXiv:2205.06673 [pdf]

Univariate and Multivariate LSTM Model for Short-Term Stock Market Prediction

Authors: Vishal Kuber, Divakar Yadav, Arun Kr Yadav

Abstract: Designing robust and accurate prediction models has been a viable research area for a long time. While proponents of a well-functioning market believe that it is difficult to accurately predict market prices, many scholars disagree. Robust and accurate prediction systems will be helpful not only to businesses but also to individuals in making their financial investments. This paper presents an LSTM model with two different input approaches for predicting the short-term stock prices of two Indian companies, Reliance Industries and Infosys Ltd. Ten years of historic data (2012-2021) are taken from the Yahoo Finance website to analyse the proposed approaches. In the first approach, the closing prices of the two selected companies are applied directly to a univariate LSTM model. In the second approach, technical indicator values are calculated from the closing prices and then collectively applied to a multivariate LSTM model. Short-term market behaviour for upcoming days is evaluated. Experimental outcomes reveal that approach one is useful for determining the future trend, but the multivariate LSTM model with technical indicators is found to be useful in accurately predicting future price behaviour.

Submitted 8 May, 2022; originally announced May 2022.

Comments: 24 pages, 20 figures, 8 tables

arXiv:2204.12067 [pdf, other]

An Overview of Recent Work in Media Forensics: Methods and Threats

Authors: Kratika Bhagtani, Amit Kumar Singh Yadav, Emily R. Bartusiak, Ziyue Xiang, Ruiting Shao, Sriram Baireddy, Edward J. Delp

Abstract: In this paper, we review recent work in media forensics for digital images, video, audio (specifically speech), and documents. For each data modality, we discuss synthesis and manipulation techniques that can be used to create and modify digital media. We then review technological advancements for detecting and quantifying such manipulations. Finally, we consider open issues and suggest directions for future research.

Submitted 12 May, 2022; v1 submitted 26 April, 2022; originally announced April 2022.

Comments: This is a longer version of a paper accepted to the 2022 IEEE International Conference on Multimedia Information Processing and Retrieval entitled "An Overview of Recent Work in Multimedia Forensics"

arXiv:2204.01849 [pdf]

Automatic Text Summarization Methods: A Comprehensive Review

Authors: Divakar Yadav, Jalpa Desai, Arun Kumar Yadav

Abstract: One of the most pressing issues that have arisen due to the rapid growth of the Internet is known as information overloading. Simplifying the relevant information in the form of a summary will assist many people because the material on any topic is plentiful on the Internet. Manually summarising massive amounts of text is quite challenging for humans. So, it has increased the need for more complex and powerful summarizers. Researchers have been trying to improve approaches for creating summaries since the 1950s, such that the machine-generated summary matches the human-created summary. This study provides a detailed state-of-the-art analysis of text summarization concepts such as summarization approaches, techniques used, standard datasets, evaluation metrics and future scopes for research. The most commonly accepted approaches are extractive and abstractive, studied in detail in this work. Evaluating the summary and increasing the development of reusable resources and infrastructure aids in comparing and replicating findings, adding competition to improve the outcomes. Different evaluation methods of generated summaries are also discussed in this study. Finally, at the end of this study, several challenges and research opportunities related to text summarization research are mentioned that may be useful for potential researchers working in this area.

Submitted 3 March, 2022; originally announced April 2022.

Comments: 20 pages, 7 figures and 4 tables

arXiv:2203.10930 [pdf]

An integrated Auto Encoder-Block Switching defense approach to prevent adversarial attacks

Authors: Anirudh Yadav, Ashutosh Upadhyay, S. Sharanya

Abstract: According to recent studies, the vulnerability of state-of-the-art Neural Networks to adversarial input samples has increased drastically. A neural network is an intermediate path or technique by which a computer learns to perform tasks using Machine Learning algorithms. Machine Learning and Artificial Intelligence models have become a fundamental aspect of life, for example in self-driving cars [1] and smart home devices, so any vulnerability is a significant concern. The smallest input deviations can fool these extremely literal systems and lead their users as well as administrators into precarious situations. This article proposes a defense algorithm that combines an auto-encoder [3] with a block-switching architecture. The auto-encoder is intended to remove any perturbations found in input images, whereas the block-switching method is used to make the model more robust against white-box attacks. The attack is planned using the FGSM [9] model, and the subsequent counter-attack by the proposed architecture demonstrates the feasibility and security delivered by the algorithm.

Submitted 11 March, 2022; originally announced March 2022.

arXiv:2111.08477 [pdf, other]

On Reverse Elastic Channels and the Asymmetry of Commitment Capacity under Channel Elasticity

Authors: Amitalok J. Budkuley, Pranav Joshi, Manideep Mamindlapally, Anuj Kumar Yadav

Abstract: Commitment is an important cryptographic primitive. It is well known that noisy channels are a promising resource to realize commitment in an information-theoretically secure manner. However, oftentimes, channel behaviour may be poorly characterized thereby limiting the commitment throughput and/or degrading the security guarantees; particularly problematic is when a dishonest party, unbeknown to the honest one, can maliciously alter the channel characteristics. Reverse elastic channels (RECs) are an interesting class of such unreliable channels, where only a dishonest committer, say, Alice can maliciously alter the channel. RECs have attracted recent interest in the study of several cryptographic primitives. Our principal contribution is the REC commitment capacity characterization; this proves a recent related conjecture. A key result is our tight converse which analyses a specific cheating strategy by Alice. RECs are closely related to the classic unfair noisy channels (UNCs); elastic channels (ECs), where only a dishonest receiver Bob can alter the channel, are similarly related. In stark contrast to UNCs, both RECs and ECs always exhibit positive commitment throughput for all non-trivial parameters. Interestingly, our results show that channels with exclusive one-sided elasticity for dishonest parties, exhibit a fundamental asymmetry where a committer with one-sided elasticity has a more debilitating effect on the commitment throughput than a receiver.

Submitted 16 November, 2021; originally announced November 2021.

Comments: 16 pages, 3 figures

arXiv:2109.08916 [pdf]

Underwater Image Enhancement Using Convolutional Neural Network

Authors: Anushka Yadav, Mayank Upadhyay, Ghanapriya Singh

Abstract: This work proposes a method for underwater image enhancement using the principle of histogram equalization. Since underwater images have a global strong dominant colour, their colourfulness and contrast are often degraded. Before applying the histogram equalisation technique on the image, the image is converted from coloured image to a gray scale image for further operations. Histogram equalization is a technique for adjusting image intensities to enhance contrast. The colours of the image are retained using a convolutional neural network model which is trained by the datasets of underwater images to give better results.

Submitted 18 September, 2021; originally announced September 2021.

arXiv:2109.07557 [pdf, other]

CounterNet: End-to-End Training of Prediction Aware Counterfactual Explanations

Authors: Hangzhi Guo, Thanh Hong Nguyen, Amulya Yadav

Abstract: This work presents CounterNet, a novel end-to-end learning framework which integrates Machine Learning (ML) model training and the generation of corresponding counterfactual (CF) explanations into a single end-to-end pipeline. Counterfactual explanations offer a contrastive case, i.e., they attempt to find the smallest modification to the feature values of an instance that changes the prediction of the ML model on that instance to a predefined output. Prior techniques for generating CF explanations suffer from two major limitations: (i) all of them are post-hoc methods designed for use with proprietary ML models -- as a result, their procedure for generating CF explanations is uninformed by the training of the ML model, which leads to misalignment between model predictions and explanations; and (ii) most of them rely on solving separate time-intensive optimization problems to find CF explanations for each input data point (which negatively impacts their runtime). This work makes a novel departure from the prevalent post-hoc paradigm (of generating CF explanations) by presenting CounterNet, an end-to-end learning framework which integrates predictive model training and the generation of counterfactual (CF) explanations into a single pipeline. Unlike post-hoc methods, CounterNet enables the optimization of the CF explanation generation only once together with the predictive model. We adopt a block-wise coordinate descent procedure which helps in effectively training CounterNet's network. Our extensive experiments on multiple real-world datasets show that CounterNet generates high-quality predictions, and consistently achieves 100% CF validity and low proximity scores (thereby achieving a well-balanced cost-invalidity trade-off) for any new input instance, and runs 3X faster than existing state-of-the-art baselines.

Submitted 22 June, 2023; v1 submitted 15 September, 2021; originally announced September 2021.

Journal ref: Proceedings of the 29th SIGKDD Conference on Knowledge Discovery and Data Mining (KDD'23), 2023

arXiv:2105.12399 [pdf, other]

SentEmojiBot: Empathising Conversations Generation with Emojis

Authors: Akhilesh Ravi, Amit Yadav, Jainish Chauhan, Jatin Dholakia, Naman Jain, Mayank Singh

Abstract: The increasing use of dialogue agents makes it extremely desirable for them to understand and acknowledge implied emotions so that they can respond like humans, with empathy. Chatbots using traditional techniques analyze emotions based on the context and meaning of the text and lack an understanding of emotions expressed through the face. Emojis representing facial expressions present a promising way to express emotions. However, none of the AI systems utilizes emojis for empathetic conversation generation. We propose SentEmojiBot, based on the SentEmoji dataset, to generate empathetic conversations with a combination of emojis and text. Evaluation metrics show that the BERT-based model outperforms the vanilla transformer model. A user study indicates that the dialogues generated by our model were understandable and that adding emojis improved empathetic traits in conversations by 9.8%.

Submitted 26 May, 2021; originally announced May 2021.

arXiv:2104.04517 [pdf, other]

doi 10.7717/peerj-cs.786

AdCOFE: Advanced Contextual Feature Extraction in Conversations for emotion classification

Authors: Vaibhav Bhat, Anita Yadav, Sonal Yadav, Dhivya Chandrasekaran, Vijay Mago

Abstract: Emotion recognition in conversations is an important step in various virtual chat bots which require opinion-based feedback, like in social media threads, online support and many more applications. Current emotion recognition in conversations models face issues like (a) loss of contextual information in between two dialogues of a conversation, (b) failure to give appropriate importance to significant tokens in each utterance and (c) inability to pass on the emotional information from previous utterances. The proposed model of Advanced Contextual Feature Extraction (AdCOFE) addresses these issues by performing unique feature extraction using knowledge graphs, sentiment lexicons and phrases of natural language at all levels (word and position embedding) of the utterances. Experiments on the Emotion recognition in conversations dataset show that AdCOFE is beneficial in capturing emotions in conversations.

Submitted 9 April, 2021; originally announced April 2021.

Comments: 12 pages, to be published in PeerJ Computer Science Journal

Journal ref: PeerJ Computer Science, 2021

arXiv:2012.13318 [pdf]

Person Re-Identification using Deep Learning Networks: A Systematic Review

Authors: Ankit Yadav, Dinesh Kumar Vishwakarma

Abstract: Person re-identification has received a lot of attention from the research community in recent times. Due to its vital role in security based applications, person re-identification lies at the heart of research relevant to tracking robberies, preventing terrorist attacks and other security critical events. While the last decade has seen tremendous growth in re-id approaches, very little review literature exists to comprehend and summarize this progress. This review deals with the latest state-of-the-art deep learning based approaches for person re-identification. While the few existing re-id review works have analysed re-id techniques from a singular aspect, this review evaluates numerous re-id techniques from multiple deep learning aspects such as deep architecture types, common Re-Id challenges (variation in pose, lighting, view, scale, partial or complete occlusion, background clutter), multi-modal Re-Id, cross-domain Re-Id challenges, metric learning approaches and video Re-Id contributions. This review also includes several re-id benchmarks collected over the years, describing their characteristics, specifications and top re-id results obtained on them. The inclusion of the latest deep re-id works makes this a significant contribution to the re-id literature. Lastly, the conclusion and future directions are included.

Submitted 24 December, 2020; originally announced December 2020.

Comments: 34 pages, 15 figures

arXiv:2012.08256 [pdf]

doi 10.1145/3517139

A Deep Multi-Level Attentive network for Multimodal Sentiment Analysis

Authors: Ashima Yadav, Dinesh Kumar Vishwakarma

Abstract: Multimodal sentiment analysis has attracted increasing attention with broad application prospects. Existing methods focus on a single modality, which fails to capture social media content that spans multiple modalities. Moreover, in multi-modal learning, most works have focused on simply combining the two modalities, without exploring the complicated correlations between them. This has resulted in unsatisfactory performance for multimodal sentiment classification. Motivated by the status quo, we propose a Deep Multi-Level Attentive network, which exploits the correlation between image and text modalities to improve multimodal learning. Specifically, we generate the bi-attentive visual map along the spatial and channel dimensions to magnify CNNs' representation power. Then we model the correlation between the image regions and semantics of the word by extracting the textual features related to the bi-attentive visual features by applying semantic attention. Finally, self-attention is employed to automatically fetch the sentiment-rich multimodal features for the classification. We conduct extensive evaluations on four real-world datasets, namely, MVSA-Single, MVSA-Multiple, Flickr, and Getty Images, which verify the superiority of our method.

Submitted 15 December, 2020; originally announced December 2020.

Comments: 11 pages, 7 figures

Journal ref: ACM Transactions on Multimedia Computing, Communications, and Applications, 2022

arXiv:2011.14073 [pdf, other]

On Performance Comparison of Multi-Antenna HD-NOMA, SCMA and PD-NOMA Schemes

Authors: Animesh Yadav, Chen Quan, Pramod K. Varshney, H. Vincent Poor

Abstract: In this paper, we study the uplink channel throughput performance of a proposed novel multiple-antenna hybrid-domain non-orthogonal multiple access (MA-HD-NOMA) scheme. This scheme combines the conventional sparse code multiple access (SCMA) and power-domain NOMA (PD-NOMA) schemes in order to increase the number of users served as compared to conventional NOMA schemes and uses multiple antennas at the base station. To this end, a joint resource allocation problem for the MA-HD-NOMA scheme is formulated that maximizes the sum rate of the entire system. For a comprehensive comparison, the joint resource allocation problems for the multi-antenna SCMA (MA-SCMA) and multi-antenna PD-NOMA (MA-PD-NOMA) schemes with the same overloading factor are formulated as well. Each of the formulated problems is a mixed-integer non-convex program, and hence, we apply successive convex approximation (SCA)- and reweighted $\ell_1$ minimization-based approaches to obtain rapidly converging solutions. Numerical results reveal that the proposed MA-HD-NOMA scheme has superior performance compared to MA-SCMA and MA-PD-NOMA.

Submitted 28 November, 2020; originally announced November 2020.

Comments: Accepted to be Published in: IEEE Wireless Communications Letters

arXiv:2011.10358 [pdf]

A Deep Language-independent Network to analyze the impact of COVID-19 on the World via Sentiment Analysis

Authors: Ashima Yadav, Dinesh Kumar Vishwakarma

Abstract: Towards the end of 2019, Wuhan experienced an outbreak of novel coronavirus, which soon spread all over the world, resulting in a deadly pandemic that infected millions of people around the globe. The government and public health agencies followed many strategies to counter the fatal virus. However, the virus severely affected the social and economic lives of the people. In this paper, we extract and study the opinion of people from the top five worst affected countries by the virus, namely USA, Brazil, India, Russia, and South Africa. We propose a deep language-independent Multilevel Attention-based Conv-BiGRU network (MACBiG-Net), which includes an embedding layer, word-level encoded attention, and a sentence-level encoded attention mechanism to extract the positive, negative, and neutral sentiments. The embedding layer encodes the sentence sequence into a real-valued vector. The word-level and sentence-level encoding is performed by a 1D Conv-BiGRU based mechanism, followed by word-level and sentence-level attention, respectively. We further develop a COVID-19 Sentiment Dataset by crawling the tweets from Twitter. Extensive experiments on our proposed dataset demonstrate the effectiveness of the proposed MACBiG-Net. Also, attention-weights visualization and in-depth results analysis show that the proposed network has effectively captured the sentiments of the people.

Submitted 20 November, 2020; originally announced November 2020.

arXiv:2009.09559 [pdf, other]

Clinical trial of an AI-augmented intervention for HIV prevention in youth experiencing homelessness

Authors: Bryan Wilder, Laura Onasch-Vera, Graham Diguiseppi, Robin Petering, Chyna Hill, Amulya Yadav, Eric Rice, Milind Tambe

Abstract: Youth experiencing homelessness (YEH) are subject to substantially greater risk of HIV infection, compounded both by their lack of access to stable housing and the disproportionate representation of youth of marginalized racial, ethnic, and gender identity groups among YEH. A key goal for health equity is to improve adoption of protective behaviors in this population. One promising strategy for intervention is to recruit peer leaders from the population of YEH to promote behaviors such as condom usage and regular HIV testing to their social contacts. This raises a computational question: which youth should be selected as peer leaders to maximize the overall impact of the intervention? We developed an artificial intelligence system to optimize such social network interventions in a community health setting. We conducted a clinical trial enrolling 713 YEH at drop-in centers in a large US city. The clinical trial compared interventions planned with the algorithm to those where the highest-degree nodes in the youths' social network were recruited as peer leaders (the standard method in public health) and to an observation-only control group. Results from the clinical trial show that youth in the AI group experience statistically significant reductions in key risk behaviors for HIV transmission, while those in the other groups do not. This provides, to our knowledge, the first empirical validation of the usage of AI methods to optimize social network interventions for health. We conclude by discussing lessons learned over the course of the project which may inform future attempts to use AI in community-level interventions.

Submitted 6 November, 2020; v1 submitted 20 September, 2020; originally announced September 2020.

Report number: Accepted at AAAI 2021

arXiv:2007.07747 [pdf, other]

Preliminary Results from a Peer-Led, Social Network Intervention, Augmented by Artificial Intelligence to Prevent HIV among Youth Experiencing Homelessness

Authors: Eric Rice, Laura Onasch-Vera, Graham T. DiGuiseppi, Bryan Wilder, Robin Petering, Chyna Hill, Amulya Yadav, Milind Tambe

Abstract: Each year, there are nearly 4 million youth experiencing homelessness (YEH) in the United States with HIV prevalence ranging from 3 to 11.5%. Peer change agent (PCA) models for HIV prevention have been used successfully in many populations, but there have been notable failures. In recent years, network interventionists have suggested that these failures could be attributed to PCA selection procedures. The change agents themselves who are selected to do the PCA work can often be as important as the messages they convey. To address this concern, we tested a new PCA intervention for YEH, with three arms: (1) an arm using an artificial intelligence (AI) planning algorithm to select PCA, (2) a popularity arm--the standard PCA approach--operationalized as highest degree centrality (DC), and (3) an observation only comparison group (OBS). PCA models that promote HIV testing, HIV knowledge, and condom use are efficacious for YEH. Both the AI and DC arms showed improvements over time. AI-based PCA selection led to better outcomes and increased the speed of intervention effects. Specifically, the changes in behavior observed in the AI arm occurred by 1 month, but not until 3 months in the DC arm. Given the transient nature of YEH and the high risk for HIV infection, more rapid intervention effects are desirable.

Submitted 10 July, 2020; originally announced July 2020.

arXiv:2007.01580 [pdf, ps, other]

On the Similarity between the Laplace and Neural Tangent Kernels

Authors: Amnon Geifman, Abhay Yadav, Yoni Kasten, Meirav Galun, David Jacobs, Ronen Basri

Abstract: Recent theoretical work has shown that massively overparameterized neural networks are equivalent to kernel regressors that use Neural Tangent Kernels (NTK). Experiments show that these kernel methods perform similarly to real neural networks. Here we show that NTK for fully connected networks is closely related to the standard Laplace kernel. We show theoretically that for normalized data on the hypersphere both kernels have the same eigenfunctions and their eigenvalues decay polynomially at the same rate, implying that their Reproducing Kernel Hilbert Spaces (RKHS) include the same sets of functions. This means that both kernels give rise to classes of functions with the same smoothness properties. The two kernels differ for data off the hypersphere, but experiments indicate that when data is properly normalized these differences are not significant. Finally, we provide experiments on real data comparing NTK and the Laplace kernel, along with a larger class of γ-exponential kernels. We show that these perform almost identically. Our results suggest that much insight about neural networks can be obtained from analysis of the well-known Laplace kernel, which has a simple closed-form.

Submitted 14 November, 2020; v1 submitted 3 July, 2020; originally announced July 2020.

arXiv:2006.06865 [pdf, other]

Exploring Algorithmic Fairness in Robust Graph Covering Problems

Authors: Aida Rahmattalabi, Phebe Vayanos, Anthony Fulginiti, Eric Rice, Bryan Wilder, Amulya Yadav, Milind Tambe

Abstract: Fueled by algorithmic advances, AI algorithms are increasingly being deployed in settings subject to unanticipated challenges with complex social effects. Motivated by real-world deployment of AI driven, social-network based suicide prevention and landslide risk management interventions, this paper focuses on robust graph covering problems subject to group fairness constraints. We show that, in the absence of fairness constraints, state-of-the-art algorithms for the robust graph covering problem result in biased node coverage: they tend to discriminate individuals (nodes) based on membership in traditionally marginalized groups. To mitigate this issue, we propose a novel formulation of the robust graph covering problem with group fairness constraints and a tractable approximation scheme applicable to real-world instances. We provide a formal analysis of the price of group fairness (PoF) for this problem, where we show that uncertainty can lead to greater PoF. We demonstrate the effectiveness of our approach on several real-world social networks. Our method yields competitive node coverage while significantly improving group fairness relative to state-of-the-art methods.

Submitted 11 June, 2020; originally announced June 2020.

Comments: Accepted at 2019 Conference on Neural Information Processing Systems

Journal ref: year=2019, pages=15750 to 15761

Showing 1–50 of 75 results for author: Yadav, A