Search | arXiv e-print repository

Code-mixed Sentiment and Hate-speech Prediction

Authors: Anjali Yadav, Tanya Garg, Matej Klemen, Matej Ulcar, Basant Agarwal, Marko Robnik Sikonja

Abstract: Code-mixed discourse combines multiple languages in a single text. It is commonly used in informal discourse in countries with several official languages, but also in many other countries in combination with English or neighboring languages. As recently large language models have dominated most natural language processing tasks, we investigated their performance in code-mixed settings for relevant tasks. We first created four new bilingual pre-trained masked language models for English-Hindi and English-Slovene languages, specifically aimed to support informal language. Then we performed an evaluation of monolingual, bilingual, few-lingual, and massively multilingual models on several languages, using two tasks that frequently contain code-mixed text, in particular, sentiment analysis and offensive language detection in social media texts. The results show that the most successful classifiers are fine-tuned bilingual models and multilingual models, specialized for social media texts, followed by non-specialized massively multilingual and monolingual models, while huge generative models are not competitive. For our affective problems, the models mostly perform slightly better on code-mixed data compared to non-code-mixed data.

Submitted 21 May, 2024; originally announced May 2024.

arXiv:2310.11001 [pdf, other]

Spatially-resolved hyperlocal weather prediction and anomaly detection using IoT sensor networks and machine learning techniques

Authors: Anita B. Agarwal, Rohit Rajesh, Nitin Arul

Abstract: Accurate and timely hyperlocal weather predictions are essential for various applications, ranging from agriculture to disaster management. In this paper, we propose a novel approach that combines hyperlocal weather prediction and anomaly detection using IoT sensor networks and advanced machine learning techniques. Our approach leverages data from multiple spatially-distributed yet relatively close locations and IoT sensors to create high-resolution weather models capable of predicting short-term, localized weather conditions such as temperature, pressure, and humidity. By monitoring changes in weather parameters across these locations, our system is able to enhance the spatial resolution of predictions and effectively detect anomalies in real-time. Additionally, our system employs unsupervised learning algorithms to identify unusual weather patterns, providing timely alerts. Our findings indicate that this system has the potential to enhance decision-making.

Submitted 17 October, 2023; originally announced October 2023.

Comments: Submitted to IEEE Modelling Simulation & Intelligent Computing, 2023

arXiv:2106.14350 [pdf]

Deep Learning Image Recognition for Non-images

Authors: Boris Kovalerchuk, Divya Chandrika Kalla, Bedant Agarwal

Abstract: Powerful deep learning algorithms open an opportunity for solving non-image Machine Learning (ML) problems by transforming these problems to into the image recognition problems. The CPC-R algorithm presented in this chapter converts non-image data into images by visualizing non-image data. Then deep learning CNN algorithms solve the learning problems on these images. The design of the CPC-R algorithm allows preserving all high-dimensional information in 2-D images. The use of pair values mapping instead of single value mapping used in the alternative approaches allows encoding each n-D point with 2 times fewer visual elements. The attributes of an n-D point are divided into pairs of its values and each pair is visualized as 2-D points in the same 2-D Cartesian coordinates. Next, grey scale or color intensity values are assigned to each pair to encode the order of pairs. This is resulted in the heatmap image. The computational experiments with CPC-R are conducted for different CNN architectures, and methods to optimize the CPC-R images showing that the combined CPC-R and deep learning CNN algorithms are able to solve non-image ML problems reaching high accuracy on the benchmark datasets. This chapter expands our prior work by adding more experiments to test accuracy of classification, exploring saliency and informativeness of discovered features to test their interpretability, and generalizing the approach.

Submitted 9 February, 2022; v1 submitted 27 June, 2021; originally announced June 2021.

Comments: 33 pages, 17 figures, 18 tables

arXiv:2104.13352 [pdf]

Tracking Peaceful Tractors on Social Media -- XAI-enabled analysis of Red Fort Riots 2021

Authors: Ajay Agarwal, Basant Agarwal

Abstract: On 26 January 2021, India witnessed a national embarrassment from the demographic least expected from - farmers. People across the nation watched in horror as a pseudo-patriotic mob of farmers stormed capital Delhi and vandalized the national pride- Red Fort. Investigations that followed the event revealed the existence of a social media trail that led to the likes of such an event. Consequently, it became essential and necessary to archive this trail for social media analysis - not only to understand the bread-crumbs that are dispersed across the trail but also to visualize the role played by misinformation and fake news in this event. In this paper, we propose the tractor2twitter dataset which contains around 0.05 million tweets that were posted before, during, and after this event. Also, we benchmark our dataset with an Explainable AI ML model for classification of each tweet into either of the three categories - disinformation, misinformation, and opinion.

Submitted 13 June, 2021; v1 submitted 24 April, 2021; originally announced April 2021.

arXiv:1901.07038 [pdf, other]

doi 10.1103/PhysRevD.100.064003

Physics of eccentric binary black hole mergers: A numerical relativity perspective

Authors: E. A. Huerta, Roland Haas, Sarah Habib, Anushri Gupta, Adam Rebei, Vishnu Chavva, Daniel Johnson, Shawn Rosofsky, Erik Wessel, Bhanu Agarwal, Diyu Luo, Wei Ren

Abstract: Gravitational wave observations of eccentric binary black hole mergers will provide unequivocal evidence for the formation of these systems through dynamical assembly in dense stellar environments. The study of these astrophysically motivated sources is timely in view of electromagnetic observations, consistent with the existence of stellar mass black holes in the globular cluster M22 and in the Galactic center, and the proven detection capabilities of ground-based gravitational wave detectors. In order to get insights into the physics of these objects in the dynamical, strong-field gravity regime, we present a catalog of 89 numerical relativity waveforms that describe binary systems of non-spinning black holes with mass-ratios $1\leq q \leq 10$, and initial eccentricities as high as $e_0=0.18$ fifteen cycles before merger. We use this catalog to quantify the loss of energy and angular momentum through gravitational radiation, and the astrophysical properties of the black hole remnant, including its final mass and spin, and recoil velocity. We discuss the implications of these results for gravitational wave source modeling, and the design of algorithms to search for and identify eccentric binary black hole mergers in realistic detection scenarios.

Submitted 5 September, 2019; v1 submitted 21 January, 2019; originally announced January 2019.

Comments: 11 pages, 5 figures, 2 appendices. A visualization of this numerical relativity waveform catalog is available at https://gravity.ncsa.illinois.edu/products/outreach/; v2: 13 pages, 5 figures, calculations for angular momentum emission and recoil velocities are now included, references added. Accepted to Phys. Rev. D

ACM Class: J.2

Journal ref: Phys. Rev. D 100, 064003 (2019)

arXiv:1712.02820 [pdf, other]

doi 10.1016/j.ipm.2018.06.005

A Deep Network Model for Paraphrase Detection in Short Text Messages

Authors: Basant Agarwal, Heri Ramampiaro, Helge Langseth, Massimiliano Ruocco

Abstract: This paper is concerned with paraphrase detection. The ability to detect similar sentences written in natural language is crucial for several applications, such as text mining, text summarization, plagiarism detection, authorship authentication and question answering. Given two sentences, the objective is to detect whether they are semantically identical. An important insight from this work is that existing paraphrase systems perform well when applied on clean texts, but they do not necessarily deliver good performance against noisy texts. Challenges with paraphrase detection on user generated short texts, such as Twitter, include language irregularity and noise. To cope with these challenges, we propose a novel deep neural network-based approach that relies on coarse-grained sentence modeling using a convolutional neural network and a long short-term memory model, combined with a specific fine-grained word-level similarity matching model. Our experimental results show that the proposed approach outperforms existing state-of-the-art approaches on user-generated noisy social media data, such as Twitter texts, and achieves highly competitive performance on a cleaner corpus.

Submitted 7 December, 2017; originally announced December 2017.

Journal ref: B Agarwal, H. Ramampiaro, H Langseth, M Ruocco, (2018), "A Deep Network Model for Paraphrase Detection in Short Text Messages". In Information Processing & Management Journal (IPM), 54(6), pp. 922-937. Elsevier

arXiv:1404.1664 [pdf]

Icon Based Information Retrieval and Disease Identification in Agriculture

Authors: Namita Mittal, Basant Agarwal, Ajay Gupta, Hemant Madhur

Abstract: Recent developments in the ICT industry in past few decades has enabled the quick and easy access to the information available on the internet. But, digital literacy is the pre-requisite for its use. The main purpose of this paper is to provide an interface for digitally illiterate users, especially farmers to efficiently and effectively retrieve information through Internet. In addition, to enable the farmers to identify the disease in their crop, its cause and symptoms using digital image processing and pattern recognition instantly without waiting for an expert to visit the farms and identify the disease.

Submitted 7 April, 2014; originally announced April 2014.

Comments: Iconic Interface, Image Processing, Pattern Recognition, Data Mining, Information Retrieval

Journal ref: International Journal of Advanced Studies in Computer Science & Engineering IJASCSE, Volume 3, Issue 3, 2014

Showing 1–7 of 7 results for author: Agarwal, B