Exemplar-free Continual Representation Learning via Learnable Drift Compensation

Authors: Alex Gomez-Villa, Dipam Goswami, Kai Wang, Andrew D. Bagdanov, Bartlomiej Twardowski, Joost van de Weijer

Abstract: Exemplar-free class-incremental learning using a backbone trained from scratch and starting from a small first task presents a significant challenge for continual representation learning. Prototype-based approaches, when continually updated, face the critical issue of semantic drift due to which the old class prototypes drift to different positions in the new feature space. Through an analysis of prototype-based continual learning, we show that forgetting is not due to diminished discriminative power of the feature extractor, and can potentially be corrected by drift compensation. To address this, we propose Learnable Drift Compensation (LDC), which can effectively mitigate drift in any moving backbone, whether supervised or unsupervised. LDC is fast and straightforward to integrate on top of existing continual learning approaches. Furthermore, we showcase how LDC can be applied in combination with self-supervised CL methods, resulting in the first exemplar-free semi-supervised continual learning approach. We achieve state-of-the-art performance in both supervised and semi-supervised settings across multiple datasets. Code is available at https://github.com/alviur/ldc.

Submitted 11 July, 2024; originally announced July 2024.

Comments: Accepted to ECCV 2024

arXiv:2407.00581 [pdf, other]

MasonTigers at SemEval-2024 Task 10: Emotion Discovery and Flip Reasoning in Conversation with Ensemble of Transformers and Prompting

Authors: Al Nahian Bin Emran, Amrita Ganguly, Sadiya Sayara Chowdhury Puspo, Nishat Raihan, Dhiman Goswami

Abstract: In this paper, we present MasonTigers' participation in SemEval-2024 Task 10, a shared task aimed at identifying emotions and understanding the rationale behind their flips within monolingual English and Hindi-English code-mixed dialogues. This task comprises three distinct subtasks - emotion recognition in conversation for Hindi-English code-mixed dialogues, emotion flip reasoning for Hindi-English code-mixed dialogues, and emotion flip reasoning for English dialogues. Our team, MasonTigers, contributed to each subtask, focusing on developing methods for accurate emotion recognition and reasoning. By leveraging our approaches, we attained impressive F1-scores of 0.78 for the first task and 0.79 for both the second and third tasks. This performance not only underscores the effectiveness of our methods across different aspects of the task but also secured us the top rank in the first and third subtasks, and the 2nd rank in the second subtask. Through extensive experimentation and analysis, we provide insights into our system's performance and contributions to each subtask.

Submitted 29 June, 2024; originally announced July 2024.

arXiv:2407.00535 [pdf, other]

AI-powered multimodal modeling of personalized hemodynamics in aortic stenosis

Authors: Caglar Ozturk, Daniel H. Pak, Luca Rosalia, Debkalpa Goswami, Mary E. Robakowski, Raymond McKay, Christopher T. Nguyen, James S. Duncan, Ellen T. Roche

Abstract: Aortic stenosis (AS) is the most common valvular heart disease in developed countries. High-fidelity preclinical models can improve AS management by enabling therapeutic innovation, early diagnosis, and tailored treatment planning. However, their use is currently limited by complex workflows necessitating lengthy expert-driven manual operations. Here, we propose an AI-powered computational framework for accelerated and democratized patient-specific modeling of AS hemodynamics from computed tomography. First, we demonstrate that our automated meshing algorithms can generate task-ready geometries for both computational and benchtop simulations with higher accuracy and 100 times faster than existing approaches. Then, we show that our approach can be integrated with fluid-structure interaction and soft robotics models to accurately recapitulate a broad spectrum of clinical hemodynamic measurements of diverse AS patients. The efficiency and reliability of these algorithms make them an ideal complementary tool for personalized high-fidelity modeling of AS biomechanics, hemodynamics, and treatment planning.

Submitted 29 June, 2024; originally announced July 2024.

Comments: CO and DHP contributed equally to this work. JSD and ETR are corresponding authors

arXiv:2405.19074 [pdf, other]

Resurrecting Old Classes with New Data for Exemplar-Free Continual Learning

Authors: Dipam Goswami, Albin Soutif-Cormerais, Yuyang Liu, Sandesh Kamath, Bartłomiej Twardowski, Joost van de Weijer

Abstract: Continual learning methods are known to suffer from catastrophic forgetting, a phenomenon that is particularly hard to counter for methods that do not store exemplars of previous tasks. Therefore, to reduce potential drift in the feature extractor, existing exemplar-free methods are typically evaluated in settings where the first task is significantly larger than subsequent tasks. Their performance drops drastically in more challenging settings starting with a smaller first task. To address this problem of feature drift estimation for exemplar-free methods, we propose to adversarially perturb the current samples such that their embeddings are close to the old class prototypes in the old model embedding space. We then estimate the drift in the embedding space from the old to the new model using the perturbed images and compensate the prototypes accordingly. We exploit the fact that adversarial samples are transferable from the old to the new feature space in a continual learning setting. The generation of these images is simple and computationally cheap. We demonstrate in our experiments that the proposed approach better tracks the movement of prototypes in embedding space and outperforms existing methods on several standard continual learning benchmarks as well as on fine-grained datasets. Code is available at https://github.com/dipamgoswami/ADC.

Submitted 29 May, 2024; originally announced May 2024.

Comments: Accepted at CVPR 2024

arXiv:2405.06922 [pdf, other]

EmoMix-3L: A Code-Mixed Dataset for Bangla-English-Hindi Emotion Detection

Authors: Nishat Raihan, Dhiman Goswami, Antara Mahmud, Antonios Anastasopoulos, Marcos Zampieri

Abstract: Code-mixing is a well-studied linguistic phenomenon that occurs when two or more languages are mixed in text or speech. Several studies have been conducted on building datasets and performing downstream NLP tasks on code-mixed data. Although it is not uncommon to observe code-mixing of three or more languages, most available datasets in this domain contain code-mixed data from only two languages. In this paper, we introduce EmoMix-3L, a novel multi-label emotion detection dataset containing code-mixed data from three different languages. We experiment with several models on EmoMix-3L and we report that MuRIL outperforms other models on this dataset.

Submitted 11 May, 2024; originally announced May 2024.

Comments: arXiv admin note: substantial text overlap with arXiv:2310.18387, arXiv:2310.18023

arXiv:2404.06622 [pdf, other]

Calibrating Higher-Order Statistics for Few-Shot Class-Incremental Learning with Pre-trained Vision Transformers

Authors: Dipam Goswami, Bartłomiej Twardowski, Joost van de Weijer

Abstract: Few-shot class-incremental learning (FSCIL) aims to adapt the model to new classes from very few data (5 samples) without forgetting the previously learned classes. Recent works in many-shot CIL (MSCIL) (using all available training data) exploited pre-trained models to reduce forgetting and achieve better plasticity. In a similar fashion, we use ViT models pre-trained on large-scale datasets for few-shot settings, which face the critical issue of low plasticity. FSCIL methods start with a many-shot first task to learn a very good feature extractor and then move to the few-shot setting from the second task onwards. While the focus of most recent studies is on how to learn the many-shot first task so that the model generalizes to all future few-shot tasks, we explore in this work how to better model the few-shot data using pre-trained models, irrespective of how the first task is trained. Inspired by recent works in MSCIL, we explore how using higher-order feature statistics can influence the classification of few-shot classes. We identify the main challenge of obtaining a good covariance matrix from few-shot data and propose to calibrate the covariance matrix for new classes based on semantic similarity to the many-shot base classes. Using the calibrated feature statistics in combination with existing methods significantly improves few-shot continual classification on several FSCIL benchmarks. Code is available at https://github.com/dipamgoswami/FSCIL-Calibration.

Submitted 9 April, 2024; originally announced April 2024.

Comments: Accepted at CLVision workshop (CVPR 2024)

arXiv:2404.02540 [pdf, ps, other]

CSEPrompts: A Benchmark of Introductory Computer Science Prompts

Authors: Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Christian Newman, Tharindu Ranasinghe, Marcos Zampieri

Abstract: Recent advances in AI, machine learning, and NLP have led to the development of a new generation of Large Language Models (LLMs) that are trained on massive amounts of data and often have trillions of parameters. Commercial applications (e.g., ChatGPT) have made this technology available to the general public, thus making it possible to use LLMs to produce high-quality texts for academic and professional purposes. Schools and universities are aware of the increasing use of AI-generated content by students and they have been researching the impact of this new technology and its potential misuse. Educational programs in Computer Science (CS) and related fields are particularly affected because LLMs are also capable of generating programming code in various programming languages. To help understand the potential impact of publicly available LLMs in CS education, we introduce CSEPrompts, a framework with hundreds of programming exercise prompts and multiple-choice questions retrieved from introductory CS and programming courses. We also provide experimental results on CSEPrompts to evaluate the performance of several LLMs with respect to generating Python code and answering basic computer science and programming questions.

Submitted 4 April, 2024; v1 submitted 3 April, 2024; originally announced April 2024.

arXiv:2403.19806 [pdf, other]

Feature-Based Echo-State Networks: A Step Towards Interpretability and Minimalism in Reservoir Computer

Authors: Debdipta Goswami

Abstract: This paper proposes a novel and interpretable recurrent neural-network structure using the echo-state network (ESN) paradigm for time-series prediction. While traditional ESNs perform well for dynamical systems prediction, they need a large dynamic reservoir with increased computational complexity. They also lack the interpretability needed to discern contributions from different input combinations to the output. Here, a systematic reservoir architecture is developed using smaller parallel reservoirs driven by different input combinations, known as features, which are then nonlinearly combined to produce the output. The resultant feature-based ESN (Feat-ESN) outperforms the traditional single-reservoir ESN with fewer reservoir nodes. The predictive capability of the proposed architecture is demonstrated on three systems: two synthetic datasets from chaotic dynamical systems and a set of real-time traffic data.

Submitted 28 March, 2024; originally announced March 2024.

Comments: 6 pages, 12 figures, 1 table. arXiv admin note: substantial text overlap with arXiv:2304.00198, arXiv:2211.05992

arXiv:2403.14990 [pdf, other]

MasonTigers at SemEval-2024 Task 1: An Ensemble Approach for Semantic Textual Relatedness

Authors: Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Md Nishat Raihan, Al Nahian Bin Emran, Amrita Ganguly, Marcos Zampieri

Abstract: This paper presents the MasonTigers entry to the SemEval-2024 Task 1 - Semantic Textual Relatedness. The task encompasses supervised (Track A), unsupervised (Track B), and cross-lingual (Track C) approaches across 14 different languages. MasonTigers stands out as one of the two teams that participated in all languages across the three tracks. Our approaches achieved rankings ranging from 11th to 21st in Track A, from 1st to 8th in Track B, and from 5th to 12th in Track C. Adhering to the task-specific constraints, our best performing approaches utilize an ensemble of statistical machine learning approaches combined with language-specific BERT-based models and sentence transformers.

Submitted 5 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.14989 [pdf, other]

MasonTigers at SemEval-2024 Task 8: Performance Analysis of Transformer-based Models on Machine-Generated Text Detection

Authors: Sadiya Sayara Chowdhury Puspo, Md Nishat Raihan, Dhiman Goswami, Al Nahian Bin Emran, Amrita Ganguly, Ozlem Uzuner

Abstract: This paper presents the MasonTigers entry to the SemEval-2024 Task 8 - Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection. The task encompasses Binary Human-Written vs. Machine-Generated Text Classification (Track A), Multi-Way Machine-Generated Text Classification (Track B), and Human-Machine Mixed Text Detection (Track C). Our best performing approaches mainly utilize an ensemble of discriminator transformer models, along with sentence transformer and statistical machine learning approaches in specific cases. Moreover, zero-shot prompting and fine-tuning of FLAN-T5 are used for Tracks A and B.

Submitted 5 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.14982 [pdf, other]

MasonTigers at SemEval-2024 Task 9: Solving Puzzles with an Ensemble of Chain-of-Thoughts

Authors: Md Nishat Raihan, Dhiman Goswami, Al Nahian Bin Emran, Sadiya Sayara Chowdhury Puspo, Amrita Ganguly, Marcos Zampieri

Abstract: Our paper presents team MasonTigers' submission to the SemEval-2024 Task 9, which provides a dataset of puzzles for testing natural language understanding. We employ large language models (LLMs) to solve this task through several prompting techniques. Zero-shot and few-shot prompting generate reasonably good results when tested with proprietary LLMs, compared to the open-source models. We obtain further improved results with chain-of-thought prompting, an iterative prompting method that breaks down the reasoning process step-by-step. We obtain our best results by utilizing an ensemble of chain-of-thought prompts, placing 2nd in the word puzzle subtask and 13th in the sentence puzzle subtask. The strong performance of prompted LLMs demonstrates their capability for complex reasoning when provided with a decomposition of the thought process. Our work sheds light on how step-wise explanatory prompts can unlock more of the knowledge encoded in the parameters of large models.

Submitted 3 April, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

arXiv:2403.12335 [pdf, other]

Temporally-Consistent Koopman Autoencoders for Forecasting Dynamical Systems

Authors: Indranil Nayak, Debdipta Goswami, Mrinal Kumar, Fernando Teixeira

Abstract: Absence of sufficiently high-quality data often poses a key challenge in data-driven modeling of high-dimensional spatio-temporal dynamical systems. Koopman Autoencoders (KAEs) harness the expressivity of deep neural networks (DNNs), the dimension reduction capabilities of autoencoders, and the spectral properties of the Koopman operator to learn a reduced-order feature space with simpler, linear dynamics. However, the effectiveness of KAEs is hindered by limited and noisy training datasets, leading to poor generalizability. To address this, we introduce the Temporally-Consistent Koopman Autoencoder (tcKAE), designed to generate accurate long-term predictions even with constrained and noisy training data. This is achieved through a consistency regularization term that enforces prediction coherence across different time steps, thus enhancing the robustness and generalizability of tcKAE over existing models. We provide analytical justification for this approach based on Koopman spectral theory and empirically demonstrate tcKAE's superior performance over state-of-the-art KAE models across a variety of test cases, including simple pendulum oscillations, kinetic plasmas, fluid flows, and sea surface temperature data.

Submitted 18 March, 2024; originally announced March 2024.

arXiv:2403.00306 [pdf, other]

qPMS Sigma -- An Efficient and Exact Parallel Algorithm for the Planted $(l, d)$ Motif Search Problem

Authors: Saurav Dhar, Amlan Saha, Dhiman Goswami, Md. Abul Kashem Mia

Abstract: Motif finding is an important step for the detection of rare events occurring in a set of DNA or protein sequences. Extraction of information about these rare events can lead to new biological discoveries. Motifs are important patterns that have numerous applications including the identification of transcription factors and their binding sites, composite regulatory patterns, similarity between families of proteins, etc. Although several flavors of motif searching algorithms have been studied in the literature, we study the version known as $(l, d)$-motif search or Planted Motif Search (PMS). In PMS, given two integers $l$, $d$ and $n$ input sequences, we try to find all the patterns of length $l$ that appear in each of the $n$ input sequences with at most $d$ mismatches. We also discuss the quorum version of PMS in our work, which finds motifs that are not planted in all the input sequences but in at least $q$ of the sequences. Our algorithm is mainly based on the algorithms qPMSPrune, qPMS7, TraverStringRef and PMS8. We introduce some techniques to compress the input strings and make comparisons between strings faster with bitwise operations. Our algorithm performs a little better than the existing exact algorithms for solving the qPMS problem on DNA sequences. We have also proposed an idea for a parallel implementation of our algorithm.

Submitted 1 March, 2024; originally announced March 2024.

arXiv:2402.01976 [pdf, other]

MasonPerplexity at ClimateActivism 2024: Integrating Advanced Ensemble Techniques and Data Augmentation for Climate Activism Stance and Hate Event Identification

Authors: Al Nahian Bin Emran, Amrita Ganguly, Sadiya Sayara Chowdhury Puspo, Dhiman Goswami, Md Nishat Raihan

Abstract: The task of identifying public opinions on social media, particularly regarding climate activism and the detection of hate events, has emerged as a critical area of research in our rapidly changing world. With a growing number of people voicing support for or opposition to climate-related issues, understanding these diverse viewpoints has become increasingly vital. Our team, MasonPerplexity, participates in a significant research initiative focused on this subject. We extensively test various models and methods, discovering that our most effective results are achieved through ensemble modeling, enhanced by data augmentation techniques like back-translation. In the specific components of this research task, our team achieved notable positions, ranking 5th, 1st, and 6th in the respective sub-tasks, thereby illustrating the effectiveness of our approach in this important field of study.

Submitted 2 February, 2024; originally announced February 2024.

arXiv:2402.01967 [pdf, other]

MasonPerplexity at Multimodal Hate Speech Event Detection 2024: Hate Speech and Target Detection Using Transformer Ensembles

Authors: Amrita Ganguly, Al Nahian Bin Emran, Sadiya Sayara Chowdhury Puspo, Md Nishat Raihan, Dhiman Goswami, Marcos Zampieri

Abstract: The automatic identification of offensive language such as hate speech is important to keep discussions civil in online communities. Identifying hate speech in multimodal content is a particularly challenging task because offensiveness can be manifested in either words or images or a juxtaposition of the two. This paper presents the MasonPerplexity submission for the Shared Task on Multimodal Hate Speech Event Detection at CASE 2024 at EACL 2024. The task is divided into two sub-tasks: sub-task A focuses on the identification of hate speech and sub-task B focuses on the identification of targets in text-embedded images during political events. We use an XLM-roBERTa-large model for sub-task A and an ensemble approach combining XLM-roBERTa-base, BERTweet-large, and BERT-base for sub-task B. Our approach obtained 0.8347 F1-score in sub-task A and 0.6741 F1-score in sub-task B ranking 3rd on both sub-tasks.

Submitted 18 February, 2024; v1 submitted 2 February, 2024; originally announced February 2024.

arXiv:2401.14681 [pdf, other]

MasonTigers@LT-EDI-2024: An Ensemble Approach Towards Detecting Homophobia and Transphobia in Social Media Comments

Authors: Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Md Nishat Raihan, Al Nahian Bin Emran

Abstract: In this paper, we describe our approaches and results for Task 2 of the LT-EDI 2024 Workshop, aimed at detecting homophobia and/or transphobia across ten languages. Our methodologies include monolingual transformers and ensemble methods, capitalizing on the strengths of each to enhance the performance of the models. The ensemble models worked well, placing our team, MasonTigers, in the top five for eight of the ten languages, as measured by the macro F1 score. Our work emphasizes the efficacy of ensemble methods in multilingual scenarios, addressing the complexities of language-specific tasks.

Submitted 15 February, 2024; v1 submitted 26 January, 2024; originally announced January 2024.

arXiv:2311.15032 [pdf, other]

nlpBDpatriots at BLP-2023 Task 2: A Transfer Learning Approach to Bangla Sentiment Analysis

Authors: Dhiman Goswami, Md Nishat Raihan, Sadiya Sayara Chowdhury Puspo, Marcos Zampieri

Abstract: In this paper, we discuss the nlpBDpatriots entry to the shared task on Sentiment Analysis of Bangla Social Media Posts organized at the first workshop on Bangla Language Processing (BLP) co-located with EMNLP. The main objective of this task is to identify the polarity of social media content using a Bangla dataset annotated with positive, neutral, and negative labels provided by the shared task organizers. Our best system for this task is a transfer learning approach with data augmentation which achieved a micro F1 score of 0.71. Our best system ranked 12th among 30 teams that participated in the competition.

Submitted 25 November, 2023; originally announced November 2023.

arXiv:2311.15029 [pdf, other]

nlpBDpatriots at BLP-2023 Task 1: A Two-Step Classification for Violence Inciting Text Detection in Bangla

Authors: Md Nishat Raihan, Dhiman Goswami, Sadiya Sayara Chowdhury Puspo, Marcos Zampieri

Abstract: In this paper, we discuss the nlpBDpatriots entry to the shared task on Violence Inciting Text Detection (VITD) organized as part of the first workshop on Bangla Language Processing (BLP) co-located with EMNLP. The aim of this task is to identify and classify violent threats that provoke further unlawful violent acts. Our best-performing approach for the task is a two-step classification using back translation and multilinguality, which ranked 6th out of 27 teams with a macro F1 score of 0.74.

Submitted 25 November, 2023; originally announced November 2023.

arXiv:2310.18387 [pdf, other]

OffMix-3L: A Novel Code-Mixed Dataset in Bangla-English-Hindi for Offensive Language Identification

Authors: Dhiman Goswami, Md Nishat Raihan, Antara Mahmud, Antonios Anastasopoulos, Marcos Zampieri

Abstract: Code-mixing is a well-studied linguistic phenomenon that occurs when two or more languages are mixed in text or speech. Several works have been conducted on building datasets and performing downstream NLP tasks on code-mixed data. Although it is not uncommon to observe code-mixing of three or more languages, most available datasets in this domain contain code-mixed data from only two languages. In this paper, we introduce OffMix-3L, a novel offensive language identification dataset containing code-mixed data from three different languages. We experiment with several models on this dataset and observe that BanglishBERT outperforms other transformer-based models and GPT-3.5.

Submitted 25 November, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2310.18023

arXiv:2310.18023 [pdf, other]

SentMix-3L: A Bangla-English-Hindi Code-Mixed Dataset for Sentiment Analysis

Authors: Md Nishat Raihan, Dhiman Goswami, Antara Mahmud, Antonios Anastasopoulos, Marcos Zampieri

Abstract: Code-mixing is a well-studied linguistic phenomenon when two or more languages are mixed in text or speech. Several datasets have been built with the goal of training computational models for code-mixing. Although it is very common to observe code-mixing with multiple languages, most datasets available contain code-mixed data between only two languages. In this paper, we introduce SentMix-3L, a novel dataset for sentiment analysis containing code-mixed data between three languages: Bangla, English, and Hindi. We carry out a comprehensive evaluation using SentMix-3L. We show that zero-shot prompting with GPT-3.5 outperforms all transformer-based models on SentMix-3L.

Submitted 29 November, 2023; v1 submitted 27 October, 2023; originally announced October 2023.

arXiv:2310.13533 [pdf, other]

Technical Report for ICCV 2023 Visual Continual Learning Challenge: Continuous Test-time Adaptation for Semantic Segmentation

Authors: Damian Sójka, Yuyang Liu, Dipam Goswami, Sebastian Cygert, Bartłomiej Twardowski, Joost van de Weijer

Abstract: The goal of the challenge is to develop a test-time adaptation (TTA) method, which could adapt the model to gradually changing domains in video sequences for the semantic segmentation task. It is based on a synthetic driving video dataset, SHIFT. The source model is trained on images taken during daytime in clear weather. Domain changes at test-time are mainly caused by varying weather conditions and times of day. The TTA methods are evaluated in each image sequence (video) separately, meaning the model is reset to the source model state before the next sequence. Images come one by one and a prediction has to be made at the arrival of each frame. Each sequence is composed of 401 images and starts with the source domain, then gradually drifts to a different one (changing weather or time of day) until the middle of the sequence. In the second half of the sequence, the domain gradually shifts back to the source one. Ground truth data is available only for the validation split of the SHIFT dataset, in which there are only six sequences that start and end with the source domain. We conduct an analysis specifically on those sequences. Ground truth data for the test split, on which the developed TTA methods are evaluated for leaderboard ranking, are not publicly available. The proposed solution secured 3rd place in the challenge and received an innovation award. Contrary to the solutions that scored better, we did not use any external pretrained models or specialized data augmentations, to keep the solutions as general as possible. We have focused on analyzing the distributional shift and developing a method that could adapt to changing data dynamics and generalize across different scenarios.

Submitted 20 October, 2023; originally announced October 2023.

arXiv:2309.14062 [pdf, other]

FeCAM: Exploiting the Heterogeneity of Class Distributions in Exemplar-Free Continual Learning

Authors: Dipam Goswami, Yuyang Liu, Bartłomiej Twardowski, Joost van de Weijer

Abstract: Exemplar-free class-incremental learning (CIL) poses several challenges since it prohibits the rehearsal of data from previous tasks and thus suffers from catastrophic forgetting. Recent approaches to incrementally learning the classifier by freezing the feature extractor after the first task have gained much attention. In this paper, we explore prototypical networks for CIL, which generate new class prototypes using the frozen feature extractor and classify the features based on the Euclidean distance to the prototypes. In an analysis of the feature distributions of classes, we show that classification based on Euclidean metrics is successful for jointly trained features. However, when learning from non-stationary data, we observe that the Euclidean metric is suboptimal and that feature distributions are heterogeneous. To address this challenge, we revisit the anisotropic Mahalanobis distance for CIL. In addition, we empirically show that modeling the feature covariance relations is better than previous attempts at sampling features from normal distributions and training a linear classifier. Unlike existing methods, our approach generalizes to both many- and few-shot CIL settings, as well as to domain-incremental settings. Interestingly, without updating the backbone network, our method obtains state-of-the-art results on several standard continual learning benchmarks. Code is available at https://github.com/dipamgoswami/FeCAM.

Submitted 12 January, 2024; v1 submitted 25 September, 2023; originally announced September 2023.

Comments: Accepted at NeurIPS 2023
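
The contrast the FeCAM abstract draws between Euclidean and Mahalanobis nearest-prototype classification can be illustrated with a toy example. The following is a minimal sketch, not the authors' implementation: the two classes, their means, and their 2x2 covariances are made-up values, and the helper names are hypothetical. It only shows that a query can be closer to one prototype in Euclidean terms while being closer to another once each class's covariance is taken into account.

```python
# Toy comparison of Euclidean vs. Mahalanobis nearest-prototype classification.
# All class statistics below are illustrative assumptions, not values from the paper.

def euclidean_sq(x, mean):
    """Squared Euclidean distance between a 2-D point and a class mean."""
    return (x[0] - mean[0]) ** 2 + (x[1] - mean[1]) ** 2

def mahalanobis_sq(x, mean, cov):
    """Squared Mahalanobis distance (x - mean)^T cov^{-1} (x - mean) for a 2x2 covariance."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    # Closed-form inverse of a 2x2 matrix.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mean[0], x[1] - mean[1])
    return (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))

# Hypothetical class prototypes: class "A" is elongated along the x-axis,
# class "B" is compact and isotropic.
classes = {
    "A": {"mean": (0.0, 0.0), "cov": ((4.0, 0.0), (0.0, 0.25))},
    "B": {"mean": (3.0, 0.0), "cov": ((0.25, 0.0), (0.0, 0.25))},
}

def classify_euclidean(x):
    return min(classes, key=lambda k: euclidean_sq(x, classes[k]["mean"]))

def classify_mahalanobis(x):
    return min(classes, key=lambda k: mahalanobis_sq(x, classes[k]["mean"], classes[k]["cov"]))

query = (2.0, 0.0)
euclid_label = classify_euclidean(query)     # nearest mean ignores class shape
mahal_label = classify_mahalanobis(query)    # distance is scaled by class covariance
```

With these assumed statistics the query sits nearer to the mean of class B, but it lies well within the spread of class A, so the two rules disagree.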

arXiv:2309.10272 [pdf, other]

Mixed-Distil-BERT: Code-mixed Language Modeling for Bangla, English, and Hindi

Authors: Md Nishat Raihan, Dhiman Goswami, Antara Mahmud

Abstract: One of the most popular downstream tasks in the field of Natural Language Processing is text classification. Text classification tasks have become more daunting when the texts are code-mixed. Though they are not exposed to such text during pre-training, different BERT models have demonstrated success in tackling Code-Mixed NLP challenges. Again, in order to enhance their performance, Code-Mixed NLP models have depended on combining synthetic data with real-world data. It is crucial to understand how the BERT models' performance is impacted when they are pretrained using corresponding code-mixed languages. In this paper, we introduce Tri-Distil-BERT, a multilingual model pre-trained on Bangla, English, and Hindi, and Mixed-Distil-BERT, a model fine-tuned on code-mixed data. Both models are evaluated across multiple NLP tasks and demonstrate competitive performance against larger models like mBERT and XLM-R. Our two-tiered pre-training approach offers efficient alternatives for multilingual and code-mixed language understanding, contributing to advancements in the field.

Submitted 14 March, 2024; v1 submitted 18 September, 2023; originally announced September 2023.

arXiv:2307.05587 [pdf, other]

Active Learning for Video Classification with Frame Level Queries

Authors: Debanjan Goswami, Shayok Chakraborty

Abstract: Deep learning algorithms have pushed the boundaries of computer vision research and have shown commendable performance in a variety of applications. However, training a robust deep neural network necessitates a large amount of labeled training data, acquiring which involves significant time and human effort. This problem is even more serious for an application like video classification, where a human annotator has to watch an entire video end-to-end to furnish a label. Active learning algorithms automatically identify the most informative samples from large amounts of unlabeled data; this tremendously reduces the human annotation effort in inducing a machine learning model, as only the few samples that are identified by the algorithm need to be labeled manually. In this paper, we propose a novel active learning framework for video classification, with the goal of further reducing the labeling onus on the human annotators. Our framework identifies a batch of exemplar videos, together with a set of informative frames for each video; the human annotator needs to merely review the frames and provide a label for each video. This involves much less manual work than watching the complete video to come up with a label. We formulate a criterion based on uncertainty and diversity to identify the informative videos and exploit representative sampling techniques to extract a set of exemplar frames from each video. To the best of our knowledge, this is the first research effort to develop an active learning framework for video classification, where the annotators need to inspect only a few frames to produce a label, rather than watching the end-to-end video.

Submitted 10 July, 2023; originally announced July 2023.

arXiv:2304.00198 [pdf, other]

Sequential Learning from Noisy Data: Data-Assimilation Meets Echo-State Network

Authors: Debdipta Goswami

Abstract: This paper explores the problem of training a recurrent neural network from noisy data. While neural network based dynamic predictors perform well with noise-free training data, prediction with noisy inputs during the training phase poses a significant challenge. Here a sequential training algorithm is developed for an echo-state network (ESN) by incorporating noisy observations using an ensemble Kalman filter. The resultant Kalman-trained echo-state network (KalT-ESN) outperforms the ESN traditionally trained with the least-squares algorithm while still being computationally cheap. The proposed method is demonstrated on noisy observations from three systems: two synthetic datasets from chaotic dynamical systems and a set of real-time traffic data.

Submitted 31 March, 2023; originally announced April 2023.

Comments: 7 pages, 9 figures, 1 table. arXiv admin note: text overlap with arXiv:2211.05992

arXiv:2302.02811 [pdf, other]

Unified Software Design Patterns for Simulated Annealing

Authors: Rohit Goswami, Ruhila S., Amrita Goswami, Sonaly Goswami, Debabrata Goswami

Abstract: Any optimization algorithm programming interface can be seen as a black-box function with additional free parameters. In this spirit, simulated annealing (SA) can be implemented in pseudo-code within the dimensions of a single slide with free parameters relating to the annealing schedule. Such an implementation, however, necessarily neglects much of the structure necessary to take advantage of advances in computing resources and algorithmic breakthroughs. Simulated annealing is often introduced in myriad disciplines, from discrete examples like the Traveling Salesman Problem (TSP) to molecular cluster potential energy exploration or even explorations of a protein's configurational space. Theoretical guarantees also demand a stricter structure in terms of statistical quantities, which cannot simply be left to the user. We will introduce several standard paradigms and demonstrate how these can be "lifted" into a unified framework using object-oriented programming in Python. We demonstrate how clean, interoperable, reproducible programming libraries can be used to access and rapidly iterate on variants of Simulated Annealing in a manner which can be extended to serve as a best practices blueprint or design pattern for a data-driven optimization library.

Submitted 23 February, 2023; v1 submitted 6 February, 2023; originally announced February 2023.

Comments: 4 figures, submitted as an invited chapter to InTech

arXiv:2211.06104 [pdf, other]

Bounding Box Priors for Cell Detection with Point Annotations

Authors: Hari Om Aggrawal, Dipam Goswami, Vinti Agarwal

Abstract: The size of an individual cell type, such as a red blood cell, does not vary much among humans. We use this knowledge as a prior for classifying and detecting cells in images with only a few ground truth bounding box annotations, while most of the cells are annotated with points. This setting leads to weakly semi-supervised learning. We propose replacing points with either stochastic (ST) boxes or bounding box predictions during the training process. The proposed "mean-IOU" ST box maximizes the overlap with all the boxes belonging to the sample space with a class-specific approximated prior probability distribution of bounding boxes. Our method trains with both box- and point-labelled images in conjunction, unlike the existing methods, which train first with box- and then point-labelled images. In the most challenging setting, when only 5% of images are box-labelled, quantitative experiments on a urine dataset show that our one-stage method outperforms two-stage methods by 5.56 mAP. Furthermore, we suggest an approach that partially answers "how many box-labelled annotations are necessary?" before training a machine learning model.

Submitted 11 November, 2022; originally announced November 2022.

arXiv:2211.05992 [pdf, other]

Delay Embedded Echo-State Network: A Predictor for Partially Observed Systems

Authors: Debdipta Goswami

Abstract: This paper considers the problem of data-driven prediction of partially observed systems using a recurrent neural network. While neural network based dynamic predictors perform well with full-state training data, prediction with partial observation during the training phase poses a significant challenge. Here a predictor for partial observations is developed using an echo-state network (ESN) and time delay embedding of the partially observed state. The proposed method is theoretically justified with Takens' embedding theorem and strong observability of a nonlinear system. The efficacy of the proposed method is demonstrated on three systems: two synthetic datasets from chaotic dynamical systems and a set of real-time traffic data.

Submitted 5 April, 2023; v1 submitted 10 November, 2022; originally announced November 2022.

Comments: 7 pages, 10 figures

arXiv:2210.07207 [pdf, other]

Attribution-aware Weight Transfer: A Warm-Start Initialization for Class-Incremental Semantic Segmentation

Authors: Dipam Goswami, René Schuster, Joost van de Weijer, Didier Stricker

Abstract: In class-incremental semantic segmentation (CISS), deep learning architectures suffer from the critical problems of catastrophic forgetting and semantic background shift. Although recent works focused on these issues, existing classifier initialization methods do not address the background shift problem and assign the same initialization weights to both background and new foreground class classifiers. We propose to address the background shift with a novel classifier initialization method which employs gradient-based attribution to identify the most relevant weights for new classes from the classifier's weights for the previous background and transfers these weights to the new classifier. This warm-start weight initialization provides a general solution applicable to several CISS methods. Furthermore, it accelerates learning of new classes while mitigating forgetting. Our experiments demonstrate significant improvement in mIoU compared to the state-of-the-art CISS methods on the Pascal-VOC 2012, ADE20K and Cityscapes datasets.

Submitted 13 October, 2022; originally announced October 2022.

Comments: Accepted at WACV 2023

arXiv:2209.13836 [pdf, other]

Mutual Information Assisted Ensemble Recommender System for Identifying Critical Risk Factors in Healthcare Prognosis

Authors: Abhishek Dey, Debayan Goswami, Rahul Roy, Susmita Ghosh, Yu Shrike Zhang, Jonathan H. Chan

Abstract: Purpose: Health recommenders act as important decision support systems, aiding patients and medical professionals in taking actions that lead to patients' well-being. These systems extract the information which may be of particular relevance to the end-user, helping them in making appropriate decisions. The present study proposes a feature recommender, as a part of a disease management system, that identifies and recommends the most important risk factors for an illness. Methods: A novel mutual information and ensemble-based feature ranking approach for identifying critical risk factors in healthcare prognosis is proposed. Results: To establish the effectiveness of the proposed method, experiments have been conducted on four benchmark datasets of diverse diseases (clear cell renal cell carcinoma (ccRCC), chronic kidney disease, Indian liver patient, and cervical cancer risk factors). The performance of the proposed recommender is compared with four state-of-the-art methods using recommender systems' performance metrics like average precision@K, precision@K, recall@K, F1@K, reciprocal rank@K. The method is able to recommend all relevant critical risk factors for ccRCC. It also attains a higher accuracy (96.6% and 98.6% using support vector machine and neural network, respectively) for ccRCC staging with a reduced feature set as compared to existing methods. Moreover, the top two features recommended using the proposed method with ccRCC, viz. size of tumor and metastasis status, are medically validated from the existing TNM system. Results are also found to be superior for the other three datasets. Conclusion: The proposed recommender can identify and recommend risk factors that have the most discriminating power for detecting diseases.

Submitted 1 July, 2024; v1 submitted 28 September, 2022; originally announced September 2022.
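
The ranking metrics named in the abstract above (precision@K, recall@K, F1@K, reciprocal rank@K) have common textbook definitions, sketched below. This is a minimal illustration under assumptions: the paper's exact definitions may differ, the function names are hypothetical, and the feature names in the example are invented rather than taken from any of the paper's datasets.

```python
# Common definitions of top-K ranking metrics for a recommended feature list.
# Function names and example data are illustrative assumptions.

def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / k

def recall_at_k(ranked, relevant, k):
    """Fraction of all relevant items that appear in the top-k."""
    if not relevant:
        return 0.0
    hits = sum(1 for item in ranked[:k] if item in relevant)
    return hits / len(relevant)

def f1_at_k(ranked, relevant, k):
    """Harmonic mean of precision@k and recall@k (0.0 when both are zero)."""
    p = precision_at_k(ranked, relevant, k)
    r = recall_at_k(ranked, relevant, k)
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

def reciprocal_rank_at_k(ranked, relevant, k):
    """1 / rank of the first relevant item within the top-k, else 0.0."""
    for rank, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            return 1.0 / rank
    return 0.0

# Hypothetical ranking of risk factors and a hypothetical ground-truth set.
ranked_features = ["tumor_size", "metastasis", "age", "sex"]
relevant_features = {"tumor_size", "metastasis", "grade"}

p2 = precision_at_k(ranked_features, relevant_features, 2)
r2 = recall_at_k(ranked_features, relevant_features, 2)
f2 = f1_at_k(ranked_features, relevant_features, 2)
rr2 = reciprocal_rank_at_k(ranked_features, relevant_features, 2)
```

In this made-up example both of the top two recommendations are relevant, but one relevant factor is missed, so precision@2 is perfect while recall@2 is not.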

arXiv:2208.03565 [pdf, ps, other]

Analysis of Temporal Robustness in Massive Machine Type Communications

Authors: Debjani Goswami, Merim Dzaferagic, Harun Siljak, Suvra Sekhar Das, Nicola Marchetti

Abstract: The evolution of fifth generation (5G) networks needs to support the latest use cases, which demand robust network connectivity for the collaborative performance of the network agents, like multi-robot systems and vehicle to anything (V2X) communication. Unfortunately, the user device's limited communication range and battery constraint confirm the unfitness of known robustness metrics suggested for fixed networks, when applied to time-switching communication graphs. Furthermore, the calculation of most of the existing robustness metrics involves non-deterministic polynomial-time complexity, and hence they are best-fitted only for small networks. Despite a large volume of works, the complete analysis of a low-complexity temporal robustness metric for a communication network is absent in the literature, and the present work aims to fill this gap. In more detail, our work provides a stochastic analysis of network robustness for a massive machine type communication (mMTC) network. The numerical investigation corroborates the exactness of the proposed analytical framework for the temporal robustness metric. Along with studying the impact on network robustness of various system parameters, such as cluster head (CH) probability, power threshold value, network size, and node failure probability, we justify the observed trend of numerical results probabilistically.

Submitted 6 August, 2022; originally announced August 2022.

arXiv:2111.10374 [pdf, other]

Urine Microscopic Image Dataset

Authors: Dipam Goswami, Hari Om Aggrawal, Rajiv Gupta, Vinti Agarwal

Abstract: Urinalysis is a standard diagnostic test to detect urinary system related problems. The automation of urinalysis will reduce the overall diagnostic time. Recent studies used urine microscopic datasets for designing deep learning based algorithms to classify and detect urine cells. But these datasets are not publicly available for further research. To alleviate the need for urine datasets, we prepare our urine sediment microscopic image (UMID) dataset comprising around 3700 cell annotations and 3 categories of cells, namely RBC, pus and epithelial cells. We discuss the several challenges involved in preparing the dataset and the annotations. We make the dataset publicly available.

Submitted 19 November, 2021; originally announced November 2021.

Comments: 7 pages, 1 image

arXiv:2110.02038 [pdf, other]

Semi-Supervised Deep Learning for Multiplex Networks

Authors: Anasua Mitra, Priyesh Vijayan, Ranbir Sanasam, Diganta Goswami, Srinivasan Parthasarathy, Balaraman Ravindran

Abstract: Multiplex networks are complex graph structures in which a set of entities are connected to each other via multiple types of relations, each relation representing a distinct layer. Such graphs are used to investigate many complex biological, social, and technological systems. In this work, we present a novel semi-supervised approach for structure-aware representation learning on multiplex networks. Our approach relies on maximizing the mutual information between local node-wise patch representations and label correlated structure-aware global graph representations to model the nodes and cluster structures jointly. Specifically, it leverages a novel cluster-aware, node-contextualized global graph summary generation strategy for effective joint-modeling of node and cluster representations across the layers of a multiplex network. Empirically, we demonstrate that the proposed architecture outperforms state-of-the-art methods in a range of tasks: classification, clustering, visualization, and similarity search on seven real-world multiplex networks for various experiment settings.

Submitted 5 October, 2021; originally announced October 2021.

arXiv:2108.09131 [pdf, other]

doi 10.1038/s41598-023-31737-y

Transfer-Recursive-Ensemble Learning for Multi-Day COVID-19 Prediction in India using Recurrent Neural Networks

Authors: Debasrita Chakraborty, Debayan Goswami, Susmita Ghosh, Ashish Ghosh, Jonathan H. Chan

Abstract: The current COVID-19 pandemic has put a huge challenge on the Indian health infrastructure. With more and more people getting affected during the second wave, the hospitals were over-burdened, running out of supplies and oxygen. In this scenario, prediction of the number of COVID-19 cases beforehand might have helped in the better utilization of limited resources and supplies. This manuscript deals with the prediction of new COVID-19 cases, new deaths and total active cases for multiple days in advance. The proposed method uses gated recurrent unit networks as the main predicting model. A study is conducted by building four models that are pre-trained on the data from four different countries (United States of America, Brazil, Spain and Bangladesh) and are fine-tuned or retrained on India's data. Since the four countries chosen have experienced different types of infection curves, the pre-training provides transfer learning to the models, taking diverse situations into account. Each of the four models then gives multiple-days-ahead predictions using a recursive learning method for the Indian test data. The final prediction comes from an ensemble of the predictions of the combination of different models. This method with two countries, Spain and Brazil, is seen to achieve the best performance amongst all the combinations as well as compared to other traditional regression models.

Submitted 26 April, 2023; v1 submitted 20 August, 2021; originally announced August 2021.

Comments: 8 pages, 7 figures

Journal ref: Sci Rep 13, 6795 (2023)

arXiv:1410.5738 [pdf, ps, other]

Investigation of A Collective Decision Making System of Different Neighbourhood-Size Based on Hyper-Geometric Distribution

Authors: Debdipta Goswami, Heiko Hamann

Abstract: The study of collective decision making systems has become a central part of swarm-intelligence-related research in recent years. The most challenging task of modelling a collective decision making system is to develop the macroscopic stochastic equation from its microscopic model. In this report we have investigated the behaviour of a collective decision making system with specified microscopic rules that resemble a chemical reaction, using different group sizes. Then we ventured to derive a generalized analytical model of a collective-decision system using the hyper-geometric distribution. Index Terms: swarm; collective decision making; noise; group size; hyper-geometric distribution

Submitted 21 October, 2014; originally announced October 2014.

Comments: 9 pages, 20 figures

arXiv:1410.3864 [pdf, ps, other]

Multi-Agent Shape Formation and Tracking Inspired from a Social Foraging Dynamics

Authors: Debdipta Goswami, Chiranjib Saha, Kunal Pal, Swagatam Das

Abstract: Principle of Swarm Intelligence has recently found widespread application in formation control and automated tracking by the automated multi-agent system. This article proposes an elegant and effective method inspired by foraging dynamics to produce geometric-patterns by the search agents. Starting from a random initial orientation, it is investigated how the foraging dynamics can be modified to achieve convergence of the agents on the desired pattern with almost uniform density. Guided through the proposed dynamics, the agents can also track a moving point by continuously circulating around the point. An analytical treatment supported with computer simulation results is provided to better understand the convergence behaviour of the system.

Submitted 16 October, 2014; v1 submitted 14 October, 2014; originally announced October 2014.

MSC Class: 70F04

arXiv:1312.4116 [pdf, other]

Quantum Algorithm to Solve a Maze: Converting the Maze Problem into a Search Problem

Authors: Niraj Kumar, Debabrata Goswami

Abstract: We propose a different methodology towards approaching a Maze problem. We convert the problem into a Quantum Search Problem (QSP), and its solutions are sought for using the iterative Grover's Search Algorithm. Though the category of mazes we are looking at are of the NP complete class, we have redirected such a NP complete problem into a QSP. Our solution deals with two dimensional perfect mazes with no closed loops. We encode all possible individual paths from the starting point of the maze into a quantum register. A quantum fitness operator applied on the register encodes each individual with its fitness value. We propose an oracle design which marks all the individuals above a certain fitness value and use the Grover search algorithm to find one of the marked states. Iterating over this method, we approach towards the optimum solution.

Submitted 15 December, 2013; originally announced December 2013.

Comments: 5 pages, 5 figures, Appeared in Asian Quantum Information Science (AQIS'13) Conference, Chennai, India, August 2013. http://www.imsc.res.in/ aqis13/submissions/

arXiv:1305.3354 [pdf, other]

Approximate Congestion Games for Load Balancing in Distributed Environment

Authors: Sandip Chakraborty, Soumyadip Majumder, Diganta Goswami

Abstract: The use of game theoretic models has been quite successful in describing various cooperative and non-cooperative optimization problems in networks and other domains of computer systems. In this paper, we study an application of game theoretic models in the domain of distributed system, where nodes play a game to balance the total processing loads among themselves. We have used congestion gaming model, a model of game theory where many agents compete for allocating resources, and studied the existence of Nash Equilibrium for such types of games. As the classical congestion game is known to be PLS-Complete, we use an approximation, called the ε-Congestion game, which converges to ε-Nash equilibrium within finite number of steps under selected conditions. Our focus is to define the load balancing problem using the model of ε-congestion games, and finally provide a greedy algorithm for load balancing in distributed systems. We have simulated our proposed system to show the effect of ε-congestion game, and the distribution of load at equilibrium state.

Submitted 15 May, 2013; originally announced May 2013.

Comments: A version of this work has been presented at International Workshop on Distributed System (IWDS) 2010, IIT Kanpur, India, as a "work-in-progress" report

Showing 1–38 of 38 results for author: Goswami, D