-
A Double-Difference Doppler Shift-Based Positioning Framework with Ephemeris Error Correction of LEO Satellites
Authors:
Md. Ali Hasan,
M. Humayun Kabir,
Md. Shafiqul Islam,
Sangmin Han,
Wonjae Shin
Abstract:
In signals of opportunity (SOPs)-based positioning utilizing low Earth orbit (LEO) satellites, ephemeris data derived from two-line element files can introduce increasing error over time. To handle the erroneous measurement, an additional base receiver with a known position is often used to compensate for the effect of ephemeris error when positioning the user terminal (UT). However, this approach…
▽ More
In signals of opportunity (SOPs)-based positioning utilizing low Earth orbit (LEO) satellites, ephemeris data derived from two-line element files can introduce increasing error over time. To handle the erroneous measurement, an additional base receiver with a known position is often used to compensate for the effect of ephemeris error when positioning the user terminal (UT). However, this approach is insufficient for the long baseline (the distance between the base receiver and UT) as it fails to adequately correct Doppler shift measurement errors caused by ephemeris inaccuracies, resulting in degraded positioning performance. Moreover, the lack of clock synchronization between the base receiver and UT exacerbates erroneous Doppler shift measurements. To address these challenges, we put forth a robust double-difference Doppler shift-based positioning framework, coined 3DPose, to handle the clock synchronization issue between the base receiver and UT, and positioning degradation due to the long baseline. The proposed 3DPose framework leverages double-difference Doppler shift measurements to eliminate the clock synchronization issue and incorporates a novel ephemeris error correction algorithm to enhance UT positioning accuracy in case of the long baseline. The algorithm specifically characterizes and corrects the Doppler shift measurement errors arising from erroneous ephemeris data, focusing on satellite position errors in the tangential direction. To validate the effectiveness of the proposed framework, we conduct comparative analyses across three different scenarios, contrasting its performance with the existing differential Doppler positioning method. The results demonstrate that the proposed 3DPose framework achieves an average reduction of 90% in 3-dimensional positioning errors compared to the existing differential Doppler approach.
△ Less
Submitted 8 September, 2024;
originally announced September 2024.
-
Bengali Sign Language Recognition through Hand Pose Estimation using Multi-Branch Spatial-Temporal Attention Model
Authors:
Abu Saleh Musa Miah,
Md. Al Mehedi Hasan,
Md Hadiuzzaman,
Muhammad Nazrul Islam,
Jungpil Shin
Abstract:
Hand gesture-based sign language recognition (SLR) is one of the most advanced applications of machine learning, and computer vision uses hand gestures. Although, in the past few years, many researchers have widely explored and studied how to address BSL problems, specific unaddressed issues remain, such as skeleton and transformer-based BSL recognition. In addition, the lack of evaluation of the…
▽ More
Hand gesture-based sign language recognition (SLR) is one of the most advanced applications of machine learning, and computer vision uses hand gestures. Although, in the past few years, many researchers have widely explored and studied how to address BSL problems, specific unaddressed issues remain, such as skeleton and transformer-based BSL recognition. In addition, the lack of evaluation of the BSL model in various concealed environmental conditions can prove the generalized property of the existing model by facing daily life signs. As a consequence, existing BSL recognition systems provide a limited perspective of their generalisation ability as they are tested on datasets containing few BSL alphabets that have a wide disparity in gestures and are easy to differentiate. To overcome these limitations, we propose a spatial-temporal attention-based BSL recognition model considering hand joint skeletons extracted from the sequence of images. The main aim of utilising hand skeleton-based BSL data is to ensure the privacy and low-resolution sequence of images, which need minimum computational cost and low hardware configurations. Our model captures discriminative structural displacements and short-range dependency based on unified joint features projected onto high-dimensional feature space. Specifically, the use of Separable TCN combined with a powerful multi-head spatial-temporal attention architecture generated high-performance accuracy. The extensive experiments with a proposed dataset and two benchmark BSL datasets with a wide range of evaluations, such as intra- and inter-dataset evaluation settings, demonstrated that our proposed models achieve competitive performance with extremely low computational complexity and run faster than existing models.
△ Less
Submitted 26 August, 2024;
originally announced August 2024.
-
Computer-Aided Fall Recognition Using a Three-Stream Spatial-Temporal GCN Model with Adaptive Feature Aggregation
Authors:
Jungpil Shin,
Abu Saleh Musa Miah,
Rei Egawa1,
Koki Hirooka,
Md. Al Mehedi Hasan,
Yoichi Tomioka,
Yong Seok Hwang
Abstract:
The prevention of falls is paramount in modern healthcare, particularly for the elderly, as falls can lead to severe injuries or even fatalities. Additionally, the growing incidence of falls among the elderly, coupled with the urgent need to prevent suicide attempts resulting from medication overdose, underscores the critical importance of accurate and efficient fall detection methods. In this sce…
▽ More
The prevention of falls is paramount in modern healthcare, particularly for the elderly, as falls can lead to severe injuries or even fatalities. Additionally, the growing incidence of falls among the elderly, coupled with the urgent need to prevent suicide attempts resulting from medication overdose, underscores the critical importance of accurate and efficient fall detection methods. In this scenario, a computer-aided fall detection system is inevitable to save elderly people's lives worldwide. Many researchers have been working to develop fall detection systems. However, the existing fall detection systems often struggle with issues such as unsatisfactory performance accuracy, limited robustness, high computational complexity, and sensitivity to environmental factors due to a lack of effective features. In response to these challenges, this paper proposes a novel three-stream spatial-temporal feature-based fall detection system. Our system incorporates joint skeleton-based spatial and temporal Graph Convolutional Network (GCN) features, joint motion-based spatial and temporal GCN features, and residual connections-based features. Each stream employs adaptive graph-based feature aggregation and consecutive separable convolutional neural networks (Sep-TCN), significantly reducing computational complexity and model parameters compared to prior systems. Experimental results across multiple datasets demonstrate the superior effectiveness and efficiency of our proposed system, with accuracies of 99.51\%, 99.15\%, 99.79\% and 99.85 \% achieved on the ImViA, UR-Fall, Fall-UP and FU-Kinect datasets, respectively. The remarkable performance of our system highlights its superiority, efficiency, and generalizability in real-world fall detection scenarios, offering significant advancements in healthcare and societal well-being.
△ Less
Submitted 22 August, 2024;
originally announced August 2024.
-
Cervical Cancer Detection Using Multi-Branch Deep Learning Model
Authors:
Tatsuhiro Baba,
Abu Saleh Musa Miah,
Jungpil Shin,
Md. Al Mehedi Hasan
Abstract:
Cervical cancer is a crucial global health concern for women, and the persistent infection of High-risk HPV mainly triggers this remains a global health challenge, with young women diagnosis rates soaring from 10\% to 40\% over three decades. While Pap smear screening is a prevalent diagnostic method, visual image analysis can be lengthy and often leads to mistakes. Early detection of the disease…
▽ More
Cervical cancer is a crucial global health concern for women, and the persistent infection of High-risk HPV mainly triggers this remains a global health challenge, with young women diagnosis rates soaring from 10\% to 40\% over three decades. While Pap smear screening is a prevalent diagnostic method, visual image analysis can be lengthy and often leads to mistakes. Early detection of the disease can contribute significantly to improving patient outcomes. In recent decades, many researchers have employed machine learning techniques that achieved promise in cervical cancer detection processes based on medical images. In recent years, many researchers have employed various deep-learning techniques to achieve high-performance accuracy in detecting cervical cancer but are still facing various challenges. This research proposes an innovative and novel approach to automate cervical cancer image classification using Multi-Head Self-Attention (MHSA) and convolutional neural networks (CNNs). The proposed method leverages the strengths of both MHSA mechanisms and CNN to effectively capture both local and global features within cervical images in two streams. MHSA facilitates the model's ability to focus on relevant regions of interest, while CNN extracts hierarchical features that contribute to accurate classification. Finally, we combined the two stream features and fed them into the classification module to refine the feature and the classification. To evaluate the performance of the proposed approach, we used the SIPaKMeD dataset, which classifies cervical cells into five categories. Our model achieved a remarkable accuracy of 98.522\%. This performance has high recognition accuracy of medical image classification and holds promise for its applicability in other medical image recognition tasks.
△ Less
Submitted 19 August, 2024;
originally announced August 2024.
-
Do Large Language Models Speak All Languages Equally? A Comparative Study in Low-Resource Settings
Authors:
Md. Arid Hasan,
Prerona Tarannum,
Krishno Dey,
Imran Razzak,
Usman Naseem
Abstract:
Large language models (LLMs) have garnered significant interest in natural language processing (NLP), particularly their remarkable performance in various downstream tasks in resource-rich languages. Recent studies have highlighted the limitations of LLMs in low-resource languages, primarily focusing on binary classification tasks and giving minimal attention to South Asian languages. These limita…
▽ More
Large language models (LLMs) have garnered significant interest in natural language processing (NLP), particularly their remarkable performance in various downstream tasks in resource-rich languages. Recent studies have highlighted the limitations of LLMs in low-resource languages, primarily focusing on binary classification tasks and giving minimal attention to South Asian languages. These limitations are primarily attributed to constraints such as dataset scarcity, computational costs, and research gaps specific to low-resource languages. To address this gap, we present datasets for sentiment and hate speech tasks by translating from English to Bangla, Hindi, and Urdu, facilitating research in low-resource language processing. Further, we comprehensively examine zero-shot learning using multiple LLMs in English and widely spoken South Asian languages. Our findings indicate that GPT-4 consistently outperforms Llama 2 and Gemini, with English consistently demonstrating superior performance across diverse tasks compared to low-resource languages. Furthermore, our analysis reveals that natural language inference (NLI) exhibits the highest performance among the evaluated tasks, with GPT-4 demonstrating superior capabilities.
△ Less
Submitted 5 August, 2024;
originally announced August 2024.
-
Distributionally Robust Optimization as a Scalable Framework to Characterize Extreme Value Distributions
Authors:
Patrick Kuiper,
Ali Hasan,
Wenhao Yang,
Yuting Ng,
Hoda Bidkhori,
Jose Blanchet,
Vahid Tarokh
Abstract:
The goal of this paper is to develop distributionally robust optimization (DRO) estimators, specifically for multidimensional Extreme Value Theory (EVT) statistics. EVT supports using semi-parametric models called max-stable distributions built from spatial Poisson point processes. While powerful, these models are only asymptotically valid for large samples. However, since extreme data is by defin…
▽ More
The goal of this paper is to develop distributionally robust optimization (DRO) estimators, specifically for multidimensional Extreme Value Theory (EVT) statistics. EVT supports using semi-parametric models called max-stable distributions built from spatial Poisson point processes. While powerful, these models are only asymptotically valid for large samples. However, since extreme data is by definition scarce, the potential for model misspecification error is inherent to these applications, thus DRO estimators are natural. In order to mitigate over-conservative estimates while enhancing out-of-sample performance, we study DRO estimators informed by semi-parametric max-stable constraints in the space of point processes. We study both tractable convex formulations for some problems of interest (e.g. CVaR) and more general neural network based estimators. Both approaches are validated using synthetically generated data, recovering prescribed characteristics, and verifying the efficacy of the proposed techniques. Additionally, the proposed method is applied to a real data set of financial returns for comparison to a previous analysis. We established the proposed model as a novel formulation in the multivariate EVT domain, and innovative with respect to performance when compared to relevant alternate proposals.
△ Less
Submitted 31 July, 2024;
originally announced August 2024.
-
Study of Topological Phenomena Through Berry Phase in Classical Nonlinear Elastic Granules
Authors:
Kazi T. Mahmood,
M. Arif Hasan
Abstract:
The geometric of Berry phase concept, traditionally rooted in quantum mechanics, has been found to be increasingly significant in classical mechanics, particularly for understanding the dynamics of linear and nonlinear systems. In this study, we demonstrate the controlled accumulation of the Berry phase in a classical system using a two-level time-dependent elastic bit, analogous to a quantum bit,…
▽ More
The geometric of Berry phase concept, traditionally rooted in quantum mechanics, has been found to be increasingly significant in classical mechanics, particularly for understanding the dynamics of linear and nonlinear systems. In this study, we demonstrate the controlled accumulation of the Berry phase in a classical system using a two-level time-dependent elastic bit, analogous to a quantum bit, within a nonlinear environment generated by a two-granular network. The nonlinearity of the granular beads is modulated through the frequency and amplitude of external harmonic excitation, along with static preloading. By employing the orthonormal basis of the nonlinear responses and mapping the displacement coefficients in Bloch states, we reveal how time influences the manipulation of the elastic bit and its states. Our analytical and experimental investigations uncover the Berry phase's role in exposing the various topological characteristics of the classical granular network. This research establishes a crucial link between classical and quantum realms via the Berry phase of an elastic bit, with significant implications for decoherence-free and robust data transfer and information processing.
△ Less
Submitted 29 July, 2024; v1 submitted 14 July, 2024;
originally announced July 2024.
-
A multi-functional fiber positioning system for Extremely Large Telescopes
Authors:
Manjunath Bestha,
T. Sivarani,
Arun Surya,
Sudharsan Yadav,
Athira Unni,
Parvathy M,
Devika Divakar,
S. Sriram,
Ajin Prakash,
Amirul Hasan
Abstract:
We present a conceptual design for a fiber positioning system for multi-object high-resolution spectroscopy, designed to be compatible with the upcoming large telescopes with a wide field of view. The design incorporates multiple Atmospheric Dispersion Correctors (ADCs) and tip-tilt mirrors that receive non-telecentric input from individual targets and direct it to the ADCs. Here, we introduce a m…
▽ More
We present a conceptual design for a fiber positioning system for multi-object high-resolution spectroscopy, designed to be compatible with the upcoming large telescopes with a wide field of view. The design incorporates multiple Atmospheric Dispersion Correctors (ADCs) and tip-tilt mirrors that receive non-telecentric input from individual targets and direct it to the ADCs. Here, we introduce a mechanical design for the fiber positioner that accommodates the optics and operates in a curved focal plane with a Radius of Curvature (R) of 3m. This mechanical design provides four degrees of freedom to access the focal volume, enhancing targeting efficiency. The proposed design and an efficient target allocation algorithm ensure a targeting efficiency of approximately 80-100% for a primary observation session. We also present a methodology for target assignment, positioning, and quantification based on sequential and Monte Carlo (MC) algorithms. This method has been tested on realistic fields with varying target densities to validate its performance.
△ Less
Submitted 22 July, 2024;
originally announced July 2024.
-
Lessons in Cooperation: A Qualitative Analysis of Driver Sentiments towards Real-Time Advisory Systems from a Driving Simulator User Study
Authors:
Aamir Hasan,
Neeloy Chakraborty,
Haonan Chen,
Cathy Wu,
Katherine Driggs-Campbell
Abstract:
Real-time Advisory (RTA) systems, such as navigational and eco-driving assistants, are becoming increasingly ubiquitous in vehicles due to their benefits for users and society. Until autonomous vehicles mature, such advisory systems will continue to expand their ability to cooperate with drivers, enabling safer and more eco-friendly driving practices while improving user experience. However, the i…
▽ More
Real-time Advisory (RTA) systems, such as navigational and eco-driving assistants, are becoming increasingly ubiquitous in vehicles due to their benefits for users and society. Until autonomous vehicles mature, such advisory systems will continue to expand their ability to cooperate with drivers, enabling safer and more eco-friendly driving practices while improving user experience. However, the interactions between these systems and drivers have not been studied extensively. To this end, we conduct a driving simulator study (N=16) to capture driver reactions to a Cooperative RTA system. Through a case study with a congestion mitigation assistant, we qualitatively analyze the sentiments of drivers towards advisory systems and discuss driver preferences for various aspects of the interaction. We comment on how the advice should be communicated, the effects of the advice on driver trust, and how drivers adapt to the system. We present recommendations to inform the future design of Cooperative RTA systems.
△ Less
Submitted 29 June, 2024;
originally announced July 2024.
-
Base Models for Parabolic Partial Differential Equations
Authors:
Xingzi Xu,
Ali Hasan,
Jie Ding,
Vahid Tarokh
Abstract:
Parabolic partial differential equations (PDEs) appear in many disciplines to model the evolution of various mathematical objects, such as probability flows, value functions in control theory, and derivative prices in finance. It is often necessary to compute the solutions or a function of the solutions to a parametric PDE in multiple scenarios corresponding to different parameters of this PDE. Th…
▽ More
Parabolic partial differential equations (PDEs) appear in many disciplines to model the evolution of various mathematical objects, such as probability flows, value functions in control theory, and derivative prices in finance. It is often necessary to compute the solutions or a function of the solutions to a parametric PDE in multiple scenarios corresponding to different parameters of this PDE. This process often requires resolving the PDEs from scratch, which is time-consuming. To better employ existing simulations for the PDEs, we propose a framework for finding solutions to parabolic PDEs across different scenarios by meta-learning an underlying base distribution. We build upon this base distribution to propose a method for computing solutions to parametric PDEs under different parameter settings. Finally, we illustrate the application of the proposed methods through extensive experiments in generative modeling, stochastic control, and finance. The empirical results suggest that the proposed approach improves generalization to solving PDEs under new parameter regimes.
△ Less
Submitted 16 July, 2024;
originally announced July 2024.
-
Exploring Gender-Specific Speech Patterns in Automatic Suicide Risk Assessment
Authors:
Maurice Gerczuk,
Shahin Amiriparian,
Justina Lutz,
Wolfgang Strube,
Irina Papazova,
Alkomiet Hasan,
Björn W. Schuller
Abstract:
In emergency medicine, timely intervention for patients at risk of suicide is often hindered by delayed access to specialised psychiatric care. To bridge this gap, we introduce a speech-based approach for automatic suicide risk assessment. Our study involves a novel dataset comprising speech recordings of 20 patients who read neutral texts. We extract four speech representations encompassing inter…
▽ More
In emergency medicine, timely intervention for patients at risk of suicide is often hindered by delayed access to specialised psychiatric care. To bridge this gap, we introduce a speech-based approach for automatic suicide risk assessment. Our study involves a novel dataset comprising speech recordings of 20 patients who read neutral texts. We extract four speech representations encompassing interpretable and deep features. Further, we explore the impact of gender-based modelling and phrase-level normalisation. By applying gender-exclusive modelling, features extracted from an emotion fine-tuned wav2vec2.0 model can be utilised to discriminate high- from low- suicide risk with a balanced accuracy of 81%. Finally, our analysis reveals a discrepancy in the relationship of speech characteristics and suicide risk between female and male subjects. For men in our dataset, suicide risk increases together with agitation while voice characteristics of female subjects point the other way.
△ Less
Submitted 26 June, 2024;
originally announced July 2024.
-
NativQA: Multilingual Culturally-Aligned Natural Query for LLMs
Authors:
Md. Arid Hasan,
Maram Hasanain,
Fatema Ahmad,
Sahinur Rahman Laskar,
Sunaya Upadhyay,
Vrunda N Sukhadia,
Mucahid Kutlu,
Shammur Absar Chowdhury,
Firoj Alam
Abstract:
Natural Question Answering (QA) datasets play a crucial role in developing and evaluating the capabilities of large language models (LLMs), ensuring their effective usage in real-world applications. Despite the numerous QA datasets that have been developed, there is a notable lack of region-specific datasets generated by native users in their own languages. This gap hinders the effective benchmark…
▽ More
Natural Question Answering (QA) datasets play a crucial role in developing and evaluating the capabilities of large language models (LLMs), ensuring their effective usage in real-world applications. Despite the numerous QA datasets that have been developed, there is a notable lack of region-specific datasets generated by native users in their own languages. This gap hinders the effective benchmarking of LLMs for regional and cultural specificities. In this study, we propose a scalable framework, NativQA, to seamlessly construct culturally and regionally aligned QA datasets in native languages, for LLM evaluation and tuning. Moreover, to demonstrate the efficacy of the proposed framework, we designed a multilingual natural QA dataset, MultiNativQA, consisting of ~72K QA pairs in seven languages, ranging from high to extremely low resource, based on queries from native speakers covering 18 topics. We benchmark the MultiNativQA dataset with open- and closed-source LLMs. We made both the framework NativQA and MultiNativQA dataset publicly available for the community. (https://nativqa.gitlab.io)
△ Less
Submitted 13 July, 2024;
originally announced July 2024.
-
Berry Phase and Topological Insights in a Qubit-Inspired Classical Two-Level Elastic Bit
Authors:
Kazi T. Mahmood,
M. Arif Hasan
Abstract:
The exploration of the Berry phase in classical mechanics has opened new frontiers in understanding the dynamics of physical systems, analogous to quantum mechanics. Here, we show controlled accumulation of the Berry phase in a two-level elastic bit, which are classical counterparts of qubits, achieved by manipulating coupled granules with external drivers. Employing the Bloch sphere representatio…
▽ More
The exploration of the Berry phase in classical mechanics has opened new frontiers in understanding the dynamics of physical systems, analogous to quantum mechanics. Here, we show controlled accumulation of the Berry phase in a two-level elastic bit, which are classical counterparts of qubits, achieved by manipulating coupled granules with external drivers. Employing the Bloch sphere representation, the paper demonstrates the manipulation of elastic bit states and the realization of quantum-analogue logic gates. A key achievement is the calculation of the Berry phase for various system states, revealing insights into the system's topological nature. Unique to this study is the use of external parameters to explore topological transitions, contrasting with traditional approaches focusing on internal system modifications. By linking the classical and quantum worlds through the Berry phase of an elastic bit, this work extends the potential applications of topological concepts in designing new materials and computational models.
△ Less
Submitted 10 July, 2024;
originally announced July 2024.
-
CANDID DAC: Leveraging Coupled Action Dimensions with Importance Differences in DAC
Authors:
Philipp Bordne,
M. Asif Hasan,
Eddie Bergman,
Noor Awad,
André Biedenkapp
Abstract:
High-dimensional action spaces remain a challenge for dynamic algorithm configuration (DAC). Interdependencies and varying importance between action dimensions are further known key characteristics of DAC problems. We argue that these Coupled Action Dimensions with Importance Differences (CANDID) represent aspects of the DAC problem that are not yet fully explored. To address this gap, we introduc…
▽ More
High-dimensional action spaces remain a challenge for dynamic algorithm configuration (DAC). Interdependencies and varying importance between action dimensions are further known key characteristics of DAC problems. We argue that these Coupled Action Dimensions with Importance Differences (CANDID) represent aspects of the DAC problem that are not yet fully explored. To address this gap, we introduce a new white-box benchmark within the DACBench suite that simulates the properties of CANDID. Further, we propose sequential policies as an effective strategy for managing these properties. Such policies factorize the action space and mitigate exponential growth by learning a policy per action dimension. At the same time, these policies accommodate the interdependence of action dimensions by fostering implicit coordination. We show this in an experimental study of value-based policies on our new benchmark. This study demonstrates that sequential policies significantly outperform independent learning of factorized policies in CANDID action spaces. In addition, they overcome the scalability limitations associated with learning a single policy across all action dimensions. The code used for our experiments is available under https://github.com/PhilippBordne/candidDAC.
△ Less
Submitted 8 July, 2024;
originally announced July 2024.
-
Beyond Check-in Counts: Redefining Popularity for POI Recommendation with Users and Recency
Authors:
Alif Al Hasan,
Md Musfique Anwar
Abstract:
The next POI (point of interest) recommendation aims to predict users' immediate future movements based on their prior records and present circumstances, which will be very beneficial to service providers as well as users. The popularity of the POI over time is one of the primary deciding factors for choosing the next POI to visit. The majority of research in recent times has paid more attention t…
▽ More
The next POI (point of interest) recommendation aims to predict users' immediate future movements based on their prior records and present circumstances, which will be very beneficial to service providers as well as users. The popularity of the POI over time is one of the primary deciding factors for choosing the next POI to visit. The majority of research in recent times has paid more attention to the number of check-ins to define the popularity of a point of interest, disregarding the temporal impact or number of people checking in for a particular POI. In this paper, we propose a recency-oriented definition of popularity that takes into account the temporal effect on POI's popularity, the number of check-ins, as well as the number of people who registered those check-ins. Thus, recent check-ins get prioritized with more weight compared to the older ones. Experimental results demonstrate that performance is better with recency-aware popularity definitions for POIs than with solely check-in count-based popularity definitions.
△ Less
Submitted 7 July, 2024;
originally announced July 2024.
-
ArAIEval Shared Task: Propagandistic Techniques Detection in Unimodal and Multimodal Arabic Content
Authors:
Maram Hasanain,
Md. Arid Hasan,
Fatema Ahmed,
Reem Suwaileh,
Md. Rafiul Biswas,
Wajdi Zaghouani,
Firoj Alam
Abstract:
We present an overview of the second edition of the ArAIEval shared task, organized as part of the ArabicNLP 2024 conference co-located with ACL 2024. In this edition, ArAIEval offers two tasks: (i) detection of propagandistic textual spans with persuasion techniques identification in tweets and news articles, and (ii) distinguishing between propagandistic and non-propagandistic memes. A total of…
▽ More
We present an overview of the second edition of the ArAIEval shared task, organized as part of the ArabicNLP 2024 conference co-located with ACL 2024. In this edition, ArAIEval offers two tasks: (i) detection of propagandistic textual spans with persuasion techniques identification in tweets and news articles, and (ii) distinguishing between propagandistic and non-propagandistic memes. A total of 14 teams participated in the final evaluation phase, with 6 and 9 teams participating in Tasks 1 and 2, respectively. Finally, 11 teams submitted system description papers. Across both tasks, we observed that fine-tuning transformer models such as AraBERT was at the core of the majority of the participating systems. We provide a description of the task setup, including a description of the dataset construction and the evaluation setup. We further provide a brief overview of the participating systems. All datasets and evaluation scripts are released to the research community (https://araieval.gitlab.io/). We hope this will enable further research on these important tasks in Arabic.
△ Less
Submitted 5 July, 2024;
originally announced July 2024.
-
Cooperative Advisory Residual Policies for Congestion Mitigation
Authors:
Aamir Hasan,
Neeloy Chakraborty,
Haonan Chen,
Jung-Hoon Cho,
Cathy Wu,
Katherine Driggs-Campbell
Abstract:
Fleets of autonomous vehicles can mitigate traffic congestion through simple actions, thus improving many socioeconomic factors such as commute time and gas costs. However, these approaches are limited in practice as they assume precise control over autonomous vehicle fleets, incur extensive installation costs for a centralized sensor ecosystem, and also fail to account for uncertainty in driver b…
▽ More
Fleets of autonomous vehicles can mitigate traffic congestion through simple actions, thus improving many socioeconomic factors such as commute time and gas costs. However, these approaches are limited in practice as they assume precise control over autonomous vehicle fleets, incur extensive installation costs for a centralized sensor ecosystem, and also fail to account for uncertainty in driver behavior. To this end, we develop a class of learned residual policies that can be used in cooperative advisory systems and only require the use of a single vehicle with a human driver. Our policies advise drivers to behave in ways that mitigate traffic congestion while accounting for diverse driver behaviors, particularly drivers' reactions to instructions, to provide an improved user experience. To realize such policies, we introduce an improved reward function that explicitly addresses congestion mitigation and driver attitudes to advice. We show that our residual policies can be personalized by conditioning them on an inferred driver trait that is learned in an unsupervised manner with a variational autoencoder. Our policies are trained in simulation with our novel instruction adherence driver model, and evaluated in simulation and through a user study (N=16) to capture the sentiments of human drivers. Our results show that our approaches successfully mitigate congestion while adapting to different driver behaviors, with up to 20% and 40% improvement as measured by a combination metric of speed and deviations in speed across time over baselines in our simulation tests and user study, respectively. Our user study further shows that our policies are human-compatible and personalize to drivers.
△ Less
Submitted 29 June, 2024;
originally announced July 2024.
-
Root Cause Analysis of Anomalies in 5G RAN Using Graph Neural Network and Transformer
Authors:
Antor Hasan,
Conrado Boeira,
Khaleda Papry,
Yue Ju,
Zhongwen Zhu,
Israat Haque
Abstract:
The emergence of 5G technology marks a significant milestone in developing telecommunication networks, enabling exciting new applications such as augmented reality and self-driving vehicles. However, these improvements bring an increased management complexity and a special concern in dealing with failures, as the applications 5G intends to support heavily rely on high network performance and low l…
▽ More
The emergence of 5G technology marks a significant milestone in developing telecommunication networks, enabling exciting new applications such as augmented reality and self-driving vehicles. However, these improvements bring an increased management complexity and a special concern in dealing with failures, as the applications 5G intends to support heavily rely on high network performance and low latency. Thus, automatic self-healing solutions have become effective in dealing with this requirement, allowing a learning-based system to automatically detect anomalies and perform Root Cause Analysis (RCA). However, there are inherent challenges to the implementation of such intelligent systems. First, there is a lack of suitable data for anomaly detection and RCA, as labelled data for failure scenarios is uncommon. Secondly, current intelligent solutions are tailored to LTE networks and do not fully capture the spatio-temporal characteristics present in the data. Considering this, we utilize a calibrated simulator, Simu5G, and generate open-source data for normal and failure scenarios. Using this data, we propose Simba, a state-of-the-art approach for anomaly detection and root cause analysis in 5G Radio Access Networks (RANs). We leverage Graph Neural Networks to capture spatial relationships while a Transformer model is used to learn the temporal dependencies of the data. We implement a prototype of Simba and evaluate it over multiple failures. The outcomes are compared against existing solutions to confirm the superiority of Simba.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Harnessing Knowledge Retrieval with Large Language Models for Clinical Report Error Correction
Authors:
Jinge Wu,
Zhaolong Wu,
Abul Hasan,
Yunsoo Kim,
Jason P. Y. Cheung,
Teng Zhang,
Honghan Wu
Abstract:
This study proposes an approach for error correction in clinical radiology reports, leveraging large language models (LLMs) and retrieval-augmented generation (RAG) techniques. The proposed framework employs internal and external retrieval mechanisms to extract relevant medical entities and relations from the report and external knowledge sources. A three-stage inference process is introduced, dec…
▽ More
This study proposes an approach for error correction in clinical radiology reports, leveraging large language models (LLMs) and retrieval-augmented generation (RAG) techniques. The proposed framework employs internal and external retrieval mechanisms to extract relevant medical entities and relations from the report and external knowledge sources. A three-stage inference process is introduced, decomposing the task into error detection, localization, and correction subtasks, which enhances the explainability and performance of the system. The effectiveness of the approach is evaluated using a benchmark dataset created by corrupting real-world radiology reports with realistic errors, guided by domain experts. Experimental results demonstrate the benefits of the proposed methods, with the combination of internal and external retrieval significantly improving the accuracy of error detection, localization, and correction across various state-of-the-art LLMs. The findings contribute to the development of more robust and reliable error correction systems for clinical documentation.
△ Less
Submitted 21 June, 2024;
originally announced June 2024.
-
Infusing clinical knowledge into tokenisers for language models
Authors:
Abul Hasan,
Jinge Wu,
Quang Ngoc Nguyen,
Salomé Andres,
Imane Guellil,
Huayu Zhang,
Arlene Casey,
Beatrice Alex,
Bruce Guthrie,
Honghan Wu
Abstract:
This study introduces a novel knowledge enhanced tokenisation mechanism, K-Tokeniser, for clinical text processing. Technically, at initialisation stage, K-Tokeniser populates global representations of tokens based on semantic types of domain concepts (such as drugs or diseases) from either a domain ontology like Unified Medical Language System or the training data of the task related corpus. At t…
▽ More
This study introduces a novel knowledge enhanced tokenisation mechanism, K-Tokeniser, for clinical text processing. Technically, at initialisation stage, K-Tokeniser populates global representations of tokens based on semantic types of domain concepts (such as drugs or diseases) from either a domain ontology like Unified Medical Language System or the training data of the task related corpus. At training or inference stage, sentence level localised context will be utilised for choosing the optimal global token representation to realise the semantic-based tokenisation. To avoid pretraining using the new tokeniser, an embedding initialisation approach is proposed to generate representations for new tokens. Using three transformer-based language models, a comprehensive set of experiments are conducted on four real-world datasets for evaluating K-Tokeniser in a wide range of clinical text analytics tasks including clinical concept and relation extraction, automated clinical coding, clinical phenotype identification, and clinical research article classification. Overall, our models demonstrate consistent improvements over their counterparts in all tasks. In particular, substantial improvements are observed in the automated clinical coding task with 13\% increase on Micro $F_1$ score. Furthermore, K-Tokeniser also shows significant capacities in facilitating quicker converge of language models. Specifically, using K-Tokeniser, the language models would only require 50\% of the training data to achieve the best performance of the baseline tokeniser using all training data in the concept extraction task and less than 20\% of the data for the automated coding task. It is worth mentioning that all these improvements require no pre-training process, making the approach generalisable.
△ Less
Submitted 20 June, 2024;
originally announced June 2024.
-
Chain-of-Though (CoT) prompting strategies for medical error detection and correction
Authors:
Zhaolong Wu,
Abul Hasan,
Jinge Wu,
Yunsoo Kim,
Jason P. Y. Cheung,
Teng Zhang,
Honghan Wu
Abstract:
This paper describes our submission to the MEDIQA-CORR 2024 shared task for automatically detecting and correcting medical errors in clinical notes. We report results for three methods of few-shot In-Context Learning (ICL) augmented with Chain-of-Thought (CoT) and reason prompts using a large language model (LLM). In the first method, we manually analyse a subset of train and validation dataset to…
▽ More
This paper describes our submission to the MEDIQA-CORR 2024 shared task for automatically detecting and correcting medical errors in clinical notes. We report results for three methods of few-shot In-Context Learning (ICL) augmented with Chain-of-Thought (CoT) and reason prompts using a large language model (LLM). In the first method, we manually analyse a subset of train and validation dataset to infer three CoT prompts by examining error types in the clinical notes. In the second method, we utilise the training dataset to prompt the LLM to deduce reasons about their correctness or incorrectness. The constructed CoTs and reasons are then augmented with ICL examples to solve the tasks of error detection, span identification, and error correction. Finally, we combine the two methods using a rule-based ensemble method. Across the three sub-tasks, our ensemble method achieves a ranking of 3rd for both sub-task 1 and 2, while securing 7th place in sub-task 3 among all submissions.
△ Less
Submitted 13 June, 2024;
originally announced June 2024.
-
Intrinsic compressibility effects in near-wall turbulence
Authors:
Asif Manzoor Hasan,
Pedro Costa,
Johan Larsson,
Sergio Pirozzoli,
Rene Pecnik
Abstract:
The impact of intrinsic compressibility effects -- changes in fluid volume due to pressure variations -- on high-speed wall-bounded turbulence has often been overlooked or incorrectly attributed to mean property variations. To unambiguously quantify these intrinsic compressibility effects, we perform direct numerical simulations of compressible turbulent channel flows with nearly uniform mean prop…
▽ More
The impact of intrinsic compressibility effects -- changes in fluid volume due to pressure variations -- on high-speed wall-bounded turbulence has often been overlooked or incorrectly attributed to mean property variations. To unambiguously quantify these intrinsic compressibility effects, we perform direct numerical simulations of compressible turbulent channel flows with nearly uniform mean properties. Our simulations reveal that intrinsic compressibility effects yield a significant upward shift in the logarithmic mean velocity profile that can be attributed to the reduction in the turbulent shear stress. This reduction stems from the weakening of the near-wall quasi-streamwise vortices. We in turn attribute this weakening to the spontaneous opposition of sweeps and ejections from the near-wall expansions and contractions of the fluid, and provide a theoretical explanation for this mechanism. Our results also demonstrate that intrinsic compressibility effects are responsible for the increase in the inner-scaled streamwise turbulence intensity in compressible flows compared to incompressible flows, previously regarded to be an effect of mean property variations.
△ Less
Submitted 11 June, 2024;
originally announced June 2024.
-
ArMeme: Propagandistic Content in Arabic Memes
Authors:
Firoj Alam,
Abul Hasnat,
Fatema Ahmed,
Md Arid Hasan,
Maram Hasanain
Abstract:
With the rise of digital communication, memes have become a significant medium for cultural and political expression that is often used to mislead audiences. Identification of such misleading and persuasive multimodal content has become more important among various stakeholders, including social media platforms, policymakers, and the broader society as they often cause harm to individuals, organiz…
▽ More
With the rise of digital communication, memes have become a significant medium for cultural and political expression that is often used to mislead audiences. Identification of such misleading and persuasive multimodal content has become more important among various stakeholders, including social media platforms, policymakers, and the broader society as they often cause harm to individuals, organizations, and/or society. While there has been effort to develop AI-based automatic systems for resource-rich languages (e.g., English), it is relatively little to none for medium to low resource languages. In this study, we focused on developing an Arabic memes dataset with manual annotations of propagandistic content. We annotated ~6K Arabic memes collected from various social media platforms, which is a first resource for Arabic multimodal research. We provide a comprehensive analysis aiming to develop computational tools for their detection. We will make them publicly available for the community.
△ Less
Submitted 6 June, 2024;
originally announced June 2024.
-
RadBARTsum: Domain Specific Adaption of Denoising Sequence-to-Sequence Models for Abstractive Radiology Report Summarization
Authors:
Jinge Wu,
Abul Hasan,
Honghan Wu
Abstract:
Radiology report summarization is a crucial task that can help doctors quickly identify clinically significant findings without the need to review detailed sections of reports. This study proposes RadBARTsum, a domain-specific and ontology facilitated adaptation of the BART model for abstractive radiology report summarization. The approach involves two main steps: 1) re-training the BART model on…
▽ More
Radiology report summarization is a crucial task that can help doctors quickly identify clinically significant findings without the need to review detailed sections of reports. This study proposes RadBARTsum, a domain-specific and ontology facilitated adaptation of the BART model for abstractive radiology report summarization. The approach involves two main steps: 1) re-training the BART model on a large corpus of radiology reports using a novel entity masking strategy to improving biomedical domain knowledge learning, and 2) fine-tuning the model for the summarization task using the Findings and Background sections to predict the Impression section. Experiments are conducted using different masking strategies. Results show that the re-training process with domain knowledge facilitated masking improves performances consistently across various settings. This work contributes a domain-specific generative language model for radiology report summarization and a method for utilising medical knowledge to realise entity masking language model. The proposed approach demonstrates a promising direction of enhancing the efficiency of language models by deepening its understanding of clinical knowledge in radiology reports.
△ Less
Submitted 5 June, 2024;
originally announced June 2024.
-
The $SL_2(\mathbb{R})$ duality and the non-invertible $U(1)$ symmetry of Maxwell theory
Authors:
Azeem Hasan,
Shani Meynet,
Daniele Migliorati
Abstract:
Recent proposals for the Symmetry Topological Field Theory (SymTFT) of Maxwell theory admit a 0-form symmetry compatible with the classical $SL_2(\mathbb{R})$ duality of electromagnetism. We describe how to realize these automorphisms of the SymTFT in terms of its operators and we detail their effects on the dynamical theory and its global variants. In the process, we show that the classical…
▽ More
Recent proposals for the Symmetry Topological Field Theory (SymTFT) of Maxwell theory admit a 0-form symmetry compatible with the classical $SL_2(\mathbb{R})$ duality of electromagnetism. We describe how to realize these automorphisms of the SymTFT in terms of its operators and we detail their effects on the dynamical theory and its global variants. In the process, we show that the classical $U(1)$ symmetry, corresponding to the stabilizer of $SL_2(\mathbb{R})$, can be restored as a non-invertible one, by means of an infinite series of discrete gauging. This provides an example of the reemergence of a classical symmetry in the quantum regime, which was not broken by anomalies, but rather by the quantization of electromagnetic fluxes. However, this procedure comes at the price of introducing "continuous" condensates that trivialize all line operators.
△ Less
Submitted 11 June, 2024; v1 submitted 29 May, 2024;
originally announced May 2024.
-
WeatherFormer: A Pretrained Encoder Model for Learning Robust Weather Representations from Small Datasets
Authors:
Adib Hasan,
Mardavij Roozbehani,
Munther Dahleh
Abstract:
This paper introduces WeatherFormer, a transformer encoder-based model designed to learn robust weather features from minimal observations. It addresses the challenge of modeling complex weather dynamics from small datasets, a bottleneck for many prediction tasks in agriculture, epidemiology, and climate science. WeatherFormer was pretrained on a large pretraining dataset comprised of 39 years of…
▽ More
This paper introduces WeatherFormer, a transformer encoder-based model designed to learn robust weather features from minimal observations. It addresses the challenge of modeling complex weather dynamics from small datasets, a bottleneck for many prediction tasks in agriculture, epidemiology, and climate science. WeatherFormer was pretrained on a large pretraining dataset comprised of 39 years of satellite measurements across the Americas. With a novel pretraining task and fine-tuning, WeatherFormer achieves state-of-the-art performance in county-level soybean yield prediction and influenza forecasting. Technical innovations include a unique spatiotemporal encoding that captures geographical, annual, and seasonal variations, adapting the transformer architecture to continuous weather data, and a pretraining strategy to learn representations that are robust to missing weather features. This paper for the first time demonstrates the effectiveness of pretraining large transformer encoder models for weather-dependent applications across multiple domains.
△ Less
Submitted 22 May, 2024;
originally announced May 2024.
-
Wind Power Prediction across Different Locations using Deep Domain Adaptive Learning
Authors:
Md Saiful Islam Sajol,
Md Shazid Islam,
A S M Jahid Hasan,
Md Saydur Rahman,
Jubair Yusuf
Abstract:
Accurate prediction of wind power is essential for the grid integration of this intermittent renewable source and aiding grid planners in forecasting available wind capacity. Spatial differences lead to discrepancies in climatological data distributions between two geographically dispersed regions, consequently making the prediction task more difficult. Thus, a prediction model that learns from th…
▽ More
Accurate prediction of wind power is essential for the grid integration of this intermittent renewable source and aiding grid planners in forecasting available wind capacity. Spatial differences lead to discrepancies in climatological data distributions between two geographically dispersed regions, consequently making the prediction task more difficult. Thus, a prediction model that learns from the data of a particular climatic region can suffer from being less robust. A deep neural network (DNN) based domain adaptive approach is proposed to counter this drawback. Effective weather features from a large set of weather parameters are selected using a random forest approach. A pre-trained model from the source domain is utilized to perform the prediction task, assuming no source data is available during target domain prediction. The weights of only the last few layers of the DNN model are updated throughout the task, keeping the rest of the network unchanged, making the model faster compared to the traditional approaches. The proposed approach demonstrates higher accuracy ranging from 6.14% to even 28.44% compared to the traditional non-adaptive method.
△ Less
Submitted 18 May, 2024;
originally announced May 2024.
-
Generative Artificial Intelligence: A Systematic Review and Applications
Authors:
Sandeep Singh Sengar,
Affan Bin Hasan,
Sanjay Kumar,
Fiona Carroll
Abstract:
In recent years, the study of artificial intelligence (AI) has undergone a paradigm shift. This has been propelled by the groundbreaking capabilities of generative models both in supervised and unsupervised learning scenarios. Generative AI has shown state-of-the-art performance in solving perplexing real-world conundrums in fields such as image translation, medical diagnostics, textual imagery fu…
▽ More
In recent years, the study of artificial intelligence (AI) has undergone a paradigm shift. This has been propelled by the groundbreaking capabilities of generative models both in supervised and unsupervised learning scenarios. Generative AI has shown state-of-the-art performance in solving perplexing real-world conundrums in fields such as image translation, medical diagnostics, textual imagery fusion, natural language processing, and beyond. This paper documents the systematic review and analysis of recent advancements and techniques in Generative AI with a detailed discussion of their applications including application-specific models. Indeed, the major impact that generative AI has made to date, has been in language generation with the development of large language models, in the field of image translation and several other interdisciplinary applications of generative AI. Moreover, the primary contribution of this paper lies in its coherent synthesis of the latest advancements in these areas, seamlessly weaving together contemporary breakthroughs in the field. Particularly, how it shares an exploration of the future trajectory for generative AI. In conclusion, the paper ends with a discussion of Responsible AI principles, and the necessary ethical considerations for the sustainability and growth of these generative models.
△ Less
Submitted 17 May, 2024;
originally announced May 2024.
-
Improving Pediatric Pneumonia Diagnosis with Adult Chest X-ray Images Utilizing Contrastive Learning and Embedding Similarity
Authors:
Mohammad Zunaed,
Anwarul Hasan,
Taufiq Hasan
Abstract:
Despite the advancement of deep learning-based computer-aided diagnosis (CAD) methods for pneumonia from adult chest x-ray (CXR) images, the performance of CAD methods applied to pediatric images remains suboptimal, mainly due to the lack of large-scale annotated pediatric imaging datasets. Establishing a proper framework to leverage existing adult large-scale CXR datasets can thus enhance pediatr…
▽ More
Despite the advancement of deep learning-based computer-aided diagnosis (CAD) methods for pneumonia from adult chest x-ray (CXR) images, the performance of CAD methods applied to pediatric images remains suboptimal, mainly due to the lack of large-scale annotated pediatric imaging datasets. Establishing a proper framework to leverage existing adult large-scale CXR datasets can thus enhance pediatric pneumonia detection performance. In this paper, we propose a three-branch parallel path learning-based framework that utilizes both adult and pediatric datasets to improve the performance of deep learning models on pediatric test datasets. The paths are trained with pediatric only, adult only, and both types of CXRs, respectively. Our proposed framework utilizes the multi-positive contrastive loss to cluster the classwise embeddings and the embedding similarity loss among these three parallel paths to make the classwise embeddings as close as possible to reduce the effect of domain shift. Experimental evaluations on open-access adult and pediatric CXR datasets show that the proposed method achieves a superior AUROC score of 0.8464 compared to 0.8348 obtained using the conventional approach of join training on both datasets. The proposed approach thus paves the way for generalized CAD models that are effective for both adult and pediatric age groups.
△ Less
Submitted 19 April, 2024;
originally announced April 2024.
-
Enhancing Suicide Risk Assessment: A Speech-Based Automated Approach in Emergency Medicine
Authors:
Shahin Amiriparian,
Maurice Gerczuk,
Justina Lutz,
Wolfgang Strube,
Irina Papazova,
Alkomiet Hasan,
Alexander Kathan,
Björn W. Schuller
Abstract:
The delayed access to specialized psychiatric assessments and care for patients at risk of suicidal tendencies in emergency departments creates a notable gap in timely intervention, hindering the provision of adequate mental health support during critical situations. To address this, we present a non-invasive, speech-based approach for automatic suicide risk assessment. For our study, we have coll…
▽ More
The delayed access to specialized psychiatric assessments and care for patients at risk of suicidal tendencies in emergency departments creates a notable gap in timely intervention, hindering the provision of adequate mental health support during critical situations. To address this, we present a non-invasive, speech-based approach for automatic suicide risk assessment. For our study, we have collected a novel dataset of speech recordings from $20$ patients from which we extract three sets of features, including wav2vec, interpretable speech and acoustic features, and deep learning-based spectral representations. We proceed by conducting a binary classification to assess suicide risk in a leave-one-subject-out fashion. Our most effective speech model achieves a balanced accuracy of $66.2\,\%$. Moreover, we show that integrating our speech model with a series of patients' metadata, such as the history of suicide attempts or access to firearms, improves the overall result. The metadata integration yields a balanced accuracy of $94.4\,\%$, marking an absolute improvement of $28.2\,\%$, demonstrating the efficacy of our proposed approaches for automatic suicide risk assessment in emergency medicine.
△ Less
Submitted 18 April, 2024;
originally announced April 2024.
-
Binder: Hierarchical Concept Representation through Order Embedding of Binary Vectors
Authors:
Croix Gyurek,
Niloy Talukder,
Mohammad Al Hasan
Abstract:
For natural language understanding and generation, embedding concepts using an order-based representation is an essential task. Unlike traditional point vector based representation, an order-based representation imposes geometric constraints on the representation vectors for explicitly capturing various semantic relationships that may exist between a pair of concepts. In existing literature, sever…
▽ More
For natural language understanding and generation, embedding concepts using an order-based representation is an essential task. Unlike traditional point vector based representation, an order-based representation imposes geometric constraints on the representation vectors for explicitly capturing various semantic relationships that may exist between a pair of concepts. In existing literature, several approaches on order-based embedding have been proposed, mostly focusing on capturing hierarchical relationships; examples include vectors in Euclidean space, complex, Hyperbolic, order, and Box Embedding. Box embedding creates region-based rich representation of concepts, but along the process it sacrifices simplicity, requiring a custom-made optimization scheme for learning the representation. Hyperbolic embedding improves embedding quality by exploiting the ever-expanding property of Hyperbolic space, but it also suffers from the same fate as box embedding as gradient descent like optimization is not simple in the Hyperbolic space. In this work, we propose Binder, a novel approach for order-based representation. Binder uses binary vectors for embedding, so the embedding vectors are compact with an order of magnitude smaller footprint than other methods. Binder uses a simple and efficient optimization scheme for learning representation vectors with a linear time complexity. Our comprehensive experimental results show that Binder is very accurate, yielding competitive results on the representation task. But Binder stands out from its competitors on the transitive closure link prediction task as it can learn concept embeddings just from the direct edges, whereas all existing order-based approaches rely on the indirect edges.
△ Less
Submitted 16 April, 2024;
originally announced April 2024.
-
A Calibrated and Automated Simulator for Innovations in 5G
Authors:
Conrado Boeira,
Antor Hasan,
Khaleda Papry,
Yue Ju,
Zhongwen Zhu,
Israat Haque
Abstract:
The rise of 5G deployments has created the environment for many emerging technologies to flourish. Self-driving vehicles, Augmented and Virtual Reality, and remote operations are examples of applications that leverage 5G networks' support for extremely low latency, high bandwidth, and increased throughput. However, the complex architecture of 5G hinders innovation due to the lack of accessibility…
▽ More
The rise of 5G deployments has created the environment for many emerging technologies to flourish. Self-driving vehicles, Augmented and Virtual Reality, and remote operations are examples of applications that leverage 5G networks' support for extremely low latency, high bandwidth, and increased throughput. However, the complex architecture of 5G hinders innovation due to the lack of accessibility to testbeds or realistic simulators with adequate 5G functionalities. Also, configuring and managing simulators are complex and time consuming. Finally, the lack of adequate representative data hinders the data-driven designs in 5G campaigns. Thus, we calibrated a system-level open-source simulator, Simu5G, following 3GPP guidelines to enable faster innovation in the 5G domain. Furthermore, we developed an API for automatic simulator configuration without knowing the underlying architectural details. Finally, we demonstrate the usage of the calibrated and automated simulator by developing an ML-based anomaly detection in a 5G Radio Access Network (RAN).
△ Less
Submitted 16 April, 2024;
originally announced April 2024.
-
Neural McKean-Vlasov Processes: Distributional Dependence in Diffusion Processes
Authors:
Haoming Yang,
Ali Hasan,
Yuting Ng,
Vahid Tarokh
Abstract:
McKean-Vlasov stochastic differential equations (MV-SDEs) provide a mathematical description of the behavior of an infinite number of interacting particles by imposing a dependence on the particle density. As such, we study the influence of explicitly including distributional information in the parameterization of the SDE. We propose a series of semi-parametric methods for representing MV-SDEs, an…
▽ More
McKean-Vlasov stochastic differential equations (MV-SDEs) provide a mathematical description of the behavior of an infinite number of interacting particles by imposing a dependence on the particle density. As such, we study the influence of explicitly including distributional information in the parameterization of the SDE. We propose a series of semi-parametric methods for representing MV-SDEs, and corresponding estimators for inferring parameters from data based on the properties of the MV-SDE. We analyze the characteristics of the different architectures and estimators, and consider their applicability in relevant machine learning problems. We empirically compare the performance of the different architectures and estimators on real and synthetic datasets for time series and probabilistic modeling. The results suggest that explicitly including distributional dependence in the parameterization of the SDE is effective in modeling temporal data with interaction under an exchangeability assumption while maintaining strong performance for standard Itô-SDEs due to the richer class of probability flows associated with MV-SDEs.
△ Less
Submitted 14 April, 2024;
originally announced April 2024.
-
Classification of Short Segment Pediatric Heart Sounds Based on a Transformer-Based Convolutional Neural Network
Authors:
Md Hassanuzzaman,
Nurul Akhtar Hasan,
Mohammad Abdullah Al Mamun,
Khawza I Ahmed,
Ahsan H Khandoker,
Raqibul Mostafa
Abstract:
Congenital anomalies arising as a result of a defect in the structure of the heart and great vessels are known as congenital heart diseases or CHDs. A PCG can provide essential details about the mechanical conduction system of the heart and point out specific patterns linked to different kinds of CHD. This study aims to investigate the minimum signal duration required for the automatic classificat…
▽ More
Congenital anomalies arising as a result of a defect in the structure of the heart and great vessels are known as congenital heart diseases or CHDs. A PCG can provide essential details about the mechanical conduction system of the heart and point out specific patterns linked to different kinds of CHD. This study aims to investigate the minimum signal duration required for the automatic classification of heart sounds. This study also investigated the optimum signal quality assessment indicator (Root Mean Square of Successive Differences) RMSSD and (Zero Crossings Rate) ZCR value. Mel-frequency cepstral coefficients (MFCCs) based feature is used as an input to build a Transformer-Based residual one-dimensional convolutional neural network, which is then used for classifying the heart sound. The study showed that 0.4 is the ideal threshold for getting suitable signals for the RMSSD and ZCR indicators. Moreover, a minimum signal length of 5s is required for effective heart sound classification. It also shows that a shorter signal (3 s heart sound) does not have enough information to categorize heart sounds accurately, and the longer signal (15 s heart sound) may contain more noise. The best accuracy, 93.69%, is obtained for the 5s signal to distinguish the heart sound.
△ Less
Submitted 30 March, 2024;
originally announced April 2024.
-
Does the Redshift Distribution of Swift Long GRBs Trace the Star-Formation Rate?
Authors:
Ali M. Hasan,
Walid J. Azzam
Abstract:
Gamma-ray bursts (GRBs) are extremely powerful explosions that have been traditionally classified into two categories: long bursts (LGRBs) with an observed duration T90 > 2 s, and short bursts (SGRBs) with an observed duration T90 < 2 s, where T90 is the time interval during which 90% of the fluence is detected. LGRBs are believed to emanate from the core-collapse of massive stars, while SGRBs are…
▽ More
Gamma-ray bursts (GRBs) are extremely powerful explosions that have been traditionally classified into two categories: long bursts (LGRBs) with an observed duration T90 > 2 s, and short bursts (SGRBs) with an observed duration T90 < 2 s, where T90 is the time interval during which 90% of the fluence is detected. LGRBs are believed to emanate from the core-collapse of massive stars, while SGRBs are believed to result from the merging of two compact objects, like two neutron stars. Because LGRBs are produced by the violent death of massive stars, we expect that their redshift distribution should trace the star-formation rate (SFR). The purpose of our study is to investigate the extent to which the redshift distribution of LGRBs follows and reflects the SFR. We use a sample of 370 LGRBs taken from the Swift catalog, and we investigate different models for the LGRB redshift distribution. We also carry out Monte Carlo simulations to check the consistency of our results. Our results indicate that the SFR can describe the LGRB redshift distribution well for high redshift bursts, but it needs an evolution term to fit the distribution well at low redshift.
△ Less
Submitted 22 March, 2024;
originally announced March 2024.
-
An Empirical Study on Developers Shared Conversations with ChatGPT in GitHub Pull Requests and Issues
Authors:
Huizi Hao,
Kazi Amit Hasan,
Hong Qin,
Marcos Macedo,
Yuan Tian,
Steven H. H. Ding,
Ahmed E. Hassan
Abstract:
ChatGPT has significantly impacted software development practices, providing substantial assistance to developers in a variety of tasks, including coding, testing, and debugging. Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored. In this paper, we analyze a dataset of 210 and 370 developers shared conversations with ChatGPT in…
▽ More
ChatGPT has significantly impacted software development practices, providing substantial assistance to developers in a variety of tasks, including coding, testing, and debugging. Despite its widespread adoption, the impact of ChatGPT as an assistant in collaborative coding remains largely unexplored. In this paper, we analyze a dataset of 210 and 370 developers shared conversations with ChatGPT in GitHub pull requests (PRs) and issues. We manually examined the content of the conversations and characterized the dynamics of the sharing behavior, i.e., understanding the rationale behind the sharing, identifying the locations where the conversations were shared, and determining the roles of the developers who shared them. Our main observations are: (1) Developers seek ChatGPT assistance across 16 types of software engineering inquiries. In both conversations shared in PRs and issues, the most frequently encountered inquiry categories include code generation, conceptual questions, how-to guides, issue resolution, and code review. (2) Developers frequently engage with ChatGPT via multi-turn conversations where each prompt can fulfill various roles, such as unveiling initial or new tasks, iterative follow-up, and prompt refinement. Multi-turn conversations account for 33.2% of the conversations shared in PRs and 36.9% in issues. (3) In collaborative coding, developers leverage shared conversations with ChatGPT to facilitate their role-specific contributions, whether as authors of PRs or issues, code reviewers, or collaborators on issues. Our work serves as the first step towards understanding the dynamics between developers and ChatGPT in collaborative software development and opens up new directions for future research on the topic.
△ Less
Submitted 15 March, 2024;
originally announced March 2024.
-
Designing a K-state P-bit Engine
Authors:
Mohammad Khairul Bashar,
Abir Hasan,
Nikhil Shukla
Abstract:
Probabilistic bit (p-bit)-based compute engines utilize the unique capability of a p-bit to probabilistically switch between two states to solve computationally challenging problems. However, when solving problems that require more than two states (e.g., problems such as Max-3-Cut, verifying if a graph is K-partite (K>2) etc.), additional pre-processing steps such as graph reduction are required t…
▽ More
Probabilistic bit (p-bit)-based compute engines utilize the unique capability of a p-bit to probabilistically switch between two states to solve computationally challenging problems. However, when solving problems that require more than two states (e.g., problems such as Max-3-Cut, verifying if a graph is K-partite (K>2) etc.), additional pre-processing steps such as graph reduction are required to make the problem compatible with a two-state p-bit platform. Moreover, this not only increases the problem size by entailing the use of auxiliary variables but can also degrade the solution quality. In this work, we develop a unique framework for implementing a K-state (K>2) p-bit engine. Furthermore, from an implementation standpoint, we show that such a K-state p-bit engine can be implemented using N traditional (2-state) p-bits, and one multi-state p-bit -- a novel concept proposed here. Augmenting traditional p-bit platforms, our approach enables us to solve an archetypal combinatoric problem class requiring multiple states, namely Max-K-Cut (K=3, 4 shown here), without using any additional auxiliary variables. Thus, our work fundamentally advances the functional capability of p-bit engines, enabling them to solve a broader class of computationally challenging problems more efficiently.
△ Less
Submitted 27 March, 2024; v1 submitted 11 March, 2024;
originally announced March 2024.
-
Ensemble Language Models for Multilingual Sentiment Analysis
Authors:
Md Arid Hasan
Abstract:
The rapid advancement of social media enables us to analyze user opinions. In recent times, sentiment analysis has shown a prominent research gap in understanding human sentiment based on the content shared on social media. Although sentiment analysis for commonly spoken languages has advanced significantly, low-resource languages like Arabic continue to get little research due to resource limitat…
▽ More
The rapid advancement of social media enables us to analyze user opinions. In recent times, sentiment analysis has shown a prominent research gap in understanding human sentiment based on the content shared on social media. Although sentiment analysis for commonly spoken languages has advanced significantly, low-resource languages like Arabic continue to get little research due to resource limitations. In this study, we explore sentiment analysis on tweet texts from SemEval-17 and the Arabic Sentiment Tweet dataset. Moreover, We investigated four pretrained language models and proposed two ensemble language models. Our findings include monolingual models exhibiting superior performance and ensemble models outperforming the baseline while the majority voting ensemble outperforms the English language.
△ Less
Submitted 9 March, 2024;
originally announced March 2024.
-
Beyond the Dashboard: Investigating Distracted Driver Communication Preferences for ADAS
Authors:
Aamir Hasan,
D. Livingston McPherson,
Melissa Miles,
Katherine Driggs-Campbell
Abstract:
Distracted driving is a major cause of road fatalities. With improvements in driver (in)attention detection, these distracted situations can be caught early to alert drivers and improve road safety and comfort. However, drivers may have differing preferences for the modes of such communication based on the driving scenario and their current distraction state. To this end, we present an (N=147) whe…
▽ More
Distracted driving is a major cause of road fatalities. With improvements in driver (in)attention detection, these distracted situations can be caught early to alert drivers and improve road safety and comfort. However, drivers may have differing preferences for the modes of such communication based on the driving scenario and their current distraction state. To this end, we present an (N=147) where videos of simulated driving scenarios were utilized to learn drivers preferences for modes of communication and their evolution with the drivers changing attention. The survey queried participants preferred modes of communication for scenarios such as collisions or stagnation at a green light. that inform the future of communication between drivers and their vehicles. We showcase the different driver preferences based on the nature of the driving scenario and also show that they evolve as the drivers distraction state changes
△ Less
Submitted 23 June, 2024; v1 submitted 5 March, 2024;
originally announced March 2024.
-
On efficient normal bases over binary fields
Authors:
Mohamadou Sall,
M. Anwar Hasan
Abstract:
Binary field extensions are fundamental to many applications, such as multivariate public key cryptography, code-based cryptography, and error-correcting codes. Their implementation requires a foundation in number theory and algebraic geometry and necessitates the utilization of efficient bases. The continuous increase in the power of computation, and the design of new (quantum) computers increase…
▽ More
Binary field extensions are fundamental to many applications, such as multivariate public key cryptography, code-based cryptography, and error-correcting codes. Their implementation requires a foundation in number theory and algebraic geometry and necessitates the utilization of efficient bases. The continuous increase in the power of computation, and the design of new (quantum) computers increase the threat to the security of systems and impose increasingly demanding encryption standards with huge polynomial or extension degrees. For cryptographic purposes or other common implementations of finite fields arithmetic, it is essential to explore a wide range of implementations with diverse bases. Unlike some bases, polynomial and Gaussian normal bases are well-documented and widely employed. In this paper, we explore other forms of bases of $\mathbb{F}_{2^n}$ over $\mathbb{F}_2$ to demonstrate efficient implementation of operations within different ranges. To achieve this, we leverage results on fast computations and elliptic periods introduced by Couveignes and Lercier, and subsequently expanded upon by Ezome and Sall. This leads to the establishment of new tables for efficient computation over binary fields.
△ Less
Submitted 18 February, 2024;
originally announced February 2024.
-
Numerical solution of the Newtonian plane Couette flow with linear dynamic wall slip
Authors:
Muner M. A. Hasan,
Ethar A. A. Ahmed,
Ahmed F. Ghaleb,
Moustafa S. Abou-Dina,
Georgios C. Georgiou
Abstract:
An efficient numerical approach based on weighted average finite differences is used to solve the Newtonian plane Couette flow with wall slip, obeying a dynamic slip law that generalizes the Navier slip law with the inclusion of a relaxation term. Slip is exhibited only along the fixed plate, and the motion is triggered by the motion of the other plate. Three different cases are considered for the…
▽ More
An efficient numerical approach based on weighted average finite differences is used to solve the Newtonian plane Couette flow with wall slip, obeying a dynamic slip law that generalizes the Navier slip law with the inclusion of a relaxation term. Slip is exhibited only along the fixed plate, and the motion is triggered by the motion of the other plate. Three different cases are considered for the motion of the moving plate, i.e., constant speed, oscillating speed, and a single-period sinusoidal speed. The velocity and the volumetric flow rate are calculated in all cases and comparisons are made with the results of other methods and available results in the literature. The numerical outcomes confirm the damping with time and the lagging effects arising from the Navier and dynamic wall slip conditions and demonstrate the hysteretic behavior of the slip velocity in following the harmonic boundary motion.
△ Less
Submitted 8 February, 2024;
originally announced February 2024.
-
A Framework for Assurance Audits of Algorithmic Systems
Authors:
Khoa Lam,
Benjamin Lange,
Borhane Blili-Hamelin,
Jovana Davidovic,
Shea Brown,
Ali Hasan
Abstract:
An increasing number of regulations propose AI audits as a mechanism for achieving transparency and accountability for artificial intelligence (AI) systems. Despite some converging norms around various forms of AI auditing, auditing for the purpose of compliance and assurance currently lacks agreed-upon practices, procedures, taxonomies, and standards. We propose the criterion audit as an operatio…
▽ More
An increasing number of regulations propose AI audits as a mechanism for achieving transparency and accountability for artificial intelligence (AI) systems. Despite some converging norms around various forms of AI auditing, auditing for the purpose of compliance and assurance currently lacks agreed-upon practices, procedures, taxonomies, and standards. We propose the criterion audit as an operationalizable compliance and assurance external audit framework. We model elements of this approach after financial auditing practices, and argue that AI audits should similarly provide assurance to their stakeholders about AI organizations' ability to govern their algorithms in ways that mitigate harms and uphold human values. We discuss the necessary conditions for the criterion audit and provide a procedural blueprint for performing an audit engagement in practice. We illustrate how this framework can be adapted to current regulations by deriving the criteria on which bias audits can be performed for in-scope hiring algorithms, as required by the recently effective New York City Local Law 144 of 2021. We conclude by offering a critical discussion on the benefits, inherent limitations, and implementation challenges of applying practices of the more mature financial auditing industry to AI auditing where robust guardrails against quality assurance issues are only starting to emerge. Our discussion -- informed by experiences in performing these audits in practice -- highlights the critical role that an audit ecosystem plays in ensuring the effectiveness of audits.
△ Less
Submitted 28 May, 2024; v1 submitted 26 January, 2024;
originally announced January 2024.
-
Location Agnostic Source-Free Domain Adaptive Learning to Predict Solar Power Generation
Authors:
Md Shazid Islam,
A S M Jahid Hasan,
Md Saydur Rahman,
Jubair Yusuf,
Md Saiful Islam Sajol,
Farhana Akter Tumpa
Abstract:
The prediction of solar power generation is a challenging task due to its dependence on climatic characteristics that exhibit spatial and temporal variability. The performance of a prediction model may vary across different places due to changes in data distribution, resulting in a model that works well in one region but not in others. Furthermore, as a consequence of global warming, there is a no…
▽ More
The prediction of solar power generation is a challenging task due to its dependence on climatic characteristics that exhibit spatial and temporal variability. The performance of a prediction model may vary across different places due to changes in data distribution, resulting in a model that works well in one region but not in others. Furthermore, as a consequence of global warming, there is a notable acceleration in the alteration of weather patterns on an annual basis. This phenomenon introduces the potential for diminished efficacy of existing models, even within the same geographical region, as time progresses. In this paper, a domain adaptive deep learning-based framework is proposed to estimate solar power generation using weather features that can solve the aforementioned challenges. A feed-forward deep convolutional network model is trained for a known location dataset in a supervised manner and utilized to predict the solar power of an unknown location later. This adaptive data-driven approach exhibits notable advantages in terms of computing speed, storage efficiency, and its ability to improve outcomes in scenarios where state-of-the-art non-adaptive methods fail. Our method has shown an improvement of $10.47 \%$, $7.44 \%$, $5.11\%$ in solar power prediction accuracy compared to best performing non-adaptive method for California (CA), Florida (FL) and New York (NY), respectively.
△ Less
Submitted 6 February, 2024; v1 submitted 23 January, 2024;
originally announced January 2024.
-
Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning
Authors:
Adib Hasan,
Ileana Rugina,
Alex Wang
Abstract:
Large Language Models (LLMs) are susceptible to `jailbreaking' prompts, which can induce the generation of harmful content. This paper demonstrates that moderate WANDA pruning (Sun et al., 2023) can increase their resistance to such attacks without the need for fine-tuning, while maintaining performance on standard benchmarks. Our findings suggest that the benefits of pruning correlate with the in…
▽ More
Large Language Models (LLMs) are susceptible to `jailbreaking' prompts, which can induce the generation of harmful content. This paper demonstrates that moderate WANDA pruning (Sun et al., 2023) can increase their resistance to such attacks without the need for fine-tuning, while maintaining performance on standard benchmarks. Our findings suggest that the benefits of pruning correlate with the initial safety levels of the model, indicating a regularizing effect of WANDA pruning. We introduce a dataset of 225 harmful tasks across five categories to systematically evaluate this safety enhancement. We argue that safety improvements can be understood through a regularization perspective. First, we show that pruning helps LLMs focus more effectively on task-relevant tokens within jailbreaking prompts. Then, we analyze the effects of pruning on the perplexity of malicious prompts before and after their integration into jailbreak templates. Finally, we demonstrate statistically significant performance improvements under domain shifts when applying WANDA to linear models.
△ Less
Submitted 28 April, 2024; v1 submitted 19 January, 2024;
originally announced January 2024.
-
Robust Node Representation Learning via Graph Variational Diffusion Networks
Authors:
Jun Zhuang,
Mohammad Al Hasan
Abstract:
Node representation learning by using Graph Neural Networks (GNNs) has been widely explored. However, in recent years, compelling evidence has revealed that GNN-based node representation learning can be substantially deteriorated by delicately-crafted perturbations in a graph structure. To learn robust node representation in the presence of perturbations, various works have been proposed to safegu…
▽ More
Node representation learning by using Graph Neural Networks (GNNs) has been widely explored. However, in recent years, compelling evidence has revealed that GNN-based node representation learning can be substantially deteriorated by delicately-crafted perturbations in a graph structure. To learn robust node representation in the presence of perturbations, various works have been proposed to safeguard GNNs. Within these existing works, Bayesian label transition has been proven to be more effective, but this method is extensively reliant on a well-built prior distribution. The variational inference could address this limitation by sampling the latent node embedding from a Gaussian prior distribution. Besides, leveraging the Gaussian distribution (noise) in hidden layers is an appealing strategy to strengthen the robustness of GNNs. However, our experiments indicate that such a strategy can cause over-smoothing issues during node aggregation. In this work, we propose the Graph Variational Diffusion Network (GVDN), a new node encoder that effectively manipulates Gaussian noise to safeguard robustness on perturbed graphs while alleviating over-smoothing issues through two mechanisms: Gaussian diffusion and node embedding propagation. Thanks to these two mechanisms, our model can generate robust node embeddings for recovery. Specifically, we design a retraining mechanism using the generated node embedding to recover the performance of node classifications in the presence of perturbations. The experiments verify the effectiveness of our proposed model across six public datasets.
△ Less
Submitted 17 December, 2023;
originally announced December 2023.
-
MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training
Authors:
Hongwu Peng,
Xi Xie,
Kaustubh Shivdikar,
MD Amit Hasan,
Jiahui Zhao,
Shaoyi Huang,
Omer Khan,
David Kaeli,
Caiwen Ding
Abstract:
In the acceleration of deep neural network training, the GPU has become the mainstream platform. GPUs face substantial challenges on GNNs, such as workload imbalance and memory access irregularities, leading to underutilized hardware. Existing solutions such as PyG, DGL with cuSPARSE, and GNNAdvisor frameworks partially address these challenges but memory traffic is still significant.
We argue t…
▽ More
In the acceleration of deep neural network training, the GPU has become the mainstream platform. GPUs face substantial challenges on GNNs, such as workload imbalance and memory access irregularities, leading to underutilized hardware. Existing solutions such as PyG, DGL with cuSPARSE, and GNNAdvisor frameworks partially address these challenges but memory traffic is still significant.
We argue that drastic performance improvements can only be achieved by the vertical optimization of algorithm and system innovations, rather than treating the speedup optimization as an "after-thought" (i.e., (i) given a GNN algorithm, designing an accelerator, or (ii) given hardware, mainly optimizing the GNN algorithm). In this paper, we present MaxK-GNN, an advanced high-performance GPU training system integrating algorithm and system innovation. (i) We introduce the MaxK nonlinearity and provide a theoretical analysis of MaxK nonlinearity as a universal approximator, and present the Compressed Balanced Sparse Row (CBSR) format, designed to store the data and index of the feature matrix after nonlinearity; (ii) We design a coalescing enhanced forward computation with row-wise product-based SpGEMM Kernel using CBSR for input feature matrix fetching and strategic placement of a sparse output accumulation buffer in shared memory; (iii) We develop an optimized backward computation with outer product-based and SSpMM Kernel.
We conduct extensive evaluations of MaxK-GNN and report the end-to-end system run-time. Experiments show that MaxK-GNN system could approach the theoretical speedup limit according to Amdahl's law. We achieve comparable accuracy to SOTA GNNs, but at a significantly increased speed: 3.22/4.24 times speedup (vs. theoretical limits, 5.52/7.27 times) on Reddit compared to DGL and GNNAdvisor implementations.
△ Less
Submitted 18 March, 2024; v1 submitted 14 December, 2023;
originally announced December 2023.
-
Deep learning in computed tomography pulmonary angiography imaging: a dual-pronged approach for pulmonary embolism detection
Authors:
Fabiha Bushra,
Muhammad E. H. Chowdhury,
Rusab Sarmun,
Saidul Kabir,
Menatalla Said,
Sohaib Bassam Zoghoul,
Adam Mushtak,
Israa Al-Hashimi,
Abdulrahman Alqahtani,
Anwarul Hasan
Abstract:
The increasing reliance on Computed Tomography Pulmonary Angiography (CTPA) for Pulmonary Embolism (PE) diagnosis presents challenges and a pressing need for improved diagnostic solutions. The primary objective of this study is to leverage deep learning techniques to enhance the Computer Assisted Diagnosis (CAD) of PE. With this aim, we propose a classifier-guided detection approach that effective…
▽ More
The increasing reliance on Computed Tomography Pulmonary Angiography (CTPA) for Pulmonary Embolism (PE) diagnosis presents challenges and a pressing need for improved diagnostic solutions. The primary objective of this study is to leverage deep learning techniques to enhance the Computer Assisted Diagnosis (CAD) of PE. With this aim, we propose a classifier-guided detection approach that effectively leverages the classifier's probabilistic inference to direct the detection predictions, marking a novel contribution in the domain of automated PE diagnosis. Our classification system includes an Attention-Guided Convolutional Neural Network (AG-CNN) that uses local context by employing an attention mechanism. This approach emulates a human expert's attention by looking at both global appearances and local lesion regions before making a decision. The classifier demonstrates robust performance on the FUMPE dataset, achieving an AUROC of 0.927, sensitivity of 0.862, specificity of 0.879, and an F1-score of 0.805 with the Inception-v3 backbone architecture. Moreover, AG-CNN outperforms the baseline DenseNet-121 model, achieving an 8.1% AUROC gain. While previous research has mostly focused on finding PE in the main arteries, our use of cutting-edge object detection models and ensembling techniques greatly improves the accuracy of detecting small embolisms in the peripheral arteries. Finally, our proposed classifier-guided detection approach further refines the detection metrics, contributing new state-of-the-art to the community: mAP$_{50}$, sensitivity, and F1-score of 0.846, 0.901, and 0.779, respectively, outperforming the former benchmark with a significant 3.7% improvement in mAP$_{50}$. Our research aims to elevate PE patient care by integrating AI solutions into clinical workflows, highlighting the potential of human-AI collaboration in medical diagnostics.
△ Less
Submitted 5 January, 2024; v1 submitted 9 November, 2023;
originally announced November 2023.
-
BLP-2023 Task 2: Sentiment Analysis
Authors:
Md. Arid Hasan,
Firoj Alam,
Anika Anjum,
Shudipta Das,
Afiyat Anjum
Abstract:
We present an overview of the BLP Sentiment Shared Task, organized as part of the inaugural BLP 2023 workshop, co-located with EMNLP 2023. The task is defined as the detection of sentiment in a given piece of social media text. This task attracted interest from 71 participants, among whom 29 and 30 teams submitted systems during the development and evaluation phases, respectively. In total, partic…
▽ More
We present an overview of the BLP Sentiment Shared Task, organized as part of the inaugural BLP 2023 workshop, co-located with EMNLP 2023. The task is defined as the detection of sentiment in a given piece of social media text. This task attracted interest from 71 participants, among whom 29 and 30 teams submitted systems during the development and evaluation phases, respectively. In total, participants submitted 597 runs. However, a total of 15 teams submitted system description papers. The range of approaches in the submitted systems spans from classical machine learning models, fine-tuning pre-trained models, to leveraging Large Language Model (LLMs) in zero- and few-shot settings. In this paper, we provide a detailed account of the task setup, including dataset development and evaluation setup. Additionally, we provide a brief overview of the systems submitted by the participants. All datasets and evaluation scripts from the shared task have been made publicly available for the research community, to foster further research in this domain.
△ Less
Submitted 21 February, 2024; v1 submitted 24 October, 2023;
originally announced October 2023.
-
Atmospheric dispersion corrector for a multi-object spectroscopic mode of HROS-TMT
Authors:
Manjunath Bestha,
Amirul Hasan,
Devika Divakar,
Arun Surya,
S. Sriram,
T. Sivarani,
Ajin Prakash,
Parvathy M,
Sudharsan Yadav
Abstract:
Highly multiplexed spectroscopic surveys have changed the astronomy landscape in recent years. However, these surveys are limited to low and medium spectral resolution. High spectral resolution spectroscopy is often photon starved and will benefit from a large telescope aperture. Multiplexed high-resolution surveys require a wide field of view and a large aperture for a suitable large number of br…
▽ More
Highly multiplexed spectroscopic surveys have changed the astronomy landscape in recent years. However, these surveys are limited to low and medium spectral resolution. High spectral resolution spectroscopy is often photon starved and will benefit from a large telescope aperture. Multiplexed high-resolution surveys require a wide field of view and a large aperture for a suitable large number of bright targets. This requirement introduces several practical difficulties, especially for large telescopes, such as the future ELTs. Some of the challenges are the need for a wide field atmospheric dispersion corrector and to deal with the curved non-telecentric focal plane. Here, we present a concept of Multi-Object Spectroscopy (MOS) mode for TMT High-Resolution Optical Spectrograph (HROS), we have designed an atmospheric dispersion corrector for individual objects that fit inside a fiber positioner. We present the ZEMAX design and the performance of the atmospheric dispersion corrector for all elevations accessible by TMT.
△ Less
Submitted 24 October, 2023;
originally announced October 2023.
-
Optical Properties and Behavior of Whispering Gallery Mode Resonators in Complex Microsphere Configurations: Insights for Sensing and Information Processing Applications
Authors:
Yaser M. Banad,
Syed Mohammad Abid Hasan,
Sarah S. Sharif,
Georgios Veronis,
Manas Ranjan Gartia
Abstract:
Whispering gallery mode (WGM) resonators are garnering significant attention due to their unique characteristics and remarkable properties. When integrated with optical sensing and processing technology, WGM resonators offer numerous advantages, including compact size, high sensitivity, rapid response, and tunability. This paper comprehensively investigates the optical properties and behavior of W…
▽ More
Whispering gallery mode (WGM) resonators are garnering significant attention due to their unique characteristics and remarkable properties. When integrated with optical sensing and processing technology, WGM resonators offer numerous advantages, including compact size, high sensitivity, rapid response, and tunability. This paper comprehensively investigates the optical properties and behavior of WGMs in complex microsphere resonator configurations. The findings underscore the potential of WGMs in sensing applications and their role in advancing future optical information processing. The study explores the impact of configuration, size, excitation, polarization, and coupling effects on the WGMs properties. The paper provides crucial insights and valuable guidance for designing and optimizing microsphere resonator systems, enabling their realization for practical applications.
△ Less
Submitted 5 October, 2023;
originally announced October 2023.