ARA-O-RAN: End-to-End Programmable O-RAN Living Lab for Agriculture and Rural Communities

Authors: Tianyi Zhang, Joshua Ofori Boateng, Taimoor Ul Islam, Arsalan Ahmad, Hongwei Zhang, Daji Qiao

Abstract: As wireless networks evolve towards open architectures like O-RAN, testing and integration platforms are crucial to address challenges like interoperability. This paper describes ARA-O-RAN, a novel O-RAN testbed established through the NSF Platforms for Advanced Wireless Research (PAWR) ARA platform. ARA provides an at-scale rural wireless living lab focused on technologies for digital agriculture and rural communities. As an O-RAN Alliance certified Open Testing and Integration Centre (OTIC), ARA launched ARA-O-RAN, the first public O-RAN testbed tailored to rural and agriculture use cases, with end-to-end, whole-stack programmability. ARA-O-RAN uniquely combines support for outdoor testing across a university campus, surrounding farmlands, and rural communities with a 50-node indoor sandbox. The testbed facilitates vital R&D to implement open architectures that can meet rural connectivity needs. The paper outlines ARA-O-RAN's hardware system design, software architecture, and enabled research experiments. It also discusses plans aligned with national spectrum policy and rural spectrum innovation. ARA-O-RAN exemplifies the value of purpose-built wireless testbeds in accelerating impactful wireless research.

Submitted 14 June, 2024; originally announced July 2024.

arXiv:2407.04561 [pdf, other]

Wireless Spectrum in Rural Farmlands: Status, Challenges and Opportunities

Authors: Mukaram Shahid, Kunal Das, Taimoor Ul Islam, Christ Somiah, Daji Qiao, Arsalan Ahmad, Jimming Song, Zhengyuan Zhu, Sarath Babu, Yong Guan, Tusher Chakraborty, Suraj Jog, Ranveer Chandra, Hongwei Zhang

Abstract: Due to factors such as low population density and expansive geographical distances, network deployment falls behind in rural regions, leading to a broadband divide. Wireless spectrum serves as the blood and flesh of wireless communications. Shared white spaces such as those in the TVWS and CBRS spectrum bands offer opportunities to expand connectivity, innovate, and provide affordable access to high-speed Internet in under-served areas without the additional cost of expensive licensed spectrum. However, the current methods to utilize these white spaces are inefficient due to very conservative models and spectrum policies, causing under-utilization of valuable spectrum resources. This hampers the full potential of innovative wireless technologies that could benefit farmers, small Internet Service Providers (ISPs), or Mobile Network Operators (MNOs) operating in rural regions. This study explores the challenges faced by farmers and service providers when using shared spectrum bands to deploy their networks while ensuring maximum system performance and minimizing interference with other users. Additionally, we discuss how spatiotemporal spectrum models, in conjunction with database-driven spectrum-sharing solutions, can enhance the allocation and management of spectrum resources, ultimately improving the efficiency and reliability of wireless networks operating in shared spectrum bands.

Submitted 5 July, 2024; originally announced July 2024.

arXiv:2407.01426 [pdf, other]

Maximizing Blockchain Performance: Mitigating Conflicting Transactions through Parallelism and Dependency Management

Authors: Faisal Haque Bappy, Tarannum Shaila Zaman, Md Sajidul Islam Sajid, Mir Mehedi Ahsan Pritom, Tariqul Islam

Abstract: While blockchains initially gained popularity in the realm of cryptocurrencies, their widespread adoption is expanding beyond conventional applications, driven by the imperative need for enhanced data security. Despite providing a secure network, blockchains come with certain tradeoffs, including high latency, lower throughput, and an increased number of transaction failures. A pivotal issue contributing to these challenges is the improper management of "conflicting transactions", commonly referred to as "contention". When a number of pending transactions within a blockchain collide with each other, this results in a state of contention. This situation worsens network latency, leads to the wastage of system resources, and ultimately contributes to reduced throughput and higher transaction failures. In response to this issue, in this work, we present a novel blockchain scheme that integrates transaction parallelism and an intelligent dependency manager aiming to reduce the occurrence of conflicting transactions within blockchain networks. In terms of effectiveness and efficiency, experimental results show that our scheme not only mitigates the challenges posed by conflicting transactions, but also outperforms both existing parallel and non-parallel Hyperledger Fabric blockchain networks, achieving a higher transaction success rate, higher throughput, and lower latency. The integration of our scheme with Hyperledger Fabric appears to be a promising solution for improving the overall performance and stability of blockchain networks in real-world applications.

Submitted 2 July, 2024; v1 submitted 1 July, 2024; originally announced July 2024.

arXiv:2405.17756 [pdf]

Motion-Informed Deep Learning for Brain MR Image Reconstruction Framework

Authors: Zhifeng Chen, Kamlesh Pawar, Kh Tohidul Islam, Himashi Peiris, Gary Egan, Zhaolin Chen

Abstract: Motion artifacts in Magnetic Resonance Imaging (MRI) are one of the frequently occurring artifacts due to patient movements during scanning. Motion is estimated to be present in approximately 30% of clinical MRI scans; however, motion has not been explicitly modeled within deep learning image reconstruction models. Deep learning (DL) algorithms have been demonstrated to be effective for both the image reconstruction task and the motion correction task, but the two tasks are considered separately. The image reconstruction task involves removing undersampling artifacts such as noise and aliasing artifacts, whereas motion correction involves removing artifacts including blurring, ghosting, and ringing. In this work, we propose a novel method to simultaneously accelerate imaging and correct motion. This is achieved by integrating a motion module into the deep learning-based MRI reconstruction process, enabling real-time detection and correction of motion. We model motion as a tightly integrated auxiliary layer in the deep learning model during training, making the deep learning model 'motion-informed'. During inference, image reconstruction is performed from undersampled raw k-space data using a trained motion-informed DL model. Experimental results demonstrate that the proposed motion-informed deep learning image reconstruction network outperformed the conventional image reconstruction network for motion-degraded MRI datasets.

Submitted 27 May, 2024; originally announced May 2024.

Comments: 22 pages, 7 figures, 4 tables

arXiv:2405.05019 [pdf, other]

AI-based Dynamic Schedule Calculation in Time Sensitive Networks using GCN-TD3

Authors: Syed Tasnimul Islam, Anas Bin Muslim

Abstract: Offline scheduling in Time Sensitive Networking (TSN) utilizing the Time Aware Shaper (TAS) facilitates optimal deterministic latency and jitter-bounds calculation for Time-Triggered (TT) flows. However, the dynamic nature of traffic in industrial settings necessitates a strategy for adaptively scheduling flows without interrupting existing schedules. Our research identifies critical gaps in current dynamic scheduling methods for TSN and introduces the novel GCN-TD3 approach. This approach utilizes a Graph Convolutional Network (GCN) for representing the various relations within different components of TSN and employs the Twin Delayed Deep Deterministic Policy Gradient (TD3) algorithm to dynamically schedule any incoming flow. Additionally, an Integer Linear Programming (ILP) based offline scheduler is used both to initiate the simulation and to serve as a fallback mechanism. This mechanism is triggered to recalculate the entire schedule when the Gate Control List (GCL) length exceeds a predefined threshold. Comparative analyses demonstrate that GCN-TD3 outperforms existing methods like Deep Double Q-Network (DDQN) and Deep Deterministic Policy Gradient (DDPG), exhibiting convergence within 4000 epochs with a 90% dynamic TT flow admission rate while maintaining deadlines and reducing jitter to as low as 2 µs. Finally, two modules were developed for the OMNeT++ simulator, facilitating dynamic simulation to evaluate the methodology.

Submitted 8 May, 2024; originally announced May 2024.

Comments: Accepted article IFIP/IEEE Networking 2024 (Tensor)

arXiv:2404.17434 [pdf, other]

Exploring Wireless Channels in Rural Areas: A Comprehensive Measurement Study

Authors: Tianyi Zhang, Guoying Zu, Taimoor Ul Islam, Evan Gossling, Sarath Babu, Daji Qiao, Hongwei Zhang

Abstract: The study of wireless channel behavior has been an active research topic for many years. However, there exists a noticeable scarcity of studies focusing on wireless channel characteristics in rural areas. With the advancement of smart agriculture practices in rural regions, there has been an increasing demand for affordable, high-capacity, and low-latency wireless networks to support various precision agriculture applications such as plant phenotyping, livestock health monitoring, and agriculture automation. To address this research gap, we conducted a channel measurement study on multiple wireless frequency bands at various crop and livestock farms near Ames, Iowa, based on Iowa State University (ISU)'s ARA Wireless Living Lab, one of the NSF PAWR platforms. We specifically investigate the impact of weather conditions, humidity, temperature, and farm buildings on wireless channel behavior. The resulting measurement dataset, which will soon be made publicly accessible, represents a valuable resource for researchers interested in wireless channel prediction and optimization.

Submitted 26 April, 2024; originally announced April 2024.

arXiv:2404.10259 [pdf, other]

Uncovering Latent Arguments in Social Media Messaging by Employing LLMs-in-the-Loop Strategy

Authors: Tunazzina Islam, Dan Goldwasser

Abstract: The widespread use of social media has led to a surge in popularity for automated methods of analyzing public opinion. Supervised methods are adept at text categorization, yet the dynamic nature of social media discussions poses a continual challenge for these techniques due to the constant shifting of the focus. On the other hand, traditional unsupervised methods for extracting themes from public discourse, such as topic modeling, often reveal overarching patterns that might not capture specific nuances. Consequently, a significant portion of research into social media discourse still depends on labor-intensive manual coding techniques and a human-in-the-loop approach, which are both time-consuming and costly. In this work, we study the problem of discovering arguments associated with a specific theme. We propose a generic LLMs-in-the-Loop strategy that leverages the advanced capabilities of Large Language Models (LLMs) to extract latent arguments from social media messaging. To demonstrate our approach, we apply our framework to contentious topics. We use two publicly available datasets: (1) the climate campaigns dataset of 14k Facebook ads with 25 themes and (2) the COVID-19 vaccine campaigns dataset of 9k Facebook ads with 14 themes. Additionally, we design a downstream task as stance prediction by leveraging talking points in climate debates. Furthermore, we analyze demographic targeting and the adaptation of messaging based on real-world events.

Submitted 15 July, 2024; v1 submitted 15 April, 2024; originally announced April 2024.

arXiv:2404.04686 [pdf]

Predictive Modeling for Breast Cancer Classification in the Context of Bangladeshi Patients: A Supervised Machine Learning Approach with Explainable AI

Authors: Taminul Islam, Md. Alif Sheakh, Mst. Sazia Tahosin, Most. Hasna Hena, Shopnil Akash, Yousef A. Bin Jardan, Gezahign Fentahun Wondmie, Hiba-Allah Nafidi, Mohammed Bourhia

Abstract: Breast cancer has rapidly increased in prevalence in recent years, making it one of the leading causes of mortality worldwide. Among all cancers, it is by far the most common. Diagnosing this illness manually requires significant time and expertise. Since detecting breast cancer is a time-consuming process, preventing its further spread can be aided by creating machine-based forecasts. Machine learning and Explainable AI are crucial in classification as they not only provide accurate predictions but also offer insights into how the model arrives at its decisions, aiding in the understanding and trustworthiness of the classification results. In this study, we evaluate and compare the classification accuracy, precision, recall, and F1 scores of five different machine learning methods using a primary dataset (500 patients from Dhaka Medical College Hospital). Five different supervised machine learning techniques, including decision tree, random forest, logistic regression, naive Bayes, and XGBoost, have been used to achieve optimal results on our dataset. Additionally, this study applied SHAP analysis to the XGBoost model to interpret the model's predictions and understand the impact of each feature on the model's output. We compared the accuracy with which several algorithms classified the data and contrasted the results with other literature in this field. After final evaluation, this study found that XGBoost achieved the best model accuracy, which is 97%.

Submitted 6 April, 2024; originally announced April 2024.

Comments: Accepted for the Scientific Reports (Nature) journal. 32 pages, 12 figures

arXiv:2404.01345 [pdf]

Enhancing Bangla Fake News Detection Using Bidirectional Gated Recurrent Units and Deep Learning Techniques

Authors: Utsha Roy, Mst. Sazia Tahosin, Md. Mahedi Hassan, Taminul Islam, Fahim Imtiaz, Md Rezwane Sadik, Yassine Maleh, Rejwan Bin Sulaiman, Md. Simul Hasan Talukder

Abstract: The rise of fake news has made the need for effective detection methods, including in languages other than English, increasingly important. The study aims to address the challenges of Bangla, which is considered a less important language. To this end, a complete dataset containing about 50,000 news items is proposed. Several deep learning models have been tested on this dataset, including the bidirectional gated recurrent unit (GRU), the long short-term memory (LSTM), the 1D convolutional neural network (CNN), and hybrid architectures. For this research, we assessed the efficacy of the model utilizing a range of useful measures, including recall, precision, F1 score, and accuracy. This was done by employing a big application. We carry out comprehensive trials to show the effectiveness of these models in identifying bogus news in Bangla, with the Bidirectional GRU model having a stunning accuracy of 99.16%. Our analysis highlights the importance of dataset balance and the need for continual improvement efforts to a substantial degree. This study makes a major contribution to the creation of Bangla fake news detecting systems with limited resources, thereby setting the stage for future improvements in the detection process.

Submitted 31 March, 2024; originally announced April 2024.

Comments: Accepted for publication in the 7th International Conference on Networking, Intelligent Systems & Security. The conference Proceedings will be published by ACM International Conference Proceeding Series (ICPS) ISBN N°: 979-8-4007-0019-4. 8 pages, 11 figures

arXiv:2403.17093 [pdf, other]

Enhancing UAV Security Through Zero Trust Architecture: An Advanced Deep Learning and Explainable AI Analysis

Authors: Ekramul Haque, Kamrul Hasan, Imtiaz Ahmed, Md. Sahabul Alam, Tariqul Islam

Abstract: In the dynamic and ever-changing domain of Unmanned Aerial Vehicles (UAVs), the utmost importance lies in guaranteeing resilient and lucid security measures. This study highlights the necessity of implementing a Zero Trust Architecture (ZTA) to enhance the security of unmanned aerial vehicles (UAVs), hence departing from conventional perimeter defences that may expose vulnerabilities. The Zero Trust Architecture (ZTA) paradigm requires a rigorous and continuous process of authenticating all network entities and communications. The accuracy of our methodology in detecting and identifying unmanned aerial vehicles (UAVs) is 84.59%. This is achieved by utilizing Radio Frequency (RF) signals within a Deep Learning framework, a unique method. Precise identification is crucial in Zero Trust Architecture (ZTA), as it determines network access. In addition, the use of eXplainable Artificial Intelligence (XAI) tools such as SHapley Additive exPlanations (SHAP) and Local Interpretable Model-agnostic Explanations (LIME) contributes to the improvement of the model's transparency and interpretability. Adherence to Zero Trust Architecture (ZTA) standards guarantees that the classifications of unmanned aerial vehicles (UAVs) are verifiable and comprehensible, enhancing security within the UAV field.

Submitted 25 March, 2024; originally announced March 2024.

Comments: 6 pages, 5 figures

arXiv:2403.10722 [pdf, other]

Cannabis Seed Variant Detection using Faster R-CNN

Authors: Toqi Tahamid Sarker, Taminul Islam, Khaled R Ahmed

Abstract: Analyzing and detecting cannabis seed variants is crucial for the agriculture industry. It enables precision breeding, allowing cultivators to selectively enhance desirable traits. Accurate identification of seed variants also ensures regulatory compliance, facilitating the cultivation of specific cannabis strains with defined characteristics, ultimately improving agricultural productivity and meeting diverse market demands. This paper presents a study on cannabis seed variant detection by employing a state-of-the-art object detection model, Faster R-CNN. This study implemented the model on a locally sourced cannabis seed dataset in Thailand, comprising 17 distinct classes. We evaluate six Faster R-CNN models by comparing performance on various metrics, achieving a mAP score of 94.08% and an F1 score of 95.66%. This paper presents the first known application of deep neural network object detection models to the novel task of visually identifying cannabis seed types.

Submitted 15 March, 2024; originally announced March 2024.

Comments: 6 pages, 2 figures, this has been submitted and accepted for publication at IEEE - ICACCS 2024

arXiv:2403.10707 [pdf, other]

Discovering Latent Themes in Social Media Messaging: A Machine-in-the-Loop Approach Integrating LLMs

Authors: Tunazzina Islam, Dan Goldwasser

Abstract: Grasping the themes of social media content is key to understanding the narratives that influence public opinion and behavior. The thematic analysis goes beyond traditional topic-level analysis, which often captures only the broadest patterns, providing deeper insights into specific and actionable themes such as "public sentiment towards vaccination", "political discourse surrounding climate policies," etc. In this paper, we introduce a novel approach to uncovering latent themes in social media messaging. Recognizing the limitations of the traditional topic-level analysis, which tends to capture only overarching patterns, this study emphasizes the need for a finer-grained, theme-focused exploration. Traditional theme discovery methods typically involve manual processes and a human-in-the-loop approach. While valuable, these methods face challenges in scalability, consistency, and resource intensity in terms of time and cost. To address these challenges, we propose a machine-in-the-loop approach that leverages the advanced capabilities of Large Language Models (LLMs). To demonstrate our approach, we apply our framework to contentious topics, such as climate debate and vaccine debate. We use two publicly available datasets: (1) the climate campaigns dataset of 21k Facebook ads and (2) the COVID-19 vaccine campaigns dataset of 9k Facebook ads. Our quantitative and qualitative analysis shows that our methodology yields more accurate and interpretable results compared to the baselines.
Our results not only demonstrate the effectiveness of our approach in uncovering latent themes but also illuminate how these themes are tailored for demographic targeting in social media contexts. Additionally, our work sheds light on the dynamic nature of social media, revealing the shifts in the thematic focus of messaging in response to real-world events.

Submitted 15 July, 2024; v1 submitted 15 March, 2024; originally announced March 2024.

Comments: Accepted at 19th International AAAI Conference on Web and Social Media (ICWSM-2025)

arXiv:2403.09836 [pdf, other]

Empowering Healthcare through Privacy-Preserving MRI Analysis

Authors: Al Amin, Kamrul Hasan, Saleh Zein-Sabatto, Deo Chimba, Liang Hong, Imtiaz Ahmed, Tariqul Islam

Abstract: In the healthcare domain, Magnetic Resonance Imaging (MRI) assumes a pivotal role, as it employs Artificial Intelligence (AI) and Machine Learning (ML) methodologies to extract invaluable insights from imaging data. Nonetheless, the imperative need for patient privacy poses significant challenges when collecting data from diverse healthcare sources. Consequently, the Deep Learning (DL) communities occasionally face difficulties detecting rare features. In this research endeavor, we introduce the Ensemble-Based Federated Learning (EBFL) Framework, an innovative solution tailored to address this challenge. The EBFL framework deviates from the conventional approach by emphasizing model features over sharing sensitive patient data. This unique methodology fosters a collaborative and privacy-conscious environment for healthcare institutions, empowering them to harness the capabilities of a centralized server for model refinement while upholding the utmost data privacy standards. Conversely, a robust ensemble architecture boasts potent feature extraction capabilities, distinguishing itself from a single DL model. This quality makes it remarkably dependable for MRI analysis. By harnessing our groundbreaking EBFL methodology, we have achieved remarkable precision in the classification of brain tumors, including glioma, meningioma, pituitary, and non-tumor instances, attaining a precision rate of 94% for the Global model and an impressive 96% for the Ensemble model.
Our models underwent rigorous evaluation using conventional performance metrics such as Accuracy, Precision, Recall, and F1 Score. Integrating DL within the Federated Learning (FL) framework has yielded a methodology that offers precise and dependable diagnostics for detecting brain tumors.

Submitted 14 March, 2024; originally announced March 2024.

Comments: 6

arXiv:2403.04130 [pdf, other]

An Explainable AI Framework for Artificial Intelligence of Medical Things

Authors: Al Amin, Kamrul Hasan, Saleh Zein-Sabatto, Deo Chimba, Imtiaz Ahmed, Tariqul Islam

Abstract: The healthcare industry has been revolutionized by the convergence of Artificial Intelligence of Medical Things (AIoMT), allowing advanced data-driven solutions to improve healthcare systems. With the increasing complexity of Artificial Intelligence (AI) models, the need for Explainable Artificial Intelligence (XAI) techniques becomes paramount, particularly in the medical domain, where transparent and interpretable decision-making becomes crucial. Therefore, in this work, we leverage a custom XAI framework, incorporating techniques such as Local Interpretable Model-Agnostic Explanations (LIME), SHapley Additive exPlanations (SHAP), and Gradient-weighted Class Activation Mapping (Grad-CAM), explicitly designed for the domain of AIoMT. The proposed framework enhances the effectiveness of strategic healthcare methods and aims to instill trust and promote understanding in AI-driven medical applications. Moreover, we utilize a majority voting technique that aggregates predictions from multiple convolutional neural networks (CNNs) and leverages their collective intelligence to make robust and accurate decisions in the healthcare system. Building upon this decision-making process, we apply the XAI framework to brain tumor detection as a use case demonstrating accurate and transparent diagnosis. Evaluation results underscore the exceptional performance of the XAI framework, achieving high precision, recall, and F1 scores with a training accuracy of 99% and a validation accuracy of 98%.
Combining advanced XAI techniques with ensemble-based deep-learning (DL) methodologies allows for precise and reliable brain tumor diagnoses as an application of AIoMT.

Submitted 6 March, 2024; originally announced March 2024.

Comments: 7 pages, 8 figures

arXiv:2401.10799 [pdf, other]

Novel Representation Learning Technique using Graphs for Performance Analytics

Authors: Tarek Ramadan, Ankur Lahiry, Tanzima Z. Islam

Abstract: The performance analytics domain in High Performance Computing (HPC) uses tabular data to solve regression problems, such as predicting the execution time. Existing Machine Learning (ML) techniques leverage the correlations among features given tabular datasets, not leveraging the relationships between samples directly. Moreover, since high-quality embeddings from raw features improve the fidelity of the downstream predictive models, existing methods rely on extensive feature engineering and pre-processing steps, costing time and manual effort. To fill these two gaps, we propose a novel idea of transforming tabular performance data into graphs to leverage the advancement of Graph Neural Network-based (GNN) techniques in capturing complex relationships between features and samples. In contrast to other ML application domains, such as social networks, the graph is not given; instead, we need to build it. To address this gap, we propose graph-building methods where nodes represent samples, and the edges are automatically inferred iteratively based on the similarity between the features in the samples. We evaluate the effectiveness of the generated embeddings from GNNs based on how well they make even a simple feed-forward neural network perform for regression tasks compared to other state-of-the-art representation learning techniques.
Our evaluation demonstrates that even with up to 25% random missing values for each dataset, our method outperforms commonly used graph and Deep Neural Network (DNN)-based approaches and achieves up to 61.67% and 78.56% improvement in MSE loss over the DNN baseline for the HPC and machine learning datasets, respectively.

Submitted 19 January, 2024; originally announced January 2024.

Comments: This paper has been accepted at 22nd International Conference on Machine Learning and Applications (ICMLA2023)

arXiv:2401.08953 [pdf, other]

An Efficient and Scalable Auditing Scheme for Cloud Data Storage using an Enhanced B-tree

Authors: Tariqul Islam, Faisal Haque Bappy, Md Nafis Ul Haque Shifat, Farhan Ahmad, Kamrul Hasan, Tarannum Shaila Zaman

Abstract: An efficient, scalable, and provably secure dynamic auditing scheme is highly desirable in the cloud storage environment for verifying the integrity of the outsourced data. Most of the existing work on remote integrity checking focuses on static archival data and therefore cannot be applied to cases where dynamic data updates are more common. Additionally, existing auditing schemes suffer from performance bottlenecks and scalability issues. To address these issues, in this paper, we present a novel dynamic auditing scheme for centralized cloud environments leveraging an enhanced version of the B-tree. Our proposed scheme achieves the immutable characteristic of a decentralized system (i.e., blockchain technology) while effectively addressing the synchronization and performance challenges of such systems. Unlike other static auditing schemes, our scheme supports dynamic insert, update, and delete operations. Also, by leveraging an enhanced B-tree, our scheme maintains a balanced tree after any alteration to a certain file, improving performance significantly. Experimental results show that our scheme outperforms both traditional Merkle Hash Tree-based centralized auditing and decentralized blockchain-based auditing schemes in terms of block modifications (e.g., insert, delete, update), block retrieval, and data verification time.

Submitted 16 January, 2024; originally announced January 2024.

arXiv:2401.07035 [pdf, other]

Causative Insights into Open Source Software Security using Large Language Code Embeddings and Semantic Vulnerability Graph

Authors: Nafis Tanveer Islam, Gonzalo De La Torre Parra, Dylan Manual, Murtuza Jadliwala, Peyman Najafirad

Abstract: Open Source Software (OSS) security and resilience are worldwide phenomena hampering economic and technological innovation. OSS vulnerabilities can cause unauthorized access, data breaches, network disruptions, and privacy violations, rendering any benefits worthless. While recent deep-learning techniques have shown great promise in identifying and localizing vulnerabilities in source code, it is unclear how effective these research techniques are from a usability perspective due to a lack of proper methodological analysis. Usually, these methods offload a developer's task of classifying and localizing vulnerable code; still, a reasonable study to measure the actual effectiveness of these systems to the end user has yet to be conducted. To address the challenge of proper developer training from the prior methods, we propose a system to link vulnerabilities to their root cause, thereby intuitively educating the developers to code more securely. Furthermore, we provide a comprehensive usability study to test the effectiveness of our system in fixing vulnerabilities and its capability to assist developers in writing more secure code. We demonstrate the effectiveness of our system by showing its efficacy in helping developers fix source code with vulnerabilities. Our study shows a 24% improvement in code repair capabilities compared to previous methods. We also show that, when trained by our system, on average, approximately 9% of the developers naturally tend to write more secure code with fewer vulnerabilities.

Submitted 13 January, 2024; originally announced January 2024.

arXiv:2401.07031 [pdf, other]

Code Security Vulnerability Repair Using Reinforcement Learning with Large Language Models

Authors: Nafis Tanveer Islam, Mohammad Bahrami Karkevandi, Peyman Najafirad

Abstract: With the recent advancement of Large Language Models (LLMs), generating functionally correct code has become less complicated for a wide array of developers. While using LLMs has sped up the functional development process, it poses a heavy risk to code security. Code generation with proper security measures using LLMs is a significantly more challenging task than functional code generation. Security measures may include adding a pair of lines of code to the original code, consisting of null pointer checking or prepared statements for SQL injection prevention. Currently available code repair LLMs generate code repairs by supervised fine-tuning, where the model looks at cross-entropy loss. However, the original and repaired codes are mostly similar functionally and syntactically, except for a few (1-2) lines, which act as security measures. This imbalance between the lines needed for security measures and the functional code forces the supervised fine-tuned model to prioritize generating functional code without adding proper security measures, which also benefits the model by resulting in minimal loss. Therefore, in this work, for security hardening and strengthening of generated code from LLMs, we propose a reinforcement learning-based method for program-specific repair with the combination of semantic and syntactic reward mechanisms that focus heavily on adding security and functional measures in the code, respectively.

Submitted 30 January, 2024; v1 submitted 13 January, 2024; originally announced January 2024.

arXiv:2401.03374 [pdf, other]

LLM-Powered Code Vulnerability Repair with Reinforcement Learning and Semantic Reward

Authors: Nafis Tanveer Islam, Joseph Khoury, Andrew Seong, Mohammad Bahrami Karkevandi, Gonzalo De La Torre Parra, Elias Bou-Harb, Peyman Najafirad

Abstract: In software development, the predominant emphasis on functionality often supersedes security concerns, a trend gaining momentum with AI-driven automation tools like GitHub Copilot. These tools significantly improve developers' efficiency in functional code development. Nevertheless, it remains a notable concern that such tools are also responsible for creating insecure code, predominantly because of pre-training on publicly available repositories with vulnerable code. Moreover, developers are called the "weakest link in the chain" since they have very minimal knowledge of code security. Although existing solutions provide a reasonable solution to vulnerable code, they must adequately describe and educate the developers on code security to ensure that the security issues are not repeated. Therefore, we introduce a multipurpose code vulnerability analysis system, SecRepair, powered by a large language model, CodeGen2, assisting the developer in identifying and generating fixed code along with a complete description of the vulnerability with a code comment. Our innovative methodology uses a reinforcement learning paradigm to generate code comments augmented by a semantic reward mechanism. Inspired by how humans fix code issues, we propose an instruction-based dataset suitable for vulnerability analysis with LLMs. We further identify zero-day and N-day vulnerabilities in 6 Open Source IoT Operating Systems on GitHub. Our findings underscore that incorporating reinforcement learning coupled with semantic reward augments our model's performance, thereby fortifying its capacity to address code vulnerabilities with improved efficacy.

Submitted 21 February, 2024; v1 submitted 6 January, 2024; originally announced January 2024.

arXiv:2312.09123 [pdf, other]

MRL-PoS: A Multi-agent Reinforcement Learning based Proof of Stake Consensus Algorithm for Blockchain

Authors: Tariqul Islam, Faisal Haque Bappy, Tarannum Shaila Zaman, Md Sajidul Islam Sajid, Mir Mehedi Ahsan Pritom

Abstract: The core of a blockchain network is its consensus algorithm. Starting with the Proof-of-Work, there have been various versions of consensus algorithms, such as Proof-of-Stake (PoS), Proof-of-Authority (PoA), and Practical Byzantine Fault Tolerance (PBFT). Each of these algorithms focuses on different aspects to ensure efficient and reliable processing of transactions. Blockchain operates in a decentralized manner where there is no central authority and the network is composed of diverse users. This openness creates the potential for malicious nodes to disrupt the network in various ways. Therefore, it is crucial to embed a mechanism within the blockchain network to constantly monitor, identify, and eliminate these malicious nodes. However, there is no one-size-fits-all mechanism to identify all malicious nodes. Hence, the dynamic adaptability of the blockchain network is important to maintain security and reliability at all times. This paper introduces MRL-PoS, a Proof-of-Stake consensus algorithm based on multi-agent reinforcement learning. MRL-PoS employs reinforcement learning for dynamically adjusting to the behavior of all users. It incorporates a system of rewards and penalties to eliminate malicious nodes and incentivize honest ones. Additionally, MRL-PoS has the capability to learn and respond to new malicious tactics by continually training its agents.

Submitted 14 December, 2023; originally announced December 2023.

arXiv:2312.08309 [pdf, other]

FASTEN: Towards a FAult-tolerant and STorage EfficieNt Cloud: Balancing Between Replication and Deduplication

Authors: Sabbir Ahmed, Md Nahiduzzaman, Tariqul Islam, Faisal Haque Bappy, Tarannum Shaila Zaman, Raiful Hasan

Abstract: With the surge in cloud storage adoption, enterprises face challenges managing data duplication and exponential data growth. Deduplication mitigates redundancy, yet maintaining redundancy ensures high availability, incurring storage costs. Balancing these aspects is a significant research concern. We propose FASTEN, a distributed cloud storage scheme ensuring efficiency, security, and high availability. FASTEN achieves fault tolerance by dispersing data subsets optimally across servers and maintains redundancy for high availability. Experimental results show FASTEN's effectiveness in fault tolerance, cost reduction, batch auditing, and file and block-level deduplication. It outperforms existing systems with low time complexity, strong fault tolerance, and commendable deduplication performance.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2312.08305 [pdf, other]

ConChain: A Scheme for Contention-free and Attack Resilient BlockChain

Authors: Faisal Haque Bappy, Tariqul Islam, Tarannum Shaila Zaman, Md Sajidul Islam Sajid, Mir Mehedi Ahsan Pritom

Abstract: Although blockchains have become widely popular for their use in cryptocurrencies, they are now becoming pervasive as more traditional applications adopt blockchain to ensure data security. Despite being a secured network, blockchains have some tradeoffs such as high latency, low throughput, and transaction failures. One of the core problems behind these is improper management of "conflicting transactions", which is also known as "contention". When there is a large pool of pending transactions in a blockchain and some of them are conflicting, a situation of contention occurs; as a result, the latency of the network increases and a substantial amount of resources is wasted, which results in low throughput and transaction failures. In this paper, we propose ConChain, a novel blockchain scheme that combines transaction parallelism and an intelligent dependency manager to minimize conflicting transactions in blockchain networks as well as improve performance. ConChain is also capable of ensuring proper defense against major attacks due to contention.

Submitted 13 December, 2023; originally announced December 2023.

arXiv:2311.05435 [pdf]

Parkinson's Disease Detection through Vocal Biomarkers and Advanced Machine Learning Algorithms

Authors: Md Abu Sayed, Maliha Tayaba, MD Tanvir Islam, Md Eyasin Ul Islam Pavel, Md Tuhin Mia, Eftekhar Hossain Ayon, Nur Nob, Bishnu Padh Ghosh

Abstract: Parkinson's disease (PD) is a prevalent neurodegenerative disorder known for its impact on motor neurons, causing symptoms like tremors, stiffness, and gait difficulties. This study explores the potential of vocal feature alterations in PD patients as a means of early disease prediction. This research aims to predict the onset of Parkinson's disease. Utilizing a variety of advanced machine-learning algorithms, including XGBoost, LightGBM, Bagging, AdaBoost, and Support Vector Machine, among others, the study evaluates the predictive performance of these models using metrics such as accuracy, area under the curve (AUC), sensitivity, and specificity. The findings of this comprehensive analysis highlight LightGBM as the most effective model, achieving an impressive accuracy rate of 96% alongside a matching AUC of 96%. LightGBM exhibited a remarkable sensitivity of 100% and specificity of 94.43%, surpassing other machine learning algorithms in accuracy and AUC scores. Given the complexities of Parkinson's disease and its challenges in early diagnosis, this study underscores the significance of leveraging vocal biomarkers coupled with advanced machine-learning techniques for precise and timely PD detection.

Submitted 2 December, 2023; v1 submitted 9 November, 2023; originally announced November 2023.

arXiv:2311.03196 [pdf, other]

Pseudo-Labeling for Domain-Agnostic Bangla Automatic Speech Recognition

Authors: Rabindra Nath Nandi, Mehadi Hasan Menon, Tareq Al Muntasir, Sagor Sarker, Quazi Sarwar Muhtaseem, Md. Tariqul Islam, Shammur Absar Chowdhury, Firoj Alam

Abstract: One of the major challenges for developing automatic speech recognition (ASR) for low-resource languages is the limited access to labeled data with domain-specific variations. In this study, we propose a pseudo-labeling approach to develop a large-scale domain-agnostic ASR dataset. With the proposed methodology, we developed a 20k+ hours labeled Bangla speech dataset covering diverse topics, speaking styles, dialects, noisy environments, and conversational scenarios. We then exploited the developed corpus to design a conformer-based ASR system. We benchmarked the trained ASR with publicly available datasets and compared it with other available models. To investigate the efficacy, we designed and developed a human-annotated domain-agnostic test set composed of news, telephony, and conversational data among others. Our results demonstrate the efficacy of the model trained on pseudo-label data for the designed test set along with publicly available Bangla datasets. The experimental resources will be publicly available (https://github.com/hishab-nlp/Pseudo-Labeling-for-Domain-Agnostic-Bangla-ASR).

Submitted 6 November, 2023; originally announced November 2023.

Comments: Accepted at BLP-2023 (at EMNLP 2023), ASR, low-resource, out-of-distribution, domain-agnostic

MSC Class: 68T50 ACM Class: F.2.2; I.2.7

arXiv:2310.13654 [pdf]

An experimental study for early diagnosing Parkinson's disease using machine learning

Authors: Md. Taufiqul Haque Khan Tusar, Md. Touhidul Islam, Abul Hasnat Sakil

Abstract: One of the most catastrophic neurological disorders worldwide is Parkinson's Disease. Along with it, the treatment is complicated and abundantly expensive. The only effective action to control the progression is diagnosing it in the early stage. However, this is challenging because early detection necessitates a large and complex clinical study. This experimental work used Machine Learning techniques to automate the early detection of Parkinson's Disease from clinical characteristics, voice features and motor examination. In this study, we develop ML models utilizing a public dataset of 130 individuals, 30 of whom are untreated Parkinson's Disease patients, 50 of whom are Rapid Eye Movement Sleep Behaviour Disorder patients who are at a greater risk of contracting Parkinson's Disease, and 50 of whom are Healthy Controls. We use MinMax Scaler to rescale the data points, Local Outlier Factor to remove outliers, and SMOTE to balance existing class frequency. Afterwards, we apply a number of Machine Learning techniques. We implement the approaches in such a way that data leaking and overfitting are not possible. Finally, we obtain 100% accuracy in classifying PD and RBD patients, as well as 92% accuracy in classifying PD and HC individuals.

Submitted 20 October, 2023; originally announced October 2023.

Comments: 12 pages, 9 figures, 5 tables

arXiv:2310.13108 [pdf, other]

doi 10.1109/SIST58284.2023.10223507

Streamlining Brain Tumor Classification with Custom Transfer Learning in MRI Images

Authors: Javed Hossain, Md. Touhidul Islam, Md. Taufiqul Haque Khan Tusar

Abstract: Brain tumors are increasingly prevalent, characterized by the uncontrolled spread of aberrant tissues in the brain, with almost 700,000 new cases diagnosed globally each year. Magnetic Resonance Imaging (MRI) is commonly used for the diagnosis of brain tumors and accurate classification is a critical clinical procedure. In this study, we propose an efficient solution for classifying brain tumors from MRI images using custom transfer learning networks. While several researchers have employed various pre-trained architectures such as RESNET-50, ALEXNET, VGG-16, and VGG-19, these methods often suffer from high computational complexity. To address this issue, we present a custom and lightweight model using a Convolutional Neural Network-based pre-trained architecture with reduced complexity. Specifically, we employ the VGG-19 architecture with additional hidden layers, which reduces the complexity of the base architecture but improves computational efficiency. The objective is to achieve high classification accuracy using a novel approach. Finally, the result demonstrates a classification accuracy of 96.42%.

Submitted 19 October, 2023; originally announced October 2023.

Comments: 6 pages, 9 figures, 4 tables

Journal ref: 2023 IEEE International Conference on Smart Information Systems and Technologies (SIST), Astana, Kazakhstan, 2023, pp. 522-526

arXiv:2310.00106 [pdf, other]

FashionFlow: Leveraging Diffusion Models for Dynamic Fashion Video Synthesis from Static Imagery

Authors: Tasin Islam, Alina Miron, XiaoHui Liu, Yongmin Li

Abstract: Our study introduces a new image-to-video generator called FashionFlow to generate fashion videos. By utilising a diffusion model, we are able to create short videos from still fashion images. Our approach involves developing and connecting relevant components with the diffusion model, which results in the creation of high-fidelity videos that are aligned with the conditional image. The components include the use of pseudo-3D convolutional layers to generate videos efficiently. VAE and CLIP encoders capture vital characteristics from still images to condition the diffusion model at a global level. Our research demonstrates a successful synthesis of fashion videos featuring models posing from various angles, showcasing the fit and appearance of the garment. Our findings hold great promise for improving and enhancing the shopping experience for the online fashion industry.

Submitted 20 January, 2024; v1 submitted 29 September, 2023; originally announced October 2023.

arXiv:2309.13147 [pdf, other]

Cardiovascular Disease Risk Prediction via Social Media

Authors: Al Zadid Sultan Bin Habib, Md Asif Bin Syed, Md Tanvirul Islam, Donald A. Adjeroh

Abstract: Researchers use Twitter and sentiment analysis to predict Cardiovascular Disease (CVD) risk. We developed a new dictionary of CVD-related keywords by analyzing emotions expressed in tweets. Tweets from eighteen US states, including the Appalachian region, were collected. Using the VADER model for sentiment analysis, users were classified as potentially at CVD risk. Machine Learning (ML) models were employed to classify individuals' CVD risk and applied to a CDC dataset with demographic information to make the comparison. Performance evaluation metrics such as Test Accuracy, Precision, Recall, F1 score, Matthews Correlation Coefficient (MCC), and Cohen's Kappa (CK) score were considered. Results demonstrated that analyzing tweets' emotions surpassed the predictive power of demographic data alone, enabling the identification of individuals at potential risk of developing CVD. This research highlights the potential of Natural Language Processing (NLP) and ML techniques in using tweets to identify individuals with CVD risks, providing an alternative approach to traditional demographic information for public health monitoring.

Submitted 28 September, 2023; v1 submitted 22 September, 2023; originally announced September 2023.

Comments: 9 pages, 3 figures, 16th International Conference on Social Computing, Behavioral-Cultural Modeling & Prediction and Behavior Representation in Modeling and Simulation (SBP-BRiMS 2023)

arXiv:2309.05227 [pdf, other]

Detecting Natural Language Biases with Prompt-based Learning

Authors: Md Abdul Aowal, Maliha T Islam, Priyanka Mary Mammen, Sandesh Shetty

Abstract: In this project, we want to explore the newly emerging field of prompt engineering and apply it to the downstream task of detecting LM biases. More concretely, we explore how to design prompts that can indicate 4 different types of biases: (1) gender, (2) race, (3) sexual orientation, and (4) religion-based. Within our project, we experiment with different manually crafted prompts that can draw out the subtle biases that may be present in the language model. We apply these prompts to multiple variations of popular and well-recognized models: BERT, RoBERTa, and T5 to evaluate their biases. We provide a comparative analysis of these models and assess them using a two-fold method: use human judgment to decide whether model predictions are biased and utilize model-level judgment (through further prompts) to understand if a model can self-diagnose the biases of its own prediction.

Submitted 11 September, 2023; originally announced September 2023.

arXiv:2308.16734 [pdf]

doi 10.1109/MOBILSoft59058.2023.00013

Native vs Web Apps: Comparing the Energy Consumption and Performance of Android Apps and their Web Counterparts

Authors: Ruben Horn, Abdellah Lahnaoui, Edgardo Reinoso, Sicheng Peng, Vadim Isakov, Tanjina Islam, Ivano Malavolta

Abstract: Context. Many Internet content platforms, such as Spotify and YouTube, provide their services via both native and Web apps. Even though those apps provide similar features to the end user, using their native version or Web counterpart might lead to different levels of energy consumption and performance. Goal. The goal of this study is to empirically assess the energy consumption and performance of native and Web apps in the context of Internet content platforms on Android. Method. We select 10 Internet content platforms across 5 categories. Then, we measure them based on the energy consumption, network traffic volume, CPU load, memory load, and frame time of their native and Web versions; then, we statistically analyze the collected measures and report our results. Results. We confirm that native apps consume significantly less energy than their Web counterparts, with large effect size. Web apps use more CPU and memory, with statistically significant difference and large effect size. Therefore, we conclude that native apps tend to require fewer hardware resources than their corresponding Web versions. The network traffic volume exhibits statistically significant difference in favour of native apps, with small effect size. Our results do not allow us to draw any conclusion in terms of frame time. Conclusions. Based on our results, we advise users to access Internet contents using native apps over Web apps, when possible. Also, the results of this study motivate further research on the optimization of the usage of runtime resources of mobile Web apps and Android browsers.

Submitted 31 August, 2023; originally announced August 2023.

arXiv:2308.06549 [pdf, other]

Human Behavior-based Personalized Meal Recommendation and Menu Planning Social System

Authors: Tanvir Islam, Anika Rahman Joyita, Md. Golam Rabiul Alam, Mohammad Mehedi Hassan, Md. Rafiul Hassan, Raffaele Gravina

Abstract: Traditional dietary recommendation systems are basically nutrition- or health-aware, and human feelings about food are ignored. Human affects vary when it comes to food cravings, and not all foods are appealing in all moods. A questionnaire-based and preference-aware meal recommendation system can be a solution. However, automated recognition of social affects on different foods and planning the menu considering nutritional demand and social affect has some significant benefits over questionnaire-based and preference-aware meal recommendations. A patient with severe illness, a person in a coma, or patients with locked-in syndrome and amyotrophic lateral sclerosis (ALS) cannot express their meal preferences. Therefore, the proposed framework includes a social-affective computing module to recognize the affects of different meals, where the person's affect is detected using electroencephalography (EEG) signals. EEG captures brain signals, which are analyzed to anticipate affect toward a food. In this study, we have used a 14-channel wireless Emotiv EPOC+ to measure affectivity for different food items. A hierarchical ensemble method is applied to predict affectivity upon multiple feature extraction methods, and TOPSIS (Technique for Order of Preference by Similarity to Ideal Solution) is used to generate a food list based on the predicted affectivity. In addition to the meal recommendation, an automated menu planning approach is also proposed considering a person's energy intake requirement, affectivity, and nutritional values of the different menus. The bin-packing algorithm is used for the personalized menu planning of breakfast, lunch, dinner, and snacks. The experimental findings reveal that the suggested affective computing, meal recommendation, and menu planning algorithms perform well across a variety of assessment parameters.

Submitted 12 August, 2023; originally announced August 2023.

Journal ref: IEEE Transactions on Computational Social Systems, 2022

arXiv:2308.04453 [pdf, other]

Towards Immutability: A Secure and Efficient Auditing Framework for Cloud Supporting Data Integrity and File Version Control

Authors: Faisal Haque Bappy, Saklain Zaman, Tariqul Islam, Redwan Ahmed Rizvee, Joon S. Park, Kamrul Hasan

Abstract: Although wide-scale integration of cloud services with myriad applications increases quality of service (QoS) for enterprise users, verifying the existence and manipulation of stored cloud information remains an open research problem. Decentralized blockchain-based solutions are becoming more appealing for cloud auditing environments because of the immutable nature of blockchain. However, the decentralized structure of blockchain results in considerable synchronization and communication overhead, which increases maintenance costs for cloud service providers (CSP). This paper proposes a Merkle Hash Tree based architecture named Entangled Merkle Forest to support version control and dynamic auditing of information in centralized cloud environments. We utilized a semi-trusted third-party auditor to conduct the auditing tasks with minimal privacy-preserving file metadata. To the best of our knowledge, we are the first to design a node-sharing Merkle Forest to offer a cost-effective auditing framework for centralized cloud infrastructures while achieving the immutable feature of blockchain, mitigating the synchronization and performance challenges of the decentralized architectures. Our proposed scheme outperforms its equivalent blockchain-based schemes by ensuring time and storage efficiency with minimum overhead, as evidenced by performance analysis.

Submitted 4 August, 2023; originally announced August 2023.
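
Illustrative sketch (not from the paper): the abstract above builds on the Merkle Hash Tree. The Python below shows only the generic building block, computing a binary Merkle root over file blocks so that any change to a block changes the root. It does not implement the Entangled Merkle Forest or its node sharing. The function names, the leaf/inner-node prefix bytes, and the rule of duplicating the last node on odd-sized levels are all assumptions chosen for illustration.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def merkle_root(blocks):
    """Root hash of a binary Merkle tree over a list of byte blocks.

    Assumed conventions (not taken from the paper): leaves are hashed
    with a 0x00 prefix, inner nodes with a 0x01 prefix, and the last
    node of an odd-sized level is duplicated.
    """
    if not blocks:
        raise ValueError("need at least one block")
    level = [sha256(b"\x00" + block) for block in blocks]  # leaf hashes
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # pad odd levels by duplication
        level = [
            sha256(b"\x01" + level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Two hypothetical versions of a file that differ in one block.
version_1 = [b"block-a", b"block-b", b"block-c"]
version_2 = [b"block-a", b"block-B", b"block-c"]
roots_differ = merkle_root(version_1) != merkle_root(version_2)
print("roots differ:", roots_differ)
```

An auditor holding only the root can detect that a stored version changed; how the paper shares nodes across versions is not specified in the abstract and is not modeled here.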

arXiv:2308.04452 [pdf, other]

doi 10.1109/CSCloud-EdgeCom58631.2023.00053

Quarks: A Secure and Decentralized Blockchain-Based Messaging Network

Authors: Mirza Kamrul Bashar Shuhan, Tariqul Islam, Enam Ahmed Shuvo, Faisal Haque Bappy, Kamrul Hasan, Carlos Caicedo

Abstract: In the last two decades, messaging systems have gained widespread popularity both in the enterprise and consumer sectors. Many of these systems use secure protocols like end-to-end encryption to ensure strong security in one-to-one communication. However, the majority of them rely on centralized servers, which allows them to use their users' personal data. It also allows governments to track and regulate their citizens' activities, which poses significant threats to "digital freedom". Moreover, these systems have failed to achieve security attributes like confidentiality, integrity, and privacy for group communications. In this paper, we present a novel blockchain-based secure messaging system named Quarks that overcomes the security pitfalls of the existing systems and eliminates centralized control. We have analyzed our architecture with security models to demonstrate the system's reliability and usability. We have developed a Proof of Concept (PoC) of the Quarks system leveraging Distributed Ledger Technology (DLT) and conducted load testing on it. We noticed that our PoC system achieves all the desired attributes that are prevalent in a traditional centralized messaging scheme despite the limited capacity of the development and testing environment. Therefore, this assures us of the applicability of such systems in the near future if scaled up properly.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2308.02731 [pdf, other]

Personalization of Stress Mobile Sensing using Self-Supervised Learning

Authors: Tanvir Islam, Peter Washington

Abstract: Stress is widely recognized as a major contributor to a variety of health issues. Stress prediction using biosignal data recorded by wearables is a key area of study in mobile sensing research because real-time stress prediction can enable digital interventions to immediately react at the onset of stress, helping to avoid many psychological and physiological symptoms such as heart rhythm irregularities. Electrodermal activity (EDA) is often used to measure stress. However, major challenges with the prediction of stress using machine learning include the subjectivity and sparseness of the labels, a large feature space, relatively few labels, and a complex nonlinear and subjective relationship between the features and outcomes. To tackle these issues, we examine the use of model personalization: training a separate stress prediction model for each user. To allow the neural network to learn the temporal dynamics of each individual's baseline biosignal patterns, thus enabling personalization with very few labels, we pre-train a 1-dimensional convolutional neural network (CNN) using self-supervised learning (SSL). We evaluate our method using the Wearable Stress and Affect Detection (WESAD) dataset. We fine-tune the pre-trained networks to the stress prediction task and compare against equivalent models without any self-supervised pre-training.
We discover that embeddings learned using our pre-training method outperform supervised baselines with significantly fewer labeled data points: the models trained with SSL require less than 30% of the labels to reach equivalent performance without personalized SSL. This personalized learning method can enable precision health systems which are tailored to each subject and require few annotations by the end user, thus allowing for the mobile sensing of increasingly complex, heterogeneous, and subjective outcomes such as stress.

Submitted 4 August, 2023; originally announced August 2023.

Journal ref: BMC Medical Informatics and Decision Making. 2023

arXiv:2308.02358 [pdf, other]

A Deep Dive into the Google Cluster Workload Traces: Analyzing the Application Failure Characteristics and User Behaviors

Authors: Faisal Haque Bappy, Tariqul Islam, Tarannum Shaila Zaman, Raiful Hasan, Carlos Caicedo

Abstract: Large-scale cloud data centers have gained popularity due to their high availability, rapid elasticity, scalability, and low cost. However, current data centers continue to have high failure rates due to the lack of proper resource utilization and early failure detection. To maximize resource efficiency and reduce failure rates in large-scale cloud data centers, it is crucial to understand the workload and failure characteristics. In this paper, we perform a deep analysis of the 2019 Google Cluster Trace Dataset, which contains 2.4TiB of workload traces from eight different clusters around the world. We explore the characteristics of failed and killed jobs in Google's production cloud and attempt to correlate them with key attributes such as resource usage, job priority, scheduling class, job duration, and the number of task resubmissions. Our analysis reveals several important characteristics of failed jobs that contribute to job failure and hence, could be used for developing an early failure prediction system. Also, we present a novel usage analysis to identify heterogeneity in jobs and tasks submitted by users. We are able to identify specific users who control more than half of all collection events on a single cluster. We contend that these characteristics could be useful in developing an early job failure prediction system that could be utilized for dynamic rescheduling of the job scheduler and thus improving resource utilization in large-scale cloud data centers while reducing failure rates.

Submitted 4 August, 2023; originally announced August 2023.

arXiv:2307.03337 [pdf, other]

Personalized Prediction of Recurrent Stress Events Using Self-Supervised Learning on Multimodal Time-Series Data

Authors: Tanvir Islam, Peter Washington

Abstract: Chronic stress can significantly affect physical and mental health. The advent of wearable technology allows for the tracking of physiological signals, potentially leading to innovative stress prediction and intervention methods. However, challenges such as label scarcity and data heterogeneity render stress prediction difficult in practice. To counter these issues, we have developed a multimodal personalized stress prediction system using wearable biosignal data. We employ self-supervised learning (SSL) to pre-train the models on each subject's data, allowing the models to learn the baseline dynamics of the participant's biosignals prior to fine-tuning the stress prediction task. We test our model on the Wearable Stress and Affect Detection (WESAD) dataset, demonstrating that our SSL models outperform non-SSL models while utilizing less than 5% of the annotations. These results suggest that our approach can personalize stress prediction to each user with minimal annotations. This paradigm has the potential to enable personalized prediction of a variety of recurring health events using complex multimodal data streams.

Submitted 6 July, 2023; originally announced July 2023.

Journal ref: International Conference on Machine Learning (ICML), Honolulu, Hawaii, USA. 2023

arXiv:2306.07997 [pdf]

Machine Learning Approach on Multiclass Classification of Internet Firewall Log Files

Authors: Md Habibur Rahman, Taminul Islam, Md Masum Rana, Rehnuma Tasnim, Tanzina Rahman Mona, Md. Mamun Sakib

Abstract: Firewalls are critical components in securing communication networks by screening all incoming (and occasionally exiting) data packets. Filtering is carried out by comparing incoming data packets to a set of rules designed to prevent malicious code from entering the network. To regulate the flow of data packets entering and leaving a network, an Internet firewall keeps track of all activity. While the primary function of log files is to aid in troubleshooting and diagnostics, the information they contain is also very relevant to system audits and forensics. A firewall's primary function is to prevent malicious data packets from being sent. In order to better defend against cyberattacks and understand when and how malicious actions are influencing the internet, it is necessary to examine log files. As a result, the firewall decides whether to 'allow,' 'deny,' 'drop,' or 'reset-both' the incoming and outgoing packets. In this research, we apply various categorization algorithms to make sense of data logged by a firewall device. Harmonic mean F1 score, recall, and sensitivity measurement data with a 99% accuracy score in the random forest technique are used to compare the classifier's performance. To be sure, the proposed characteristics did significantly contribute to enhancing the firewall classification rate, as seen by the high accuracy rates generated by the other methods.

Submitted 12 June, 2023; originally announced June 2023.

Comments: Accepted and presented in International Conference on Computational Intelligence and Sustainable Engineering (CISES-2023), 2022, 7 pages, 13 figures

arXiv:2306.06206 [pdf]

PotatoPestNet: A CTInceptionV3-RS-Based Neural Network for Accurate Identification of Potato Pests

Authors: Md. Simul Hasan Talukder, Rejwan Bin Sulaiman, Mohammad Raziuddin Chowdhury, Musarrat Saberin Nipun, Taminul Islam

Abstract: Potatoes are the third-largest food crop globally, but their production frequently encounters difficulties because of aggressive pest infestations. The aim of this study is to investigate the various types and characteristics of these pests and propose an efficient PotatoPestNet AI-based automatic potato pest identification system. To accomplish this, we curated a reliable dataset consisting of eight types of potato pests. We leveraged the power of transfer learning by employing five customized, pre-trained transfer learning models: CMobileNetV2, CNASLargeNet, CXception, CDenseNet201, and CInceptionV3, in proposing a robust PotatoPestNet model to accurately classify potato pests. To improve the models' performance, we applied various augmentation techniques, incorporated a global average pooling layer, and implemented proper regularization methods. To further enhance the performance of the models, we utilized random search (RS) optimization for hyperparameter tuning. This optimization technique played a significant role in fine-tuning the models and achieving improved performance. We evaluated the models both visually and quantitatively, utilizing different evaluation metrics. The robustness of the models in handling imbalanced datasets was assessed using the Receiver Operating Characteristic (ROC) curve. Among the models, the Customized Tuned Inception V3 (CTInceptionV3) model, optimized through random search, demonstrated outstanding performance.
It achieved the highest accuracy (91%), precision (91%), recall (91%), and F1-score (91%), showcasing its superior ability to accurately identify and classify potato pests.

Submitted 15 July, 2023; v1 submitted 27 May, 2023; originally announced June 2023.

arXiv:2305.12725 [pdf, other]

Quantum Key Distribution with Minimal Qubit Transmission Based on MultiQubit Greenberger Horne Zeilinger State

Authors: Tasdiqul Islam, Engin Arslan

Abstract: Conventional Quantum Key Distribution (QKD) requires the transmission of multiple qubits equivalent to the length of the key. As quantum networks are still in their infancy, they are expected to have limited capacity, so requiring many qubit transmissions for QKD might limit the effective use of their limited network bandwidth. To address this challenge and enhance the practicality of QKD, we propose a Multi-Qubit Greenberger Horne Zeilinger (GHZ) State-based QKD scheme that requires a small number of qubit transmissions. The proposed method transmits one qubit between endpoints and reuses it for the transmission of multiple classical bits with the help of quantum nondemolition (QND) measurements. We show that one can transfer L-1 classical bits by generating an L-qubit GHZ state and transferring one qubit to the remote party. We further show that the proposed QKD algorithm can be extended to enable multi-party QKD. It can also support QKD between parties with minimal quantum resources. As a result, the proposed scheme offers a quantum network-efficient alternative for QKD.

Submitted 22 May, 2023; originally announced May 2023.
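
Note on notation (added for illustration, not from the abstract): the L-qubit GHZ state the abstract refers to has the standard textbook form below. Only the state definition is given; the paper's protocol for extracting L-1 classical bits with QND measurements is not described in the abstract and is not reconstructed here.

```latex
\[
  |\mathrm{GHZ}_L\rangle
  = \frac{1}{\sqrt{2}}\left( |0\rangle^{\otimes L} + |1\rangle^{\otimes L} \right)
\]
```

For L = 3 this is the familiar state (|000> + |111>)/sqrt(2).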

arXiv:2305.09855 [pdf, other]

Scalable Quantum Repeater Deployment Modeling

Authors: Tasdiqul Islam, Engin Arslan

Abstract: Long-distance quantum communication presents a significant challenge as maintaining the fidelity of qubits can be difficult. This issue can be addressed through the use of quantum repeaters to transmit entanglement information through Bell measurements. However, although quantum repeaters are necessary to enable a wide-area quantum internet, their deployment cost can be prohibitive; it is therefore important to develop a quantum repeater deployment model that can strike a balance between cost and effectiveness. In this work, we present novel heuristic models to quickly determine a minimum number of quantum repeaters to deploy in large-scale networks to provide end-to-end connectivity between all end hosts. The results show that, compared to the linear programming approach, the heuristic methods can find near-optimal solutions while reducing the execution time from days to seconds when evaluated against several synthetic and real-world networks such as SURFnet and ESnet. As reliability is key for any network, we also demonstrate that the heuristic method can determine deployment models that can endure up to two link/node failures.

Submitted 16 May, 2023; originally announced May 2023.

arXiv:2305.06174 [pdf, other]

Analysis of Climate Campaigns on Social Media using Bayesian Model Averaging

Authors: Tunazzina Islam, Ruqi Zhang, Dan Goldwasser

Abstract: Climate change is the defining issue of our time, and we are at a defining moment. Various interest groups, social movement organizations, and individuals engage in collective action on this issue on social media. In addition, issue advocacy campaigns on social media often arise in response to ongoing societal concerns, especially those faced by energy industries. Our goal in this paper is to analyze how those industries, their advocacy groups, and climate advocacy groups use social media to influence the narrative on climate change. In this work, we propose a minimally supervised model soup [57] approach combined with messaging themes to identify the stances of climate ads on Facebook. Finally, we release our stance dataset, model, and set of themes related to climate campaigns for future work on opinion mining and the automatic detection of climate change stances.

Submitted 30 June, 2023; v1 submitted 6 May, 2023; originally announced May 2023.

Comments: Accepted as a long paper at 6th AAAI/ACM Conference on AI, Ethics, and Society 2023 (AIES-2023). Updated for camera-ready

arXiv:2305.05094 [pdf, other]

Interactive Concept Learning for Uncovering Latent Themes in Large Text Collections

Authors: Maria Leonor Pacheco, Tunazzina Islam, Lyle Ungar, Ming Yin, Dan Goldwasser

Abstract: Experts across diverse disciplines are often interested in making sense of large text collections. Traditionally, this challenge is approached either by noisy unsupervised techniques such as topic models, or by following a manual theme discovery process. In this paper, we expand the definition of a theme to account for more than just a word distribution, and include generalized concepts deemed relevant by domain experts. Then, we propose an interactive framework that receives and encodes expert feedback at different levels of abstraction. Our framework strikes a balance between automation and manual coding, allowing experts to maintain control of their study while reducing the manual effort required.

Submitted 8 May, 2023; originally announced May 2023.

Comments: Accepted to Findings of ACL: ACL 2023

arXiv:2304.11072 [pdf, other]

doi 10.1109/EuroSP57164.2023.00018

An Unbiased Transformer Source Code Learning with Semantic Vulnerability Graph

Authors: Nafis Tanveer Islam, Gonzalo De La Torre Parra, Dylan Manuel, Elias Bou-Harb, Peyman Najafirad

Abstract: Over the years, open-source software systems have become prey to threat actors. Even as open-source communities act quickly to patch the breach, code vulnerability screening should be an integral part of agile software development from the beginning. Unfortunately, current vulnerability screening techniques are ineffective at identifying novel vulnerabilities or providing developers with code vulnerability and classification. Furthermore, the datasets used for vulnerability learning often exhibit distribution shifts from the real-world testing distribution due to novel attack strategies deployed by adversaries and as a result, the machine learning model's performance may be hindered or biased. To address these issues, we propose a joint interpolated multitasked unbiased vulnerability classifier comprising a transformer "RoBERTa" and graph convolution neural network (GCN). We present a training process utilizing a semantic vulnerability graph (SVG) representation from source code, created by integrating edges from a sequential flow, control flow, and data flow, as well as a novel flow dubbed Poacher Flow (PF). Poacher flow edges reduce the gap between dynamic and static program analysis and handle complex long-range dependencies. Moreover, our approach reduces biases of classifiers regarding unbalanced datasets by integrating the Focal Loss objective function along with SVG. Remarkably, experimental results show that our classifier outperforms state-of-the-art results on vulnerability detection with fewer false negatives and false positives.
After testing our model across multiple datasets, it shows an improvement of at least 2.41%, and 18.75% in the best-case scenario. Evaluations using N-day program samples demonstrate that our proposed approach achieves 93% accuracy and was able to detect 4 zero-day vulnerabilities from popular GitHub repositories.

Submitted 17 April, 2023; originally announced April 2023.
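
Illustrative sketch (not from the paper): the abstract above mentions the Focal Loss objective as a way to reduce bias on unbalanced datasets. The Python below is a minimal sketch of the standard binary focal loss, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), for a single prediction. The function name and the defaults gamma=2.0 and alpha=0.25 are assumptions (commonly used values), not settings reported in this abstract, and the paper's actual multitask training setup is not reproduced.

```python
import math


def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Standard binary focal loss for one example.

    p     : predicted probability of the positive class, in (0, 1)
    y     : true label, 0 or 1
    gamma : focusing parameter (assumed default); 0 removes the focusing term
    alpha : weight of the positive class (assumed default)
    """
    eps = 1e-12
    p = min(max(p, eps), 1.0 - eps)            # clamp to avoid log(0)
    p_t = p if y == 1 else 1.0 - p             # probability of the true class
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# A confidently correct example contributes far less than a badly wrong one.
easy = binary_focal_loss(0.95, 1)
hard = binary_focal_loss(0.10, 1)
print("easy < hard:", easy < hard)
```

With gamma set to 0 the expression reduces to class-weighted cross-entropy, which is one way to see that the (1 - p_t)^gamma factor is what down-weights easy, well-classified examples.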

arXiv:2304.09916 [pdf, other]

doi 10.1109/PERCOM56429.2023.10099081

An Intent-based Framework for Vehicular Edge Computing

Authors: TianZhang He, Adel N. Toosi, Negin Akbari, Muhammed Tawfiqul Islam, Muhammad Aamir Cheema

Abstract: The rapid development of emerging vehicular edge computing (VEC) brings new opportunities and challenges for dynamic resource management. The increasing number of edge data centers, roadside units (RSUs), and network devices, however, makes resource management a complex task in VEC. On the other hand, the exponential growth of service applications and end-users makes corresponding QoS hard to maintain. Intent-Based Networking (IBN), based on Software-Defined Networking, was introduced to provide the ability to automatically handle and manage the networking requirements of different applications. Motivated by the IBN concept, in this paper, we propose a novel approach to jointly orchestrate networking and computing resources based on user requirements. The proposed solution constantly monitors user requirements and dynamically re-configures the system to satisfy desired states of the application. We compared our proposed solution with the state-of-the-art networking embedding algorithms using real-world taxi GPS traces. Results show that our proposed method is significantly faster (up to 95%) and can improve resource utilization (up to 76%) and the acceptance ratio of computing and networking requests with various priorities (up to 71%). We also present a small-scale prototype of the proposed intent management framework to validate our solution.

Submitted 19 April, 2023; originally announced April 2023.

Comments: accepted by PerCom 2023, 10 pages, 12 figures

Journal ref: 2023 IEEE International Conference on Pervasive Computing and Communications (PerCom), Atlanta, GA, USA, 2023, pp. 121-130

arXiv:2303.09645 [pdf]

Development of a Voice Controlled Robotic Arm

Authors: Akkas U. Haque, Humayun Kabir, S. C. Banik, M. T. Islam

Abstract: This paper describes a robotic arm with 5 degrees of freedom (DOF) which is controlled by human voice and has been developed in the Mechatronics Laboratory, CUET. The robotic arm is interfaced with a PC by serial communication (RS-232). The user's voice command is captured by a microphone and processed by software built with Microsoft Visual Studio. The specific signal obtained by signal processing is then sent to the control unit. The main control unit used in the robotic arm is a PIC18F452 microcontroller. The control unit then drives the actuators (Hitec HS-422, HS-81) according to the signal or signals to give the required motion of the robotic arm. At present the robotic arm can perform set actions like pick & pull, gripping, holding & releasing, and some extra functions like dance-like movement, and can turn according to the voice commands.

Submitted 16 March, 2023; originally announced March 2023.

arXiv:2301.10174 [pdf]

doi 10.1109/I2CT54291.2022.9825052

Analysis of Arrhythmia Classification on ECG Dataset

Authors: Taminul Islam, Arindom Kundu, Tanzim Ahmed, Nazmul Islam Khan

Abstract: The heart is one of the most vital organs in the human body. It supplies blood and nutrients to other parts of the body. Therefore, maintaining a healthy heart is essential. As a heart disorder, arrhythmia is a condition in which the heart's pumping mechanism becomes aberrant. The electrocardiogram (ECG) is used to analyze arrhythmia because it is simple and inexpensive. The heart peaks shown in the ECG graph are used to detect heart diseases, and the R peak is used to analyze arrhythmia. For detection, arrhythmia is grouped into two groups: tachycardia and bradycardia. In this paper, we discuss many different techniques, such as Deep CNNs, LSTM, SVM, NN classifier, Wavelet, TQWT, etc., that have been used for detecting arrhythmia using various datasets throughout the previous decade. This work presents an analysis of arrhythmia classification on ECG datasets. Data preprocessing, feature extraction, and classification were applied in most of the research works and achieved better performance for classifying ECG signals to detect arrhythmia. Automatic arrhythmia detection can help cardiologists make the right decisions immediately to save human life. In addition, this research presents limitations of previous research and some challenges in detecting arrhythmia that will help future research.

Submitted 10 January, 2023; originally announced January 2023.

Comments: 6 pages, 5 figures. This paper has been published to 2022 proceedings of IEEE 7th International conference for Convergence in Technology (I2CT), 07-09 April 2022, Mumbai, India

Journal ref: In 2022 IEEE 7th International conference for Convergence in Technology (I2CT) (pp. 1-6). IEEE

arXiv:2212.13261 [pdf, other]

Explainable AI for Bioinformatics: Methods, Tools, and Applications

Authors: Md. Rezaul Karim, Tanhim Islam, Oya Beyan, Christoph Lange, Michael Cochez, Dietrich Rebholz-Schuhmann, Stefan Decker

Abstract: Artificial intelligence (AI) systems utilizing deep neural networks (DNNs) and machine learning (ML) algorithms are widely used for solving important problems in bioinformatics, biomedical informatics, and precision medicine. However, complex DNNs or ML models, which are often perceived as opaque and black-box, can make it difficult to understand the reasoning behind their decisions. This lack of transparency can be a challenge for both end-users and decision-makers, as well as AI developers. Additionally, in sensitive areas like healthcare, explainability and accountability are not only desirable but also legally required for AI systems that can have a significant impact on human lives. Fairness is another growing concern, as algorithmic decisions should not show bias or discrimination towards certain groups or individuals based on sensitive attributes. Explainable artificial intelligence (XAI) aims to overcome the opaqueness of black-box models and provide transparency in how AI systems make decisions. Interpretable ML models can explain how they make predictions and the factors that influence their outcomes. However, most state-of-the-art interpretable ML methods are domain-agnostic and evolved from fields like computer vision, automated reasoning, or statistics, making direct application to bioinformatics problems challenging without customization and domain-specific adaptation.
In this paper, we discuss the importance of explainability in the context of bioinformatics, provide an overview of model-specific and model-agnostic interpretable ML methods and tools, and outline their potential caveats and drawbacks. We also discuss how to customize existing interpretable ML methods for bioinformatics problems. Finally, we demonstrate how XAI methods can improve transparency through case studies in bioimaging, cancer genomics, and text mining.

Submitted 23 February, 2023; v1 submitted 25 December, 2022; originally announced December 2022.

arXiv:2210.10669

doi 10.1609/icwsm.v17i1.22156

Weakly Supervised Learning for Analyzing Political Campaigns on Facebook

Authors: Tunazzina Islam, Shamik Roy, Dan Goldwasser

Abstract: Social media platforms are currently the main channel for political messaging, allowing politicians to target specific demographics and adapt based on their reactions. However, making this communication transparent is challenging, as the messaging is tightly coupled with its intended audience and often echoed by multiple stakeholders interested in advancing specific policies. Our goal in this paper is to take a first step towards understanding these highly decentralized settings. We propose a weakly supervised approach to identify the stance and issue of political ads on Facebook and analyze how political campaigns use some kind of demographic targeting by location, gender, or age. Furthermore, we analyze the temporal dynamics of the political ads on election polls.

Submitted 9 May, 2023; v1 submitted 19 October, 2022; originally announced October 2022.

Comments: accepted at 17th International AAAI Conference on Web and Social Media (ICWSM-2023), 12 pages

arXiv:2210.10031

doi 10.1109/BigData55660.2022.10021123

Understanding COVID-19 Vaccine Campaign on Facebook using Minimal Supervision

Authors: Tunazzina Islam, Dan Goldwasser

Abstract: In the age of social media, where billions of internet users share information and opinions, the negative impact of pandemics is not limited to the physical world. It provokes a surge of incomplete, biased, and incorrect information, also known as an infodemic. This global infodemic jeopardizes measures to control the pandemic by creating panic, vaccine hesitancy, and fragmented social response. Platforms like Facebook allow advertisers to adapt their messaging to target different demographics and help alleviate or exacerbate the infodemic problem depending on their content. In this paper, we propose a minimally supervised multi-task learning framework for understanding messaging on Facebook related to the COVID vaccine by identifying ad themes and moral foundations. Furthermore, we perform a more nuanced thematic analysis of messaging tactics of vaccine campaigns on social media so that policymakers can make better decisions on pandemic control.

Submitted 16 November, 2022; v1 submitted 18 October, 2022; originally announced October 2022.

Comments: Accepted as a regular paper at 2022 IEEE International Conference on Big Data (IEEE BigData 2022). Also accepted at the NLP for Positive Impact (NLP4PI) workshop@EMNLP 2022

arXiv:2208.07655

A Hybrid Deep Feature-Based Deformable Image Registration Method for Pathology Images

Authors: Chulong Zhang, Yuming Jiang, Na Li, Zhicheng Zhang, Md Tauhidul Islam, Jingjing Dai, Lin Liu, Wenfeng He, Wenjian Qin, Jing Xiong, Yaoqin Xie, Xiaokun Liang

Abstract: Pathologists need to combine information from differently stained pathology slices for accurate diagnosis. Deformable image registration is a necessary technique for fusing multi-modal pathology slices. This paper proposes a hybrid deep feature-based deformable image registration framework for stained pathology samples. We first extract dense feature points via the detector-based and detector-free deep learning feature networks and perform points matching. Then, to further reduce false matches, an outlier detection method combining the isolation forest statistical model and the local affine correction model is proposed. Finally, the interpolation method generates the deformable vector field for pathology image registration based on the above matching points. We evaluate our method on the dataset of the Non-rigid Histology Image Registration (ANHIR) challenge, which is co-organized with the IEEE ISBI 2019 conference. Our technique outperforms the traditional approaches by 17% with the Average-Average registration target error (rTRE) reaching 0.0034. The proposed method achieved state-of-the-art performance and ranked 1st in evaluating the test dataset. The proposed hybrid deep feature-based registration method can potentially become a reliable method for pathology image registration.

Submitted 10 April, 2023; v1 submitted 16 August, 2022; originally announced August 2022.

Comments: 22 pages, 12 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Showing 1–50 of 106 results for author: Islam, T