-
SONICS: Synthetic Or Not -- Identifying Counterfeit Songs
Authors:
Md Awsafur Rahman,
Zaber Ibn Abdul Hakim,
Najibul Haque Sarker,
Bishmoy Paul,
Shaikh Anowarul Fattah
Abstract:
The recent surge in AI-generated songs presents exciting possibilities and challenges. While these tools democratize music creation, they also necessitate the ability to distinguish between human-composed and AI-generated songs for safeguarding artistic integrity and content curation. Existing research and datasets in fake song detection only focus on singing voice deepfake detection (SVDD), where the vocals are AI-generated but the instrumental music is sourced from real songs. However, this approach is inadequate for contemporary end-to-end AI-generated songs where all components (vocals, lyrics, music, and style) could be AI-generated. Additionally, existing datasets lack lyrics-music diversity, long-duration songs, and open fake songs. To address these gaps, we introduce SONICS, a novel dataset for end-to-end Synthetic Song Detection (SSD), comprising over 97k songs with over 49k synthetic songs from popular platforms like Suno and Udio. Furthermore, we highlight the importance of modeling long-range temporal dependencies in songs for effective authenticity detection, an aspect overlooked in existing methods. To capture these patterns, we propose a novel model, SpecTTTra, that is up to 3 times faster and 6 times more memory efficient compared to popular CNN and Transformer-based models while maintaining competitive performance. Finally, we offer both AI-based and Human evaluation benchmarks, addressing another deficiency in current research.
Submitted 27 August, 2024; v1 submitted 26 August, 2024;
originally announced August 2024.
-
Enhancing material property prediction with ensemble deep graph convolutional networks
Authors:
Chowdhury Mohammad Abid Rahman,
Ghadendra Bhandari,
Nasser M Nasrabadi,
Aldo H. Romero,
Prashnna K. Gyawali
Abstract:
Machine learning (ML) models have emerged as powerful tools for accelerating materials discovery and design by enabling accurate predictions of properties from compositional and structural data. These capabilities are vital for developing advanced technologies across fields such as energy, electronics, and biomedicine, potentially reducing the time and resources needed for new material exploration and promoting rapid innovation cycles. Recent efforts have focused on employing advanced ML algorithms, including deep learning-based graph neural networks, for property prediction. Additionally, ensemble models have proven to enhance the generalizability and robustness of ML and DL. However, the use of such ensemble strategies in deep graph networks for material property prediction remains underexplored. Our research provides an in-depth evaluation of ensemble strategies in deep learning-based graph neural networks, specifically targeting material property prediction tasks. By testing the Crystal Graph Convolutional Neural Network (CGCNN) and its multitask version, MT-CGCNN, we demonstrated that ensemble techniques, especially prediction averaging, substantially improve precision beyond traditional metrics for key properties like formation energy per atom ($ΔE^{f}$), band gap ($E_{g}$) and density ($ρ$) in 33,990 stable inorganic materials. These findings support the broader application of ensemble methods to enhance predictive accuracy in the field.
Submitted 26 July, 2024;
originally announced July 2024.
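The prediction averaging named in the abstract above can be illustrated with a minimal sketch. It assumes each ensemble member has already produced one regression output per material; the function name and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import mean


def average_predictions(per_model_predictions):
    """Average per-sample predictions across ensemble members.

    per_model_predictions: list of equal-length lists, one list per model.
    """
    if not per_model_predictions:
        raise ValueError("need predictions from at least one model")
    n_samples = len(per_model_predictions[0])
    if any(len(p) != n_samples for p in per_model_predictions):
        raise ValueError("all models must predict the same samples")
    # zip(*...) regroups the values so each tuple holds one sample's predictions
    return [mean(sample) for sample in zip(*per_model_predictions)]


# Hypothetical formation-energy predictions (eV/atom) from three models
model_a = [-1.0, 0.5, 2.0]
model_b = [-1.5, 0.0, 2.5]
model_c = [-0.5, 1.0, 3.0]
ensemble = average_predictions([model_a, model_b, model_c])
```

Here each ensemble output is simply the unweighted mean of the members' outputs for that sample; a weighted variant would replace `mean` with a weighted sum.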
-
Celeb-FBI: A Benchmark Dataset on Human Full Body Images and Age, Gender, Height and Weight Estimation using Deep Learning Approach
Authors:
Pronay Debnath,
Usafa Akther Rifa,
Busra Kamal Rafa,
Ali Haider Talukder Akib,
Md. Aminur Rahman
Abstract:
The scarcity of comprehensive datasets in surveillance, identification, image retrieval systems, and healthcare poses a significant challenge for researchers in exploring new methodologies and advancing knowledge in these respective fields. Furthermore, the need for full-body image datasets with detailed attributes like height, weight, age, and gender is particularly significant in areas such as fashion industry analytics, ergonomic design assessment, virtual reality avatar creation, and sports performance analysis. To address this gap, we have created the 'Celeb-FBI' dataset which contains 7,211 full-body images of individuals accompanied by detailed information on their height, age, weight, and gender. Following the dataset creation, we proceed with the preprocessing stages, including image cleaning, scaling, and the application of Synthetic Minority Oversampling Technique (SMOTE). Subsequently, utilizing this prepared dataset, we employed three deep learning approaches: Convolutional Neural Network (CNN), 50-layer ResNet, and 16-layer VGG, which are used for estimating height, weight, age, and gender from human full-body images. From the results obtained, ResNet-50 performed best for the system with an accuracy rate of 79.18% for age, 95.43% for gender, 85.60% for height and 81.91% for weight.
Submitted 3 July, 2024;
originally announced July 2024.
-
DefAn: Definitive Answer Dataset for LLMs Hallucination Evaluation
Authors:
A B M Ashikur Rahman,
Saeed Anwar,
Muhammad Usman,
Ajmal Mian
Abstract:
Large Language Models (LLMs) have demonstrated remarkable capabilities, revolutionizing the integration of AI in daily life applications. However, they are prone to hallucinations, generating claims that contradict established facts, deviating from prompts, and producing inconsistent responses when the same prompt is presented multiple times. Addressing these issues is challenging due to the lack of comprehensive and easily assessable benchmark datasets. Most existing datasets are small and rely on multiple-choice questions, which are inadequate for evaluating the generative prowess of LLMs. To measure hallucination in LLMs, this paper introduces a comprehensive benchmark dataset comprising over 75,000 prompts across eight domains. These prompts are designed to elicit definitive, concise, and informative answers. The dataset is divided into two segments: one publicly available for testing and assessing LLM performance and a hidden segment for benchmarking various LLMs. In our experiments, we tested six LLMs (GPT-3.5, LLama 2, LLama 3, Gemini, Mixtral, and Zephyr), revealing that overall factual hallucination ranges from 59% to 82% on the public dataset and 57% to 76% in the hidden benchmark. Prompt misalignment hallucination ranges from 6% to 95% in the public dataset and 17% to 94% in the hidden counterpart. Average consistency ranges from 21% to 61% and 22% to 63%, respectively. Domain-wise analysis shows that LLM performance significantly deteriorates when asked for specific numeric information while performing moderately with person, location, and date queries. Our dataset demonstrates its efficacy and serves as a comprehensive benchmark for LLM performance evaluation. Our dataset and LLM responses are available at https://github.com/ashikiut/DefAn.
Submitted 13 June, 2024;
originally announced June 2024.
-
MosquitoFusion: A Multiclass Dataset for Real-Time Detection of Mosquitoes, Swarms, and Breeding Sites Using Deep Learning
Authors:
Md. Faiyaz Abdullah Sayeedi,
Fahim Hafiz,
Md Ashiqur Rahman
Abstract:
In this paper, we present an integrated approach to real-time mosquito detection using our multiclass dataset (MosquitoFusion), which contains 1204 diverse images, and leverage computer vision to automate the identification of Mosquitoes, Swarms, and Breeding Sites. The pre-trained YOLOv8 model, trained on this dataset, achieved a mean Average Precision (mAP@50) of 57.1%, with precision at 73.4% and recall at 50.5%. The integration of Geographic Information Systems (GIS) further enriches the depth of our analysis, providing valuable insights into spatial patterns. The dataset and code are available at https://github.com/faiyazabdullah/MosquitoFusion.
Submitted 1 April, 2024;
originally announced April 2024.
-
IPA Transcription of Bengali Texts
Authors:
Kanij Fatema,
Fazle Dawood Haider,
Nirzona Ferdousi Turpa,
Tanveer Azmal,
Sourav Ahmed,
Navid Hasan,
Mohammad Akhlaqur Rahman,
Biplab Kumar Sarkar,
Afrar Jahin,
Md. Rezuwan Hassan,
Md Foriduzzaman Zihad,
Rubayet Sabbir Faruque,
Asif Sushmit,
Mashrur Imtiaz,
Farig Sadeque,
Syed Shahrier Rahman
Abstract:
The International Phonetic Alphabet (IPA) serves to systematize phonemes in language, enabling precise textual representation of pronunciation. In Bengali phonology and phonetics, ongoing scholarly deliberations persist concerning the IPA standard and core Bengali phonemes. This work examines prior research, identifies current and potential issues, and suggests a framework for a Bengali IPA standard, facilitating linguistic analysis and NLP resource creation and downstream technology development. In this work, we present a comprehensive study of Bengali IPA transcription and introduce a novel IPA transcription framework incorporating a novel dataset with DL-based benchmarks.
Submitted 29 March, 2024;
originally announced March 2024.
-
Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs
Authors:
Md Ashiqur Rahman,
Robert Joseph George,
Mogab Elleithy,
Daniel Leibovici,
Zongyi Li,
Boris Bonev,
Colin White,
Julius Berner,
Raymond A. Yeh,
Jean Kossaifi,
Kamyar Azizzadenesheli,
Anima Anandkumar
Abstract:
Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs), due to complex geometries, interactions between physical variables, and the lack of large amounts of high-resolution training data. To address these issues, we propose Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain or channel space, enabling self-supervised learning or pretraining of multiple PDE systems. Specifically, we extend positional encoding, self-attention, and normalization layers to the function space. CoDA-NO can learn representations of different PDE systems with a single model. We evaluate CoDA-NO's potential as a backbone for learning multiphysics PDEs over multiple systems by considering few-shot learning settings. On complex downstream tasks with limited data, such as fluid flow simulations and fluid-structure interactions, we found CoDA-NO to outperform existing methods on the few-shot learning task by over $36\%$. The code is available at https://github.com/ashiq24/CoDA-NO.
Submitted 5 April, 2024; v1 submitted 19 March, 2024;
originally announced March 2024.
-
Progression and Challenges of IoT in Healthcare: A Short Review
Authors:
S M Atikur Rahman,
Sifat Ibtisum,
Priya Podder,
S. M. Saokat Hossain
Abstract:
Smart healthcare, an integral element of connected living, plays a pivotal role in fulfilling a fundamental human need. The burgeoning field of smart healthcare is poised to generate substantial revenue in the foreseeable future. Its multifaceted framework encompasses vital components such as the Internet of Things (IoT), medical sensors, artificial intelligence (AI), edge and cloud computing, as well as next-generation wireless communication technologies. Many research papers discuss smart healthcare and healthcare more broadly. Numerous nations have strategically deployed the Internet of Medical Things (IoMT) alongside other measures to combat the propagation of COVID-19. This combined effort has not only enhanced the safety of frontline healthcare workers but has also augmented the overall efficacy in managing the pandemic, subsequently reducing its impact on human lives and mortality rates. Remarkable strides have been made in both applications and technology within the IoMT domain. However, it is imperative to acknowledge that this technological advancement has introduced certain challenges, particularly in the realm of security. The rapid and extensive adoption of IoMT worldwide has magnified issues related to security and privacy. These encompass a spectrum of concerns, ranging from replay attacks, man-in-the-middle attacks, impersonation, privileged insider threats, remote hijacking, password guessing, and denial of service (DoS) attacks, to malware incursions. In this comprehensive review, we undertake a comparative analysis of existing strategies designed for the detection and prevention of malware in IoT environments.
Submitted 11 November, 2023;
originally announced November 2023.
-
Truly Scale-Equivariant Deep Nets with Fourier Layers
Authors:
Md Ashiqur Rahman,
Raymond A. Yeh
Abstract:
In computer vision, models must be able to adapt to changes in image resolution to effectively carry out tasks such as image segmentation; this is known as scale-equivariance. Recent works have made progress in developing scale-equivariant convolutional neural networks, e.g., through weight-sharing and kernel resizing. However, these networks are not truly scale-equivariant in practice. Specifically, they do not consider anti-aliasing as they formulate the down-scaling operation in the continuous domain. To address this shortcoming, we directly formulate down-scaling in the discrete domain with consideration of anti-aliasing. We then propose a novel architecture based on Fourier layers to achieve truly scale-equivariant deep nets, i.e., absolute zero equivariance-error. Following prior works, we test this model on MNIST-scale and STL-10 datasets. Our proposed model achieves competitive classification performance while maintaining zero equivariance-error.
Submitted 6 November, 2023;
originally announced November 2023.
-
The Significance of Machine Learning in Clinical Disease Diagnosis: A Review
Authors:
S M Atikur Rahman,
Sifat Ibtisum,
Ehsan Bazgir,
Tumpa Barai
Abstract:
The global need for effective disease diagnosis remains substantial, given the complexities of various disease mechanisms and diverse patient symptoms. To tackle these challenges, researchers, physicians, and patients are turning to machine learning (ML), an artificial intelligence (AI) discipline, to develop solutions. By leveraging sophisticated ML and AI methods, healthcare stakeholders gain enhanced diagnostic and treatment capabilities. However, there is a scarcity of research focused on ML algorithms for enhancing the accuracy and computational efficiency. This research investigates the capacity of machine learning algorithms to improve the transmission of heart rate data in time series healthcare metrics, concentrating particularly on optimizing accuracy and efficiency. By exploring various ML algorithms used in healthcare applications, the review presents the latest trends and approaches in ML-based disease diagnosis (MLBDD). The factors under consideration include the algorithm utilized, the types of diseases targeted, the data types employed, the applications, and the evaluation metrics. This review aims to shed light on the prospects of ML in healthcare, particularly in disease diagnosis. By analyzing the current literature, the study provides insights into state-of-the-art methodologies and their performance metrics.
Submitted 25 October, 2023;
originally announced October 2023.
-
RBF Weighted Hyper-Involution for RGB-D Object Detection
Authors:
Mehfuz A Rahman,
Jiju Peethambaran,
Neil London
Abstract:
A vast majority of conventional augmented reality devices are equipped with depth sensors. Depth images produced by such sensors contain complementary information for object detection when used with color images. Despite the benefits, it remains a complex task to simultaneously extract photometric and depth features in real time due to the immanent difference between depth and color images. Moreover, standard convolution operations are not sufficient to properly extract information directly from raw depth images, which leads to intermediate representations of depth that are inefficient. To address these issues, we propose a real-time, two-stream RGB-D object detection model. The proposed model consists of two new components: a depth-guided hyper-involution that adapts dynamically based on the spatial interaction pattern in the raw depth map, and an up-sampling-based trainable fusion layer that combines the extracted depth and color image features without blocking the information transfer between them. We show that the proposed model outperforms other RGB-D based object detection models on the NYU Depth v2 dataset and achieves comparable (second best) results on SUN RGB-D. Additionally, we introduce a new outdoor RGB-D object detection dataset where our proposed model outperforms other models. The performance evaluation on diverse synthetic data generated from CAD models and images shows the potential of the proposed model to be adapted to augmented reality based applications.
Submitted 30 September, 2023;
originally announced October 2023.
-
Attention and Pooling based Sigmoid Colon Segmentation in 3D CT images
Authors:
Md Akizur Rahman,
Sonit Singh,
Kuruparan Shanmugalingam,
Sankaran Iyer,
Alan Blair,
Praveen Ravindran,
Arcot Sowmya
Abstract:
Segmentation of the sigmoid colon is a crucial aspect of treating diverticulitis. It enables accurate identification and localisation of inflammation, which in turn helps healthcare professionals make informed decisions about the most appropriate treatment options. This research presents a novel deep learning architecture for segmenting the sigmoid colon from Computed Tomography (CT) images using a modified 3D U-Net architecture. Several variations of the 3D U-Net model with modified hyper-parameters were examined in this study. Pyramid pooling (PyP) and channel-spatial Squeeze and Excitation (csSE) were also used to improve the model performance. The networks were trained using manually annotated sigmoid colon. A five-fold cross-validation procedure was used on a test dataset to evaluate the network's performance. As indicated by the maximum Dice similarity coefficient (DSC) of 56.92 ± 1.42%, the application of PyP and csSE techniques improves segmentation precision. We explored ensemble methods including averaging, weighted averaging, majority voting, and max ensemble. The results show that average and majority voting approaches with a threshold value of 0.5 and consistent weight distribution among the top three models produced comparable and optimal results with DSC of 88.11 ± 3.52%. The results indicate that the application of a modified 3D U-Net architecture is effective for segmenting the sigmoid colon in Computed Tomography (CT) images. In addition, the study highlights the potential benefits of integrating ensemble methods to improve segmentation precision.
Submitted 25 September, 2023;
originally announced September 2023.
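The Dice similarity coefficient and the majority-voting ensemble mentioned in the abstract above can be sketched as follows. This is an illustration on flattened binary masks, not the authors' pipeline: it assumes the 0.5 threshold is applied to each model's per-voxel probability before voting, and the function names and sample values are hypothetical.

```python
def dice(pred, target):
    """Dice similarity coefficient between two flattened binary masks."""
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Two empty masks agree perfectly by convention
    return 1.0 if total == 0 else 2.0 * intersection / total


def majority_vote(prob_maps, threshold=0.5):
    """Binarize each model's probabilities, then keep voxels most models mark."""
    masks = [[1 if p >= threshold else 0 for p in m] for m in prob_maps]
    return [1 if 2 * sum(votes) > len(masks) else 0 for votes in zip(*masks)]


# Hypothetical per-voxel foreground probabilities from three models
prob_maps = [
    [0.9, 0.2, 0.6, 0.1],
    [0.8, 0.7, 0.4, 0.2],
    [0.3, 0.1, 0.7, 0.6],
]
fused = majority_vote(prob_maps)
score = dice(fused, [1, 0, 1, 1])
```

An averaging ensemble would instead take the mean probability per voxel across models and threshold that mean once.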
-
Syn-Att: Synthetic Speech Attribution via Semi-Supervised Unknown Multi-Class Ensemble of CNNs
Authors:
Md Awsafur Rahman,
Bishmoy Paul,
Najibul Haque Sarker,
Zaber Ibn Abdul Hakim,
Shaikh Anowarul Fattah,
Mohammad Saquib
Abstract:
With the huge technological advances introduced by deep learning in audio & speech processing, many novel synthetic speech techniques have achieved incredibly realistic results. As these methods generate realistic fake human voices, they can be used in malicious acts such as impersonation, spreading fake news, spoofing, media manipulation, etc. Hence, the ability to distinguish synthetic from natural speech has become an urgent necessity. Moreover, being able to tell which algorithm has been used to generate a synthetic speech track can be of preeminent importance to track down the culprit. In this paper, a novel strategy is proposed to attribute a synthetic speech track to the generator that is used to synthesize it. The proposed detector transforms the audio into a log-mel spectrogram, extracts features using a CNN, and classifies it between five known and unknown algorithms, utilizing semi-supervision and ensembling to improve its robustness and generalizability significantly. The proposed detector is validated on two evaluation datasets consisting of a total of 18,000 weakly perturbed (Eval 1) & 10,000 strongly perturbed (Eval 2) synthetic speeches. The proposed method outperforms other top teams in accuracy by 12-13% on Eval 2 and 1-2% on Eval 1, in the IEEE SP Cup challenge at ICASSP 2022.
Submitted 15 September, 2023;
originally announced September 2023.
-
Evaluating the Reliability of CNN Models on Classifying Traffic and Road Signs using LIME
Authors:
Md. Atiqur Rahman,
Ahmed Saad Tanim,
Sanjid Islam,
Fahim Pranto,
G. M. Shahariar,
Md. Tanvir Rouf Shawon
Abstract:
The objective of this investigation is to evaluate and contrast the effectiveness of four state-of-the-art pre-trained models, ResNet-34, VGG-19, DenseNet-121, and Inception V3, in classifying traffic and road signs using the GTSRB public dataset. The study focuses on evaluating the accuracy of these models' predictions as well as their ability to employ appropriate features for image categorization. To gain insights into the strengths and limitations of the models' predictions, the study employs the local interpretable model-agnostic explanations (LIME) framework. The findings of this experiment indicate that LIME is a crucial tool for improving the interpretability and dependability of machine learning models for image identification, even though the models achieve an F1 score of 0.99 on classifying traffic and road signs. The conclusion of this study has important ramifications for how these models are used in practice, as it is crucial to ensure that model predictions are founded on the pertinent image features.
Submitted 11 September, 2023;
originally announced September 2023.
-
Bornil: An open-source sign language data crowdsourcing platform for AI enabled dialect-agnostic communication
Authors:
Shahriar Elahi Dhruvo,
Mohammad Akhlaqur Rahman,
Manash Kumar Mandal,
Md. Istiak Hossain Shihab,
A. A. Noman Ansary,
Kaneez Fatema Shithi,
Sanjida Khanom,
Rabeya Akter,
Safaeid Hossain Arib,
M. N. Ansary,
Sazia Mehnaz,
Rezwana Sultana,
Sejuti Rahman,
Sayma Sultana Chowdhury,
Sabbir Ahmed Chowdhury,
Farig Sadeque,
Asif Sushmit
Abstract:
The absence of annotated sign language datasets has hindered the development of sign language recognition and translation technologies. In this paper, we introduce Bornil, a crowdsource-friendly, multilingual sign language data collection, annotation, and validation platform. Bornil allows users to record sign language gestures and lets annotators perform sentence and gloss-level annotation. It also allows validators to check the quality of both the recorded videos and the annotations through manual validation to develop high-quality datasets for deep learning-based Automatic Sign Language Recognition. To demonstrate the system's efficacy, we collected the largest sign language dataset for the Bangladeshi Sign Language dialect, performed deep learning-based Sign Language Recognition modeling, and report the benchmark performance. The Bornil platform, BornilDB v1.0 Dataset, and the codebases are available at https://bornil.bengali.ai
Submitted 29 August, 2023;
originally announced August 2023.
-
Semi-Supervised Semantic Depth Estimation using Symbiotic Transformer and NearFarMix Augmentation
Authors:
Md Awsafur Rahman,
Shaikh Anowarul Fattah
Abstract:
In computer vision, depth estimation is crucial for domains like robotics, autonomous vehicles, augmented reality, and virtual reality. Integrating semantics with depth enhances scene understanding through reciprocal information sharing. However, the scarcity of semantic information in datasets poses challenges. Existing convolutional approaches with limited local receptive fields hinder the full utilization of the symbiotic potential between depth and semantics. This paper introduces a dataset-invariant semi-supervised strategy to address the scarcity of semantic information. It proposes the Depth Semantics Symbiosis module, leveraging the Symbiotic Transformer to achieve comprehensive mutual awareness through information exchange within both local and global contexts. Additionally, a novel augmentation, NearFarMix, is introduced to combat overfitting and benefit both the depth and semantic tasks by strategically merging regions from two images, generating diverse and structurally consistent samples with enhanced control. Extensive experiments on NYU-Depth-V2 and KITTI datasets demonstrate the superiority of our proposed techniques in indoor and outdoor environments.
Submitted 28 August, 2023;
originally announced August 2023.
-
Ensemble of Anchor-Free Models for Robust Bangla Document Layout Segmentation
Authors:
U Mong Sain Chak,
Md. Asib Rahman
Abstract:
In this research paper, we introduce a novel approach designed for the purpose of segmenting the layout of Bangla documents. Our methodology involves the utilization of a sophisticated ensemble of YOLOv8 models, which were trained for the DL Sprint 2.0 - BUET CSE Fest 2023 Competition focused on Bangla document layout segmentation. Our primary emphasis lies in enhancing various aspects of the task, including techniques such as image augmentation, model architecture, and the incorporation of model ensembles. We deliberately reduce the quality of a subset of document images to enhance the resilience of model training, thereby resulting in an improvement in our cross-validation score. By employing Bayesian optimization, we determine the optimal confidence and Intersection over Union (IoU) thresholds for our model ensemble. Through our approach, we successfully demonstrate the effectiveness of anchor-free models in achieving robust layout segmentation in Bangla documents.
Submitted 29 August, 2023; v1 submitted 28 August, 2023;
originally announced August 2023.
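The Intersection over Union (IoU) threshold mentioned in the abstract above is applied to a box-overlap score. As a minimal sketch of how that score is commonly computed (the `iou` function and the `(x_min, y_min, x_max, y_max)` box layout are illustrative assumptions, not taken from the paper):

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this layout is an assumption
    made for illustration.
    """
    # Corners of the overlapping rectangle (if any).
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp at zero so disjoint boxes give zero intersection.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


overlap = iou((0, 0, 2, 2), (1, 1, 3, 3))  # intersection 1, union 7
```

Two predictions whose IoU exceeds the chosen threshold would typically be treated as the same detection when ensembling; the threshold itself is what the authors report tuning with Bayesian optimization.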
-
Logistics Hub Location Optimization: A K-Means and P-Median Model Hybrid Approach Using Road Network Distances
Authors:
Muhammad Abdul Rahman,
Muhammad Aamir Basheer,
Zubair Khalid,
Muhammad Tahir,
Momin Uppal
Abstract:
Logistic hubs play a pivotal role in the last-mile delivery distance; even a slight increment in distance negatively impacts the business of the e-commerce industry while also increasing its carbon footprint. The growth of this industry, particularly after Covid-19, has further intensified the need for optimized allocation of resources in an urban environment. In this study, we use a hybrid approach to optimize the placement of logistic hubs. The approach sequentially employs different techniques. Initially, delivery points are clustered using K-Means in relation to their spatial locations. The clustering method utilizes road network distances as opposed to Euclidean distances. Non-road network-based approaches have been avoided since they lead to erroneous and misleading results. Finally, hubs are located using the P-Median method. The P-Median method also incorporates the number of deliveries and population as weights. Real-world delivery data from Muller and Phipps (M&P) is used to demonstrate the effectiveness of the approach. Serving deliveries from the optimal hub locations results in a saving of 815 meters (10%) per delivery.
Submitted 4 July, 2024; v1 submitted 18 August, 2023;
originally announced August 2023.
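The weighted P-Median step described in the abstract above can be illustrated on a toy instance. This is a brute-force sketch under stated assumptions: the distance matrix stands in for road network distances, the weights stand in for delivery counts, and `weighted_p_median` is a hypothetical helper. The abstract does not say which solver the authors use.

```python
from itertools import combinations


def weighted_p_median(dist, weights, p):
    """Pick p hub indices minimising the weighted sum of each demand
    point's distance to its nearest chosen hub (exhaustive search)."""
    n = len(dist)
    best_cost, best_hubs = float("inf"), None
    for hubs in combinations(range(n), p):
        # Each demand point is served by its closest hub in this candidate set.
        cost = sum(
            w * min(dist[i][h] for h in hubs) for i, w in enumerate(weights)
        )
        if cost < best_cost:
            best_cost, best_hubs = cost, hubs
    return best_hubs, best_cost


# Hypothetical symmetric distances between four sites and their demand weights.
dist = [
    [0, 2, 9, 10],
    [2, 0, 8, 9],
    [9, 8, 0, 1],
    [10, 9, 1, 0],
]
weights = [1, 3, 2, 1]
hubs, cost = weighted_p_median(dist, weights, p=2)
```

Exhaustive search is only feasible for small instances; realistic problem sizes would need an integer-programming or heuristic solver.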
-
IndoHerb: Indonesia Medicinal Plants Recognition using Transfer Learning and Deep Learning
Authors:
Muhammad Salman Ikrar Musyaffa,
Novanto Yudistira,
Muhammad Arif Rahman,
Jati Batoro
Abstract:
The rich diversity of herbal plants in Indonesia holds immense potential as alternative resources for traditional healing and ethnobotanical practices. However, the dwindling recognition of herbal plants due to modernization poses a significant challenge in preserving this valuable heritage. The accurate identification of these plants is crucial for the continuity of traditional practices and the utilization of their nutritional benefits. Nevertheless, the manual identification of herbal plants remains a time-consuming task, demanding expert knowledge and meticulous examination of plant characteristics. In response, the application of computer vision emerges as a promising solution to facilitate the efficient identification of herbal plants. This research addresses the task of classifying Indonesian herbal plants through the implementation of transfer learning of Convolutional Neural Networks (CNN). To support our study, we curated an extensive dataset of herbal plant images from Indonesia with careful manual selection. Subsequently, we conducted rigorous data preprocessing, and classification utilizing transfer learning methodologies with five distinct models: ResNet, DenseNet, VGG, ConvNeXt, and Swin Transformer. Our comprehensive analysis revealed that ConvNeXt achieved the highest accuracy, standing at an impressive 92.5%. Additionally, we conducted testing using a scratch model, resulting in an accuracy of 53.9%. The experimental setup featured essential hyperparameters, including the ExponentialLR scheduler with a gamma value of 0.9, a learning rate of 0.001, the Cross-Entropy Loss function, the Adam optimizer, and a training epoch count of 50. This study's outcomes offer valuable insights and practical implications for the automated identification of Indonesian medicinal plants.
Submitted 9 June, 2024; v1 submitted 3 August, 2023;
originally announced August 2023.
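The hyperparameters listed in the abstract above (learning rate 0.001, exponential scheduler with gamma 0.9, 50 epochs) imply a specific decay curve. An exponential schedule multiplies the learning rate by gamma after every epoch, so the rate at epoch t is 0.001 × 0.9^t. A small dependency-free sketch (the `exponential_lr` helper is hypothetical and only reproduces the arithmetic, not the authors' training code):

```python
def exponential_lr(initial_lr, gamma, epochs):
    """Learning rate used in each epoch under exponential decay:
    lr_t = initial_lr * gamma ** t for t = 0 .. epochs - 1."""
    return [initial_lr * gamma ** epoch for epoch in range(epochs)]


# Values taken from the abstract: lr = 0.001, gamma = 0.9, 50 epochs.
schedule = exponential_lr(initial_lr=0.001, gamma=0.9, epochs=50)
first_lr, last_lr = schedule[0], schedule[-1]
```

Under these settings the rate falls by more than two orders of magnitude over training, to roughly 5.7e-6 in the final epoch.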
-
PD-SEG: Population Disaggregation Using Deep Segmentation Networks For Improved Built Settlement Mask
Authors:
Muhammad Abdul Rahman,
Muhammad Ahmad Waseem,
Zubair Khalid,
Muhammad Tahir,
Momin Uppal
Abstract:
Any policy-level decision-making procedure and academic research involving the optimum use of resources for development and planning initiatives depends on accurate population density statistics. The current cutting-edge datasets offered by WorldPop and Meta do not succeed in achieving this aim for developing nations like Pakistan; the inputs to their algorithms provide flawed estimates that fail to capture the spatial and land-use dynamics. In order to precisely estimate population counts at a resolution of 30 meters by 30 meters, we use an accurate built settlement mask obtained using deep segmentation networks and satellite imagery. The Points of Interest (POI) data is also used to exclude non-residential areas.
Submitted 29 July, 2023;
originally announced July 2023.
-
BN-DRISHTI: Bangla Document Recognition through Instance-level Segmentation of Handwritten Text Images
Authors:
Sheikh Mohammad Jubaer,
Nazifa Tabassum,
Md. Ataur Rahman,
Mohammad Khairul Islam
Abstract:
Handwriting recognition remains challenging for some of the most spoken languages, like Bangla, due to the complexity of line and word segmentation brought by the curvilinear nature of writing and lack of quality datasets. This paper solves the segmentation problem by introducing a state-of-the-art method (BN-DRISHTI) that combines a deep learning-based object detection framework (YOLO) with Hough and Affine transformation for skew correction. However, training deep learning models requires a massive amount of data. Thus, we also present an extended version of the BN-HTRd dataset comprising 786 full-page handwritten Bangla document images, line and word-level annotation for segmentation, and corresponding ground truths for word recognition. Evaluation on the test portion of our dataset resulted in an F-score of 99.97% for line and 98% for word segmentation. For comparative analysis, we used three external Bangla handwritten datasets, namely BanglaWriting, WBSUBNdb_text, and ICDAR 2013, where our system outperformed by a significant margin, further justifying the performance of our approach on completely unseen samples.
Submitted 31 May, 2023;
originally announced June 2023.
-
DEMIST: A deep-learning-based task-specific denoising approach for myocardial perfusion SPECT
Authors:
Md Ashequr Rahman,
Zitong Yu,
Richard Laforest,
Craig K. Abbey,
Barry A. Siegel,
Abhinav K. Jha
Abstract:
There is an important need for methods to process myocardial perfusion imaging (MPI) SPECT images acquired at lower radiation dose and/or acquisition time such that the processed images improve observer performance on the clinical task of detecting perfusion defects. To address this need, we build upon concepts from model-observer theory and our understanding of the human visual system to propose a Detection task-specific deep-learning-based approach for denoising MPI SPECT images (DEMIST). The approach, while performing denoising, is designed to preserve features that influence observer performance on detection tasks. We objectively evaluated DEMIST on the task of detecting perfusion defects using a retrospective study with anonymized clinical data in patients who underwent MPI studies across two scanners (N = 338). The evaluation was performed at low-dose levels of 6.25%, 12.5% and 25% and using an anthropomorphic channelized Hotelling observer. Performance was quantified using area under the receiver operating characteristics curve (AUC). Images denoised with DEMIST yielded significantly higher AUC compared to corresponding low-dose images and images denoised with a commonly used task-agnostic DL-based denoising method. Similar results were observed with stratified analysis based on patient sex and defect type. Additionally, DEMIST improved visual fidelity of the low-dose images as quantified using root mean squared error and structural similarity index metric. A mathematical analysis revealed that DEMIST preserved features that assist in detection tasks while improving the noise properties, resulting in improved observer performance. The results provide strong evidence for further clinical evaluation of DEMIST to denoise low-count images in MPI SPECT.
Submitted 25 October, 2023; v1 submitted 7 June, 2023;
originally announced June 2023.
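The abstract above quantifies detection performance with the area under the ROC curve (AUC). As a minimal sketch of that figure of merit, AUC equals the probability that a defect-present case receives a higher observer test statistic than a defect-absent case, with ties counted as one half. The scores below are made-up numbers, and the `auc` helper is hypothetical; it does not reproduce the channelized Hotelling observer used in the paper.

```python
def auc(scores_present, scores_absent):
    """Area under the ROC curve via pairwise comparison
    (the Mann-Whitney U statistic normalised by the number of pairs)."""
    wins = 0.0
    for s_pos in scores_present:
        for s_neg in scores_absent:
            if s_pos > s_neg:
                wins += 1.0
            elif s_pos == s_neg:
                wins += 0.5  # ties contribute half a win
    return wins / (len(scores_present) * len(scores_absent))


# Hypothetical observer test statistics for illustration only.
defect_present = [0.9, 0.8, 0.4]
defect_absent = [0.7, 0.3, 0.4]
value = auc(defect_present, defect_absent)  # 7.5 wins out of 9 pairs
```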
-
SHATTER: Control and Defense-Aware Attack Analytics for Activity-Driven Smart Home Systems
Authors:
Nur Imtiazul Haque,
Maurice Ngouen,
Mohammad Ashiqur Rahman,
Selcuk Uluagac,
Laurent Njilla
Abstract:
Modern smart home control systems utilize real-time occupancy and activity monitoring to ensure control efficiency, occupants' comfort, and optimal energy consumption. Moreover, adopting machine learning-based anomaly detection models (ADMs) enhances security and reliability. However, sufficient system knowledge allows adversaries/attackers to alter sensor measurements through stealthy false data injection (FDI) attacks. Although ADMs limit attack scopes, the availability of information like occupants' location, conducted activities, and alteration capability of smart appliances increase the attack surface. Therefore, performing an attack space analysis of modern home control systems is crucial to design robust defense solutions. However, state-of-the-art analyzers do not consider contemporary control and defense solutions and generate trivial attack vectors. To address this, we propose a control and defense-aware novel attack analysis framework for a modern smart home control system, efficiently extracting ADM rules. We verify and validate our framework using a state-of-the-art dataset and a prototype testbed.
Submitted 27 April, 2023;
originally announced May 2023.
-
An Efficient Transfer Learning-based Approach for Apple Leaf Disease Classification
Authors:
Md. Hamjajul Ashmafee,
Tasnim Ahmed,
Sabbir Ahmed,
Md. Bakhtiar Hasan,
Mst Nura Jahan,
A. B. M. Ashikur Rahman
Abstract:
Correct identification and categorization of plant diseases are crucial for ensuring the safety of the global food supply and the overall financial success of stakeholders. In this regard, a wide range of solutions has been made available by introducing deep learning-based classification systems for different staple crops. Although apple is one of the most important commercial crops in many parts of the globe, automatic classification of apple leaf diseases remains relatively unexplored. This study presents a technique for identifying apple leaf diseases based on transfer learning. The system extracts features using a pretrained EfficientNetV2S architecture and passes them to a classifier block for effective prediction. The class imbalance issues are tackled by utilizing runtime data augmentation. The effect of various hyperparameters, such as input resolution, learning rate, number of epochs, etc., has been investigated carefully. The competence of the proposed pipeline has been evaluated on the apple leaf disease subset from the publicly available 'PlantVillage' dataset, where it achieved an accuracy of 99.21%, outperforming the existing works.
Submitted 10 April, 2023;
originally announced April 2023.
-
Enhancing Cluster Quality of Numerical Datasets with Domain Ontology
Authors:
Sudath Rohitha Heiyanthuduwage,
Md Anisur Rahman,
Md Zahidul Islam
Abstract:
Ontology-based clustering has gained attention in recent years due to the potential benefits of ontology. Current ontology-based clustering approaches have mainly been applied to reduce the dimensionality of attributes in text document clustering. Reduction in dimensionality of attributes using ontology helps to produce high quality clusters for a dataset. However, ontology-based approaches to clustering numerical datasets have not gained enough attention. Moreover, some literature mentions that ontology-based clustering can produce either high quality or low quality clusters from a dataset. Therefore, in this paper we present a clustering approach that uses domain ontology to reduce the dimensionality of attributes in a numerical dataset and to produce high quality clusters. For every dataset, we produce three datasets using domain ontology. We then cluster these datasets using a genetic algorithm-based clustering technique called GenClust++. The clusters of each dataset are evaluated in terms of Sum of Squared-Error (SSE). We use six numerical datasets to evaluate the performance of our ontology-based approach. The experimental results of our approach indicate that cluster quality gradually improves from the lower to the higher levels of a domain ontology.
Submitted 2 April, 2023;
originally announced April 2023.
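The Sum of Squared-Error (SSE) criterion named in the abstract above adds up, over all records, the squared Euclidean distance between a record and the centroid of its cluster; lower values indicate tighter clusters. A minimal sketch with hypothetical data (the `sse` helper is illustrative, and the clustering itself, done with GenClust++ in the paper, is not reproduced here):

```python
def sse(points, labels):
    """Sum of squared Euclidean distances from each point to the
    centroid of the cluster it is assigned to."""
    clusters = {}
    for point, label in zip(points, labels):
        clusters.setdefault(label, []).append(point)

    total = 0.0
    for members in clusters.values():
        dim = len(members[0])
        # Centroid is the per-attribute mean of the cluster members.
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        total += sum((m[d] - centroid[d]) ** 2 for m in members for d in range(dim))
    return total


# Hypothetical two-attribute records forming two obvious groups.
points = [(0, 0), (0, 2), (10, 10), (10, 12)]
good = sse(points, [0, 0, 1, 1])  # natural grouping
bad = sse(points, [0, 1, 0, 1])   # groups mixed together
```

Comparing the two assignments shows how SSE rewards the natural grouping over the mixed one.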
-
CIFF-Net: Contextual Image Feature Fusion for Melanoma Diagnosis
Authors:
Md Awsafur Rahman,
Bishmoy Paul,
Tanvir Mahmud,
Shaikh Anowarul Fattah
Abstract:
Melanoma is considered to be the deadliest variant of skin cancer, causing around 75% of total skin cancer deaths. To diagnose Melanoma, clinicians assess and compare multiple skin lesions of the same patient concurrently to gather contextual information regarding the patterns and abnormalities of the skin. So far this concurrent multi-image comparative method has not been explored by existing deep learning-based schemes. In this paper, based on contextual image feature fusion (CIFF), a deep neural network (CIFF-Net) is proposed, which integrates patient-level contextual information into the traditional approaches for improved Melanoma diagnosis by the concurrent multi-image comparative method. The proposed multi-kernel self attention (MKSA) module offers better generalization of the extracted features by introducing multi-kernel operations in the self attention mechanisms. To utilize both self attention and contextual feature-wise attention, an attention guided module named contextual feature fusion (CFF) is proposed that integrates extracted features from different contextual images into a single feature vector. Finally, in the comparative contextual feature fusion (CCFF) module, primary and contextual features are compared concurrently to generate comparative features. Significant improvement in performance has been achieved on the ISIC-2020 dataset over the traditional approaches, which validates the effectiveness of the proposed contextual learning scheme.
Submitted 7 March, 2023;
originally announced March 2023.
-
DwinFormer: Dual Window Transformers for End-to-End Monocular Depth Estimation
Authors:
Md Awsafur Rahman,
Shaikh Anowarul Fattah
Abstract:
Depth estimation from a single image is of paramount importance in the realm of computer vision, with a multitude of applications. Conventional methods suffer from the trade-off between consistency and fine-grained details due to the local-receptive field limiting their practicality. This lack of long-range dependency inherently comes from the convolutional neural network part of the architecture. In this paper, a dual window transformer-based network, namely DwinFormer, is proposed, which utilizes both local and global features for end-to-end monocular depth estimation. The DwinFormer consists of dual window self-attention and cross-attention transformers, Dwin-SAT and Dwin-CAT, respectively. The Dwin-SAT seamlessly extracts intricate, locally aware features while concurrently capturing global context. It harnesses the power of local and global window attention to adeptly capture both short-range and long-range dependencies, obviating the need for complex and computationally expensive operations, such as attention masking or window shifting. Moreover, Dwin-SAT introduces inductive biases which provide desirable properties, such as translational equivariance and less dependence on large-scale data. Furthermore, conventional decoding methods often rely on skip connections which may result in semantic discrepancies and a lack of global context when fusing encoder and decoder features. In contrast, the Dwin-CAT employs both local and global window cross-attention to seamlessly fuse encoder and decoder features with both fine-grained local and contextually aware global information, effectively closing the semantic gap. Empirical evidence obtained through extensive experimentation on the NYU-Depth-V2 and KITTI datasets demonstrates the superiority of the proposed method, consistently outperforming existing approaches across both indoor and outdoor environments.
Submitted 7 March, 2023; v1 submitted 6 March, 2023;
originally announced March 2023.
-
Need for Objective Task-based Evaluation of Deep Learning-Based Denoising Methods: A Study in the Context of Myocardial Perfusion SPECT
Authors:
Zitong Yu,
Md Ashequr Rahman,
Richard Laforest,
Thomas H. Schindler,
Robert J. Gropler,
Richard L. Wahl,
Barry A. Siegel,
Abhinav K. Jha
Abstract:
Artificial intelligence-based methods have generated substantial interest in nuclear medicine. An area of significant interest has been using deep-learning (DL)-based approaches for denoising images acquired with lower doses, shorter acquisition times, or both. Objective evaluation of these approaches is essential for clinical application. DL-based approaches for denoising nuclear-medicine images have typically been evaluated using fidelity-based figures of merit (FoMs) such as RMSE and SSIM. However, these images are acquired for clinical tasks and thus should be evaluated based on their performance in these tasks. Our objectives were to (1) investigate whether evaluation with these FoMs is consistent with objective clinical-task-based evaluation; (2) provide a theoretical analysis for determining the impact of denoising on signal-detection tasks; (3) demonstrate the utility of virtual clinical trials (VCTs) to evaluate DL-based methods. A VCT to evaluate a DL-based method for denoising myocardial perfusion SPECT (MPS) images was conducted. The impact of DL-based denoising was evaluated using fidelity-based FoMs and AUC, which quantified performance on detecting perfusion defects in MPS images as obtained using a model observer with anthropomorphic channels. Based on fidelity-based FoMs, denoising using the considered DL-based method led to significantly superior performance. However, based on ROC analysis, denoising did not improve, and in fact, often degraded detection-task performance. The results motivate the need for objective task-based evaluation of DL-based denoising approaches. Further, this study shows how VCTs provide a mechanism to conduct such evaluations. Finally, our theoretical treatment reveals insights into the reasons for the limited performance of the denoising approach.
Submitted 1 April, 2023; v1 submitted 3 March, 2023;
originally announced March 2023.
-
A task-specific deep-learning-based denoising approach for myocardial perfusion SPECT
Authors:
Md Ashequr Rahman,
Zitong Yu,
Barry A. Siegel,
Abhinav K. Jha
Abstract:
Deep-learning (DL)-based methods have shown significant promise in denoising myocardial perfusion SPECT images acquired at low dose. For clinical application of these methods, evaluation on clinical tasks is crucial. Typically, these methods are designed to minimize some fidelity-based criterion between the predicted denoised image and some reference normal-dose image. However, while promising, studies have shown that these methods may have limited impact on the performance of clinical tasks in SPECT. To address this issue, we use concepts from the literature on model observers and our understanding of the human visual system to propose a DL-based denoising approach designed to preserve observer-related information for detection tasks. The proposed method was objectively evaluated on the task of detecting perfusion defect in myocardial perfusion SPECT images using a retrospective study with anonymized clinical data. Our results demonstrate that the proposed method yields improved performance on this detection task compared to using low-dose images. The results show that by preserving task-specific information, DL may provide a mechanism to improve observer performance in low-dose myocardial perfusion SPECT.
Submitted 28 February, 2023;
originally announced March 2023.
-
ArtiFact: A Large-Scale Dataset with Artificial and Factual Images for Generalizable and Robust Synthetic Image Detection
Authors:
Md Awsafur Rahman,
Bishmoy Paul,
Najibul Haque Sarker,
Zaber Ibn Abdul Hakim,
Shaikh Anowarul Fattah
Abstract:
Synthetic image generation has opened up new opportunities but has also created threats in regard to privacy, authenticity, and security. Detecting fake images is of paramount importance to prevent illegal activities, and previous research has shown that generative models leave unique patterns in their synthetic images that can be exploited to detect them. However, the fundamental problem of generalization remains, as even state-of-the-art detectors encounter difficulty when facing generators never seen during training. To assess the generalizability and robustness of synthetic image detectors in the face of real-world impairments, this paper presents a large-scale dataset named ArtiFact, comprising diverse generators, object categories, and real-world challenges. Moreover, the proposed multi-class classification scheme, combined with a filter stride reduction strategy addresses social platform impairments and effectively detects synthetic images from both seen and unseen generators. The proposed solution significantly outperforms other top teams by 8.34% on Test 1, 1.26% on Test 2, and 15.08% on Test 3 in the IEEE VIP Cup challenge at ICIP 2022, as measured by the accuracy metric.
Submitted 24 February, 2023; v1 submitted 23 February, 2023;
originally announced February 2023.
-
Neural Operator: Is data all you need to model the world? An insight into the impact of Physics Informed Machine Learning
Authors:
Hrishikesh Viswanath,
Md Ashiqur Rahman,
Abhijeet Vyas,
Andrey Shor,
Beatriz Medeiros,
Stephanie Hernandez,
Suhas Eswarappa Prameela,
Aniket Bera
Abstract:
Numerical approximations of partial differential equations (PDEs) are routinely employed to formulate the solution of physics, engineering and mathematical problems involving functions of several variables, such as the propagation of heat or sound, fluid flow, elasticity, electrostatics, electrodynamics, and more. While this has led to solving many complex phenomena, there are some limitations. Conventional approaches such as Finite Element Methods (FEMs) and Finite Difference Methods (FDMs) require considerable time and are computationally expensive. In contrast, data-driven machine learning-based methods such as neural networks provide a faster, fairly accurate alternative, and have certain advantages such as discretization invariance and resolution invariance. This article aims to provide a comprehensive insight into how data-driven approaches can complement conventional techniques to solve engineering and physics problems, while also noting some of the major pitfalls of machine learning-based approaches. Furthermore, we highlight operator learning, a novel and fast (~1000x) machine learning-based approach to learning the solution operator of a PDE. We will note how these new computational approaches can bring immense advantages in tackling many problems in fundamental and applied physics.
Submitted 18 September, 2023; v1 submitted 30 January, 2023;
originally announced January 2023.
-
Cell-free mMIMO Support in the O-RAN Architecture: A PHY Layer Perspective for 5G and Beyond Networks
Authors:
Vida Ranjbar,
Adam Girycki,
Md Arifur Rahman,
Sofie Pollin,
Marc Moonen,
Evgenii Vinogradov
Abstract:
To keep supporting next-generation requirements, the radio access infrastructure will increasingly densify. Cell-free (CF) network architectures are emerging, combining dense deployments with extreme flexibility in allocating resources to users. In parallel, the Open Radio Access Networks (O-RAN) paradigm is transforming RAN towards an open, intelligent, virtualized, and fully interoperable architecture. This paradigm brings the needed flexibility and intelligent control opportunities for CF networking. In this paper, we document the current O-RAN terminology and contrast it with some common CF processing approaches. We then discuss the main O-RAN innovations and research challenges that remain to be solved.
Submitted 25 January, 2023;
originally announced January 2023.
-
Fruit Quality Assessment with Densely Connected Convolutional Neural Network
Authors:
Md. Samin Morshed,
Sabbir Ahmed,
Tasnim Ahmed,
Muhammad Usama Islam,
A. B. M. Ashikur Rahman
Abstract:
Accurate recognition of food items along with quality assessment is of paramount importance in the agricultural industry. Such automated systems can speed up the wheel of the food processing sector and save tons of manual labor. In this connection, the recent advancement of Deep learning-based architectures has introduced a wide variety of solutions offering remarkable performance in several classification tasks. In this work, we have exploited the concept of Densely Connected Convolutional Neural Networks (DenseNets) for fruit quality assessment. The feature propagation towards the deeper layers has enabled the network to tackle the vanishing gradient problems and ensured the reuse of features to learn meaningful insights. Evaluating on a dataset of 19,526 images containing six fruits having three quality grades for each, the proposed pipeline achieved a remarkable accuracy of 99.67%. The robustness of the model was further tested for fruit classification and quality assessment tasks where the model produced a similar performance, which makes it suitable for real-life applications.
Submitted 24 December, 2022; v1 submitted 8 December, 2022;
originally announced December 2022.
-
PaCMO: Partner Dependent Human Motion Generation in Dyadic Human Activity using Neural Operators
Authors:
Md Ashiqur Rahman,
Jasorsi Ghosh,
Hrishikesh Viswanath,
Kamyar Azizzadenesheli,
Aniket Bera
Abstract:
We address the problem of generating 3D human motions in dyadic activities. In contrast to the concurrent works, which mainly focus on generating the motion of a single actor from the textual description, we generate the motion of one of the actors from the motion of the other participating actor in the action. This is a particularly challenging, under-explored problem, that requires learning intricate relationships between the motion of two actors participating in an action and also identifying the action from the motion of one actor. To address these, we propose partner conditioned motion operator (PaCMO), a neural operator-based generative model which learns the distribution of human motion conditioned by the partner's motion in function spaces through adversarial training. Our model can handle long unlabeled action sequences at arbitrary time resolution. We also introduce the "Functional Frechet Inception Distance" ($F^2ID$) metric for capturing similarity between real and generated data for function spaces. We test PaCMO on NTU RGB+D and DuetDance datasets and our model produces realistic results evidenced by the $F^2ID$ score and the conducted user study.
Submitted 25 November, 2022;
originally announced November 2022.
-
AdaFNIO: Adaptive Fourier Neural Interpolation Operator for video frame interpolation
Authors:
Hrishikesh Viswanath,
Md Ashiqur Rahman,
Rashmi Bhaskara,
Aniket Bera
Abstract:
We present AdaFNIO (Adaptive Fourier Neural Interpolation Operator), a neural operator-based architecture to perform video frame interpolation. Current deep learning based methods rely on local convolutions for feature learning and suffer from not being scale-invariant, thus requiring training data to be augmented through random flipping and re-scaling. On the other hand, AdaFNIO learns the features in the frames, independent of input resolution, through token mixing and global convolution in the Fourier space or the spectral domain by using Fast Fourier Transform (FFT). We show that AdaFNIO can produce visually smooth and accurate results. To evaluate the visual quality of our interpolated frames, we calculate the structural similarity index (SSIM) and Peak Signal to Noise Ratio (PSNR) between the generated frame and the ground truth frame. We provide the quantitative performance of our model on the Vimeo-90K, DAVIS, UCF101, and DISFA+ datasets.
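For readers unfamiliar with the PSNR metric named in this abstract, the following is a minimal sketch of the standard definition, PSNR = 10 log10(MAX^2 / MSE). It is not the authors' evaluation code: the function name `psnr`, the flat-list pixel representation, and the default peak value of 255 (8-bit images) are assumptions made for illustration.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak Signal to Noise Ratio in dB between two equal-length pixel sequences.

    `reference` and `generated` are flat sequences of pixel intensities
    (a hypothetical representation chosen to keep the sketch dependency-free).
    """
    if len(reference) != len(generated) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between ground-truth and generated pixels.
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        # Identical frames: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the generated frame is closer to the ground truth; identical frames give an infinite value under this definition.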
Submitted 8 March, 2023; v1 submitted 19 November, 2022;
originally announced November 2022.
-
Multiple Object Tracking in Recent Times: A Literature Review
Authors:
Mk Bashar,
Samia Islam,
Kashifa Kawaakib Hussain,
Md. Bakhtiar Hasan,
A. B. M. Ashikur Rahman,
Md. Hasanul Kabir
Abstract:
Multiple object tracking (MOT) has gained a lot of interest from researchers in recent years, and it has become one of the trending problems in computer vision, especially with the recent advancement of autonomous driving. MOT is a critical vision task that faces issues such as occlusion in crowded scenes, similar appearance, small object detection difficulty, and ID switching. To tackle these challenges, researchers have tried to utilize the attention mechanism of transformers, the interrelation of tracklets with graph convolutional neural networks, and the appearance similarity of objects in different frames with Siamese networks; they have also tried simple IOU matching based CNN networks and motion prediction with LSTM. To bring these scattered techniques under one umbrella, we have studied more than a hundred papers published over the last three years and have tried to extract the techniques that researchers have focused on most in recent times to solve the problems of MOT. We have listed numerous applications and possibilities, and how MOT can be related to real life. Our review tries to show the different perspectives of the techniques that researchers have used over time and to give some future directions for potential researchers. Moreover, we have included popular benchmark datasets and metrics in this review.
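The "simple IOU matching" mentioned in this abstract can be illustrated with a short sketch. This is a generic greedy association step, not the method of any specific tracker covered by the review: the function names `iou` and `greedy_iou_match`, the `(x1, y1, x2, y2)` box format, and the 0.3 threshold are all assumptions chosen for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Overlap is zero when the boxes do not intersect.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_iou_match(tracks, detections, threshold=0.3):
    """Greedily pair each track box with at most one detection box.

    Returns a dict mapping track index -> detection index. Pairs are taken
    in descending IoU order and pairs below `threshold` are ignored.
    """
    candidates = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(tracks)
         for di, d in enumerate(detections)),
        reverse=True,
    )
    matches, used_detections = {}, set()
    for score, ti, di in candidates:
        if score < threshold:
            break  # remaining candidates overlap even less
        if ti in matches or di in used_detections:
            continue
        matches[ti] = di
        used_detections.add(di)
    return matches
```

Unmatched tracks and detections are simply absent from the result; a full tracker would typically age out the former and start new tracks from the latter, and many systems replace the greedy step with an optimal assignment.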
Submitted 11 September, 2022;
originally announced September 2022.
-
Bayesian Hyperparameter Optimization for Deep Neural Network-Based Network Intrusion Detection
Authors:
Mohammad Masum,
Hossain Shahriar,
Hisham Haddad,
Md Jobair Hossain Faruk,
Maria Valero,
Md Abdullah Khan,
Mohammad A. Rahman,
Muhaiminul I. Adnan,
Alfredo Cuzzocrea
Abstract:
Traditional network intrusion detection approaches encounter feasibility and sustainability issues to combat modern, sophisticated, and unpredictable security attacks. Deep neural networks (DNN) have been successfully applied for intrusion detection problems. The optimal use of DNN-based classifiers requires careful tuning of the hyper-parameters. Manually tuning the hyperparameters is tedious, time-consuming, and computationally expensive. Hence, there is a need for an automatic technique to find optimal hyperparameters for the best use of DNN in intrusion detection. This paper proposes a novel Bayesian optimization-based framework for the automatic optimization of hyperparameters, ensuring the best DNN architecture. We evaluated the performance of the proposed framework on NSL-KDD, a benchmark dataset for network intrusion detection. The experimental results show the framework's effectiveness as the resultant DNN architecture demonstrates significantly higher intrusion detection performance than the random search optimization-based approach in terms of accuracy, precision, recall, and f1-score.
Submitted 7 July, 2022;
originally announced July 2022.
-
Ride-Hailing for Autonomous Vehicles: Hyperledger Fabric-Based Secure and Decentralize Blockchain Platform
Authors:
Ryan Shivers,
Mohammad Ashiqur Rahman,
Md Jobair Hossain Faruk,
Hossain Shahriar,
Alfredo Cuzzocrea,
Victor Clincy
Abstract:
Ride-hailing and ride-sharing applications have recently gained popularity as a convenient alternative to traditional modes of travel. Research into autonomous vehicles is accelerating rapidly, and such vehicles will soon become a critical component of a ride-hailing platform's architecture. Implementing an autonomous vehicle ride-hailing platform proves a difficult challenge due to the centralized nature of traditional ride-hailing architectures. In a traditional ride-hailing environment, the drivers operate their own personal vehicles, so it follows that a fleet of autonomous vehicles would be required for a centralized ride-hailing platform to succeed. Decentralization of the ride-hailing platform would remove a roadblock along the way to an autonomous vehicle ride-hailing platform by allowing owners of autonomous vehicles to add their vehicles to a community-driven fleet when not in use. Blockchain technology is an attractive choice for this decentralized architecture due to its immutability and fault tolerance. This thesis proposes a framework for developing a decentralized ride-hailing architecture that is verifiably secure. This framework is implemented on the Hyperledger Fabric blockchain platform. The implementation is evaluated by applying known security models, utilizing a static analysis tool, and performing a performance analysis under heavy network load.
Submitted 7 July, 2022;
originally announced July 2022.
-
BN-HTRd: A Benchmark Dataset for Document Level Offline Bangla Handwritten Text Recognition (HTR) and Line Segmentation
Authors:
Md. Ataur Rahman,
Nazifa Tabassum,
Mitu Paul,
Riya Pal,
Mohammad Khairul Islam
Abstract:
We introduce a new dataset for offline Handwritten Text Recognition (HTR) from images of Bangla scripts comprising words, lines, and document-level annotations. The BN-HTRd dataset is based on the BBC Bangla News corpus, meant to act as ground truth texts. These texts were subsequently used to generate the annotations that were filled out by people with their handwriting. Our dataset includes 788 images of handwritten pages produced by approximately 150 different writers. It can be adopted as a basis for various handwriting classification tasks such as end-to-end document recognition, word-spotting, word or line segmentation, and so on. We also propose a scheme to segment Bangla handwritten document images into corresponding lines in an unsupervised manner. Our line segmentation approach takes care of the variability involved in different writing styles, accurately segmenting complex handwritten text lines of curvilinear nature. Along with a bunch of pre-processing and morphological operations, both Hough line and circle transforms were employed to distinguish different linear components. In order to arrange those components into their corresponding lines, we followed an unsupervised clustering approach. The average success rate of our segmentation technique is 81.57% in terms of FM metrics (similar to F-measure) with a mean Average Precision (mAP) of 0.547.
Submitted 29 May, 2022;
originally announced June 2022.
-
Two Decades of Bengali Handwritten Digit Recognition: A Survey
Authors:
A. B. M. Ashikur Rahman,
Md. Bakhtiar Hasan,
Sabbir Ahmed,
Tasnim Ahmed,
Md. Hamjajul Ashmafee,
Mohammad Ridwan Kabir,
Md. Hasanul Kabir
Abstract:
Handwritten Digit Recognition (HDR) is one of the most challenging tasks in the domain of Optical Character Recognition (OCR). Irrespective of language, there are some inherent challenges of HDR, which mostly arise due to the variations in writing styles across individuals, writing medium and environment, inability to maintain the same strokes while writing any digit repeatedly, etc. In addition to that, the structural complexities of the digits of a particular language may lead to ambiguous scenarios of HDR. Over the years, researchers have developed numerous offline and online HDR pipelines, where different image processing techniques are combined with traditional Machine Learning (ML)-based and/or Deep Learning (DL)-based architectures. Although evidence of extensive review studies on HDR exists in the literature for languages, such as English, Arabic, Indian, Farsi, Chinese, etc., few surveys on Bengali HDR (BHDR) can be found, which lack a comprehensive analysis of the challenges, the underlying recognition process, and possible future directions. In this paper, the characteristics and inherent ambiguities of Bengali handwritten digits along with a comprehensive insight of two decades of state-of-the-art datasets and approaches towards offline BHDR have been analyzed. Furthermore, several real-life application-specific studies, which involve BHDR, have also been discussed in detail. This paper will also serve as a compendium for researchers interested in the science behind offline BHDR, instigating the exploration of newer avenues of relevant research that may further lead to better offline recognition of Bengali handwritten digits in different application areas.
Submitted 25 September, 2022; v1 submitted 5 June, 2022;
originally announced June 2022.
-
Generative Adversarial Neural Operators
Authors:
Md Ashiqur Rahman,
Manuel A. Florez,
Anima Anandkumar,
Zachary E. Ross,
Kamyar Azizzadenesheli
Abstract:
We propose the generative adversarial neural operator (GANO), a generative model paradigm for learning probabilities on infinite-dimensional function spaces. The natural sciences and engineering are known to have many types of data that are sampled from infinite-dimensional function spaces, where classical finite-dimensional deep generative adversarial networks (GANs) may not be directly applicable. GANO generalizes the GAN framework and allows for the sampling of functions by learning push-forward operator maps in infinite-dimensional spaces. GANO consists of two main components, a generator neural operator and a discriminator neural functional. The inputs to the generator are samples of functions from a user-specified probability measure, e.g., Gaussian random field (GRF), and the generator outputs are synthetic data functions. The input to the discriminator is either a real or synthetic data function. In this work, we instantiate GANO using the Wasserstein criterion and show how the Wasserstein loss can be computed in infinite-dimensional spaces. We empirically study GANO in controlled cases where both input and output functions are samples from GRFs and compare its performance to the finite-dimensional counterpart GAN. We empirically study the efficacy of GANO on real-world function data of volcanic activities and show its superior performance over GAN.
Submitted 12 October, 2022; v1 submitted 6 May, 2022;
originally announced May 2022.
-
U-NO: U-shaped Neural Operators
Authors:
Md Ashiqur Rahman,
Zachary E. Ross,
Kamyar Azizzadenesheli
Abstract:
Neural operators generalize classical neural networks to maps between infinite-dimensional spaces, e.g., function spaces. Prior works on neural operators proposed a series of novel methods to learn such maps and demonstrated unprecedented success in learning solution operators of partial differential equations. Due to their close proximity to fully connected architectures, these models mainly suffer from high memory usage and are generally limited to shallow deep learning models. In this paper, we propose U-shaped Neural Operator (U-NO), a U-shaped memory enhanced architecture that allows for deeper neural operators. U-NOs exploit the problem structures in function predictions and demonstrate fast training, data efficiency, and robustness with respect to hyperparameter choices. We study the performance of U-NO on PDE benchmarks, namely, Darcy's flow law and the Navier-Stokes equations. We show that U-NO results in an average of 26% and 44% prediction improvement on Darcy's flow and turbulent Navier-Stokes equations, respectively, over the state of the art. On the Navier-Stokes 3D spatiotemporal operator learning task, we show that U-NO provides a 37% improvement over state-of-the-art methods.
Submitted 5 May, 2023; v1 submitted 23 April, 2022;
originally announced April 2022.
-
Investigating the limited performance of a deep-learning-based SPECT denoising approach: An observer-study-based characterization
Authors:
Zitong Yu,
Md Ashequr Rahman,
Abhinav K. Jha
Abstract:
Multiple objective assessment of image-quality-based studies have reported that several deep-learning-based denoising methods show limited performance on signal-detection tasks. Our goal was to investigate the reasons for this limited performance. To achieve this goal, we conducted a task-based characterization of a DL-based denoising approach for individual signal properties. We conducted this study in the context of evaluating a DL-based approach for denoising SPECT images. The training data consisted of signals of different sizes and shapes within a clustered-lumpy background, imaged with a 2D parallel-hole-collimator SPECT system. The projections were generated at normal and 20% low count level, both of which were reconstructed using an OSEM algorithm. A CNN-based denoiser was trained to process the low-count images. The performance of this CNN was characterized for five different signal sizes and four different SBR by designing each evaluation as an SKE/BKS signal-detection task. Performance on this task was evaluated using an anthropomorphic CHO. As in previous studies, we observed that the DL-based denoising method did not improve performance on signal-detection tasks. Evaluation using the idea of observer-study-based characterization demonstrated that the DL-based denoising approach did not improve performance on the signal-detection task for any of the signal types. Overall, these results provide new insights on the performance of the DL-based denoising approach as a function of signal size and contrast. More generally, the observer study-based characterization provides a mechanism to evaluate the sensitivity of the method to specific object properties and may be explored as analogous to characterizations such as modulation transfer function for linear systems. Finally, this work underscores the need for objective task-based evaluation of DL-based denoising approaches.
Submitted 3 March, 2022;
originally announced March 2022.
-
Secure Spectrum and Resource Sharing for 5G Networks using a Blockchain-based Decentralized Trusted Computing Platform
Authors:
Hisham A. Kholidy,
Mohammad A. Rahman,
Andrew Karam,
Zahid Akhtar
Abstract:
The 5G network would fuel next-gen, bandwidth-heavy technologies such as automation, IoT, and AI on the factory floor. It will improve efficiency by powering AR overlays in workflows, as well as ensure safer practices and reduce the number of defects through predictive analytics and real-time detection of damage. The Dynamic Spectrum Sharing (DSS) in 5G networks will permit 5G NR and 4G LTE to coexist and will provide cost-effective and efficient solutions that enable a smooth transition from 4G to 5G. However, this increases the attack surface in the 5G networks. To the best of our knowledge, none of the current works introduces a real-time secure spectrum-sharing mechanism for 5G networks to defend spectrum resources and applications. This paper aims to propose a Blockchain-based Decentralized Trusted Computing Platform (BTCP) to self-protect large-scale 5G spectrum resources against cyberattacks in a timely, dynamic, and accurate way. Furthermore, the platform provides a decentralized, trusted, and non-repudiating platform to enable secure spectrum sharing and data exchange between the 5G spectrum resources.
Submitted 3 January, 2022;
originally announced January 2022.
-
BLEnD: Improving NDN Performance Over Wireless Links Using Interest Bundling
Authors:
Md Ashiqur Rahman,
Teng Liang,
Beichuan Zhang
Abstract:
Named Data Networking (NDN) employs small-sized Interest packets to retrieve large-sized Data packets. Given the half-duplex nature of wireless links, Interest packets frequently contend for the channel with Data packets, leading to throughput degradation. In this work, we present a novel idea called BLEnD, an Interest-bundling technique that encodes multiple Interests into one at the sender and decodes at the receiver. The major design challenges are to reduce the number of Interest transmissions without impacting the one-Interest one-Data principle embedded everywhere in NDN architecture and implementation, and support flow/congestion control mechanisms that usually use Interest packets as signals. BLEnD achieves these by bundling/unbundling Interests at the link adaptation layer, keeping all NDN components unaware and unaffected. Over a one-hop WiFi link, BLEnD improves application throughput by 30%. It may also be used over multiple hops and be improved in a number of ways.
Submitted 28 October, 2021; v1 submitted 3 October, 2021;
originally announced October 2021.
-
BIoTA Control-Aware Attack Analytics for Building Internet of Things
Authors:
Nur Imtiazul Haque,
Mohammad Ashiqur Rahman,
Dong Chen,
Hisham Kholidy
Abstract:
Modern building control systems adopt demand control heating, ventilation, and cooling (HVAC) for increased energy efficiency. The integration of the Internet of Things (IoT) in the building control system can determine real-time demand, which has made the buildings smarter, reliable, and efficient. As occupants in a building are the main source of continuous heat and $CO_2$ generation, estimating the accurate number of people in real-time using building IoT (BIoT) system facilities is essential for optimal energy consumption and occupants' comfort. However, the incorporation of less secured IoT sensor nodes and open communication network in the building control system eventually increases the number of vulnerable points to be compromised. Exploiting these vulnerabilities, attackers can manipulate the controller with false sensor measurements and disrupt the system's consistency. The attackers with the knowledge of overall system topology and control logics can launch attacks without alarming the system. This paper proposes a building internet of things analyzer (BIoTA) framework\footnote{https://github.com/imtiazulhaque/research-implementations/tree/main/biota} that assesses the smart building HVAC control system's security using formal attack modeling. We evaluate the proposed attack analyzer's effectiveness on the commercial occupancy dataset (COD) and the KTH live-in lab dataset. To the best of our knowledge, this is the first research attempt to formally model a BIoT-based HVAC control system and perform an attack analysis.
Submitted 23 July, 2021;
originally announced July 2021.
-
CURE: Enabling RF Energy Harvesting using Cell-Free Massive MIMO UAVs Assisted by RIS
Authors:
Alvi Ataur Khalil,
Mohamed Y. Selim,
Mohammad Ashiqur Rahman
Abstract:
The ever-evolving internet of things (IoT) has led to the growth of numerous wireless sensors, communicating through the internet infrastructure. When designing a network using these sensors, one critical aspect is the longevity and self-sustainability of these devices. For extending the lifetime of these sensors, radio frequency energy harvesting (RFEH) technology has proved to be promising. In this paper, we propose CURE, a novel framework for RFEH that effectively combines the benefits of cell-free massive MIMO (CFmMIMO), unmanned aerial vehicles (UAVs), and reconfigurable intelligent surfaces (RISs) to provide seamless energy harvesting to IoT devices. We consider UAV as an access point (AP) in the CFmMIMO framework. To enhance the signal strength of the RFEH and information transfer, we leverage RISs owing to their passive reflection capability. Based on an extensive simulation, we validate our framework's performance by comparing the max-min fairness (MMF) algorithm for the amount of harvested energy.
Submitted 21 July, 2021;
originally announced July 2021.
-
A Literature Review on Blockchain-enabled Security and Operation of Cyber-Physical Systems
Authors:
Alvi Ataur Khalil,
Javier Franco,
Imtiaz Parvez,
Selcuk Uluagac,
Mohammad Ashiqur Rahman
Abstract:
Blockchain has become a key technology in a plethora of application domains owing to its decentralized public nature. The cyber-physical systems (CPS) is one of the prominent application domains that leverage blockchain for myriad operations, where the Internet of Things (IoT) is utilized for data collection. Although some of the CPS problems can be solved by simply adopting blockchain for its secure and distributed nature, others require complex considerations for overcoming blockchain-imposed limitations while maintaining the core aspect of CPS. Even though a number of studies focus on either the utilization of blockchains for different CPS applications or the blockchain-enabled security of CPS, there is no comprehensive survey including both perspectives together. To fill this gap, we present a comprehensive overview of contemporary advancement in using blockchain for enhancing different CPS operations as well as improving CPS security. To the best of our knowledge, this is the first paper that presents an in-depth review of research on blockchain-enabled CPS operation and security.
Submitted 16 July, 2021;
originally announced July 2021.
-
On the Analysis of Adaptive-Rate Applications in Data-Centric Wireless Ad-Hoc Networks
Authors:
Md Ashiqur Rahman,
Beichuan Zhang
Abstract:
Adapting applications' data rates in multi-hop wireless ad-hoc networks is inherently challenging. Packet collision, channel contention, and queue buildup contribute to packet loss but are difficult to manage in conventional TCP/IP architecture. This work explores a data-centric approach based on Named Data Networking (NDN) architecture, which is considered more suitable for wireless ad-hoc networks. We show that the default NDN transport offers better performance in linear topologies but struggles in more extensive networks due to high collision and contention caused by excessive Interests from out-of-order data retrieval and redundant data transmission from improper Interest lifetime setting as well as in-network caching. To fix these, we use round-trip hop count to limit Interest rate and Dynamic Interest Lifetime to minimize the negative effect of improper Interest lifetime. Finally, we analyze the effect of in-network caching on transport performance and which scenarios may benefit or suffer from it.
Submitted 14 July, 2021;
originally announced July 2021.
-
On Data-centric Forwarding in Mobile Ad-hoc Networks: Baseline Design and Simulation Analysis
Authors:
Md Ashiqur Rahman,
Beichuan Zhang
Abstract:
IP networking deals with end-to-end communication where the network layer routing protocols maintain the reachability from one address to another. However, challenging environments, such as mobile ad-hoc networks (MANETs), lead to frequent path failures and changes between the sender and receiver, incurring higher packet loss. The obligatory route setup and maintenance of a device-to-device stable path in MANETs incur significant data retrieval delay and transmission overhead. Such overhead exacerbates the packet loss manifold.
Named Data Networking (NDN) can avoid such delays and overhead and significantly improve the overall network performance. It does so with direct application-controlled named-data retrieval from any node in a network instead of reaching a specific IP address with protocol message exchange. However, existing works lack any explicit or systematic analysis to justify such claims. Our work analyzes the core NDN and IP architectures in a MANET at a baseline level. The extensive simulations show that NDN, when applied correctly, yields much lower data retrieval latency than IP and can lower the network transmission overhead in most cases. As a result, NDN's stateful forwarder can significantly increase the retrieval rate, offering a better trade-off at the network layer. Such performance comes from its caching, built-in multicast, and request aggregation without requiring an IP-like separate routing control plane.
Submitted 16 May, 2021;
originally announced May 2021.