-
Design of a Rectangular Linear Microstrip Patch Antenna Array for 5G Communication
Authors:
Muhammad Asfar Saeed,
Augustine O. Nwajana
Abstract:
This paper presents the design and characterization of a rectangular microstrip patch antenna array optimized for operation within the Ku-band frequency range. The antenna array is impedance-matched to 50 Ohms and utilizes a microstrip line feeding mechanism for excitation. The design maintains compact dimensions, with the overall antenna occupying an area of 29.5 mm x 7 mm. The antenna structure is modelled on an RO3003 substrate material, featuring a dielectric constant of 3, a low loss tangent of 0.0009, and a thickness of 1.574 mm. The substrate is backed by a conducting ground plane, and the array consists of six radiating patch elements positioned on top. Evaluation of the designed antenna array reveals a resonant frequency of 18 GHz, with a -10 dB impedance bandwidth extending over 700 MHz. The antenna demonstrates a high gain of 7.51 dBi, making it well-suited for applications in 5G and future communication systems. Its compact form factor, cost-effectiveness, and broad impedance and radiation coverage further underscore its potential in these domains.
Submitted 27 May, 2024;
originally announced May 2024.
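As a rough cross-check on the patch size implied by the substrate parameters in the abstract above, the textbook transmission-line design equations for a rectangular patch can be evaluated at 18 GHz. This is only a sketch under stated assumptions: the function name `patch_dimensions` is hypothetical, the equations are the standard first-order model rather than the authors' design procedure, and the optimized dimensions in the paper may differ, since the first-order model is less accurate on a substrate this thick at this frequency.

```python
import math

C = 299_792_458.0  # speed of light in vacuum, m/s


def patch_dimensions(freq_hz, eps_r, height_m):
    """First-order width, length (metres) and effective permittivity of a patch.

    Assumption: textbook transmission-line model, not the paper's own method.
    """
    # Patch width chosen for efficient radiation.
    width = C / (2.0 * freq_hz) * math.sqrt(2.0 / (eps_r + 1.0))
    # Effective permittivity accounts for fringing fields in air.
    eps_eff = (eps_r + 1.0) / 2.0 + (eps_r - 1.0) / 2.0 / math.sqrt(
        1.0 + 12.0 * height_m / width
    )
    # Length extension caused by fringing at each radiating edge.
    ratio = width / height_m
    delta_l = (
        0.412
        * height_m
        * ((eps_eff + 0.3) * (ratio + 0.264))
        / ((eps_eff - 0.258) * (ratio + 0.8))
    )
    # Physical length is the half-wavelength minus both edge extensions.
    length = C / (2.0 * freq_hz * math.sqrt(eps_eff)) - 2.0 * delta_l
    return width, length, eps_eff


# Values taken from the abstract: 18 GHz, dielectric constant 3, 1.574 mm thick.
width, length, eps_eff = patch_dimensions(18e9, 3.0, 1.574e-3)
print(f"W = {width * 1e3:.2f} mm, L = {length * 1e3:.2f} mm, eps_eff = {eps_eff:.3f}")
```

The estimate is a starting point for a full-wave simulation, not a replacement for one.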
-
Learning under Label Noise through Few-Shot Human-in-the-Loop Refinement
Authors:
Aaqib Saeed,
Dimitris Spathis,
Jungwoo Oh,
Edward Choi,
Ali Etemad
Abstract:
Wearable technologies enable continuous monitoring of various health metrics, such as physical activity, heart rate, sleep, and stress levels. A key challenge with wearable data is obtaining quality labels. Unlike modalities like video where the videos themselves can be effectively used to label objects or events, wearable data do not contain obvious cues about the physical manifestation of the users and usually require rich metadata. As a result, label noise can become an increasingly thorny issue when labeling such data. In this paper, we propose a novel solution to address noisy label learning, entitled Few-Shot Human-in-the-Loop Refinement (FHLR). Our method initially learns a seed model using weak labels. Next, it fine-tunes the seed model using a handful of expert corrections. Finally, it achieves better generalizability and robustness by merging the seed and fine-tuned models via weighted parameter averaging. We evaluate our approach on four challenging tasks and datasets, and compare it against eight competitive baselines designed to deal with noisy labels. We show that FHLR achieves significantly better performance when learning from noisy labels and achieves state-of-the-art by a large margin, with up to 19% accuracy improvement under symmetric and asymmetric noise. Notably, we find that FHLR is particularly robust to increased label noise, unlike prior works that suffer from severe performance degradation. Our work not only achieves better generalization in high-stakes health sensing benchmarks but also sheds light on how noise affects commonly-used models.
Submitted 25 January, 2024;
originally announced January 2024.
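The final merging step described in the abstract above, weighted parameter averaging of the seed and fine-tuned models, can be sketched as follows. This is a minimal illustration, not the FHLR implementation: the function name `merge_parameters`, the dictionary-of-lists model representation, and the mixing weight `alpha` are all assumptions introduced here.

```python
def merge_parameters(seed, tuned, alpha=0.5):
    """Return (1 - alpha) * seed + alpha * tuned for every named parameter.

    `seed` and `tuned` map parameter names to flat lists of floats
    (a hypothetical stand-in for real model weights).
    """
    if seed.keys() != tuned.keys():
        raise ValueError("models must share the same parameter names")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    merged = {}
    for name in seed:
        if len(seed[name]) != len(tuned[name]):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        # Element-wise convex combination of the two parameter vectors.
        merged[name] = [
            (1.0 - alpha) * s + alpha * t for s, t in zip(seed[name], tuned[name])
        ]
    return merged


# Toy weights: alpha = 0.25 keeps the result closer to the seed model.
seed_model = {"w": [1.0, 2.0], "b": [0.0]}
tuned_model = {"w": [3.0, 4.0], "b": [1.0]}
merged_model = merge_parameters(seed_model, tuned_model, alpha=0.25)
print(merged_model)
```

With `alpha = 0` the merge returns the seed model unchanged and with `alpha = 1` it returns the fine-tuned model; how the weight is chosen in the paper is not stated in the abstract.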
-
Time Series Diffusion Method: A Denoising Diffusion Probabilistic Model for Vibration Signal Generation
Authors:
Haiming Yi,
Lei Hou,
Yuhong Jin,
Nasser A. Saeed,
Ali Kandil,
Hao Duan
Abstract:
Diffusion models have demonstrated powerful data generation capabilities in various research fields such as image generation. However, in the field of vibration signal generation, the criteria for evaluating the quality of the generated signal differ fundamentally from those used for image generation. At present, there is no research on the ability of diffusion models to generate vibration signals. In this paper, a Time Series Diffusion Method (TSDM) is proposed for vibration signal generation, leveraging the foundational principles of diffusion models. The TSDM uses an improved U-net architecture with attention blocks, ResBlocks, and TimeEmbedding to effectively segment and extract features from one-dimensional time series data. It operates based on forward diffusion and reverse denoising processes for time-series generation. Experimental validation is conducted using single-frequency datasets, multi-frequency datasets, and bearing fault datasets. The results show that TSDM can accurately generate the single-frequency and multi-frequency features in the time series and retain the basic frequency features for the diffusion generation results of the bearing fault series. It is also found that the original DDPM could not generate high-quality vibration signals, but the improved U-net in TSDM, which combines attention blocks and ResBlocks, could effectively improve the quality of vibration signal generation. Finally, TSDM is applied to the small sample fault diagnosis of three public bearing fault datasets, and the results show that the accuracy of small sample fault diagnosis of the three datasets is improved by 32.380%, 18.355% and 9.298% at most, respectively.
Submitted 30 June, 2024; v1 submitted 13 December, 2023;
originally announced December 2023.
-
Plug-and-Play Multilingual Few-shot Spoken Words Recognition
Authors:
Aaqib Saeed,
Vasileios Tsouvalas
Abstract:
As technology advances and digital devices become prevalent, seamless human-machine communication is increasingly gaining significance. The growing adoption of mobile, wearable, and other Internet of Things (IoT) devices has changed how we interact with these smart devices, making accurate spoken words recognition a crucial component for effective interaction. However, building a robust spoken words detection system that can handle novel keywords remains challenging, especially for low-resource languages with limited training data. Here, we propose PLiX, a multilingual and plug-and-play keyword spotting system that leverages few-shot learning to harness massive real-world data and enable the recognition of unseen spoken words at test-time. Our few-shot deep models are learned with millions of one-second audio clips across 20 languages, achieving state-of-the-art performance while being highly efficient. Extensive evaluations show that PLiX can generalize to novel spoken words given as few as one support example and performs well on unseen languages out of the box. We release models and inference code to serve as a foundation for future research and voice-enabled user interface development for emerging devices.
Submitted 3 May, 2023;
originally announced May 2023.
-
Active Learning of Non-semantic Speech Tasks with Pretrained Models
Authors:
Harlin Lee,
Aaqib Saeed,
Andrea L. Bertozzi
Abstract:
Pretraining neural networks with massive unlabeled datasets has become popular as it equips the deep models with a better prior to solve downstream tasks. However, this approach generally assumes that the downstream tasks have access to annotated data of sufficient size. In this work, we propose ALOE, a novel system for improving the data- and label-efficiency of non-semantic speech tasks with active learning. ALOE uses pretrained models in conjunction with active learning to label data incrementally and learn classifiers for downstream tasks, thereby mitigating the need to acquire labeled data beforehand. We demonstrate the effectiveness of ALOE on a wide range of tasks, uncertainty-based acquisition functions, and model architectures. Training a linear classifier on top of a frozen encoder with ALOE is shown to achieve performance similar to several baselines that utilize the entire labeled data.
Submitted 25 February, 2023; v1 submitted 31 October, 2022;
originally announced November 2022.
-
On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors
Authors:
Zaharah Bukhsh,
Aaqib Saeed
Abstract:
Out-of-distribution (OOD) detection is concerned with identifying data points that do not belong to the same distribution as the model's training data. For the safe deployment of predictive models in a real-world environment, it is critical to avoid making confident predictions on OOD inputs as it can lead to potentially dangerous consequences. However, OOD detection largely remains an under-explored area in the audio (and speech) domain. This is despite the fact that audio is a central modality for many tasks, such as speaker diarization, automatic speech recognition, and sound event detection. To address this, we propose to leverage the model's feature space with deep k-nearest neighbors to detect OOD samples. We show that this simple and flexible method effectively detects OOD inputs across a broad category of audio (and speech) datasets. Specifically, it improves the false positive rate (FPR@TPR95) by 17% and the AUROC score by 7% relative to prior techniques.
Submitted 25 February, 2023; v1 submitted 27 October, 2022;
originally announced October 2022.
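The idea of scoring OOD inputs with deep k-nearest neighbors in feature space, as described in the abstract above, can be sketched as follows. This is an illustrative sketch, not the authors' code: the function names, the L2 normalization of features, and the toy two-dimensional "embeddings" are assumptions introduced here, and in practice the features would come from a trained audio model.

```python
import math


def l2_normalize(vector):
    """Scale a feature vector to unit Euclidean norm (zero vectors are left as-is)."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector] if norm > 0 else list(vector)


def knn_ood_score(train_features, query, k=1):
    """Distance from the query to its k-th nearest training feature.

    Larger scores suggest the query is further from the training data.
    """
    if not 1 <= k <= len(train_features):
        raise ValueError("k must be between 1 and the number of training features")
    q = l2_normalize(query)
    distances = sorted(math.dist(q, l2_normalize(f)) for f in train_features)
    return distances[k - 1]


# Toy in-distribution features clustered around the positive x-axis.
train = [[1.0, 0.0], [0.9, 0.1], [0.8, 0.2]]
in_dist_score = knn_ood_score(train, [1.0, 0.05], k=1)
ood_score = knn_ood_score(train, [-1.0, 0.2], k=1)
print(in_dist_score < ood_score)
```

A sample would then be flagged as OOD when its score exceeds a threshold; one common choice, consistent with the FPR@TPR95 metric named in the abstract, is the value that retains 95% of held-out in-distribution samples.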
-
Automatic Sleep Scoring from Large-scale Multi-channel Pediatric EEG
Authors:
Harlin Lee,
Aaqib Saeed
Abstract:
Sleep is particularly important to the health of infants, children, and adolescents, and sleep scoring is the first step to accurate diagnosis and treatment of potentially life-threatening conditions. But pediatric sleep is severely under-researched compared to adult sleep in the context of machine learning for health, and sleep scoring algorithms developed for adults usually perform poorly on infants. Here, we present the first automated sleep scoring results on a recent large-scale pediatric sleep study dataset that was collected during standard clinical care. We develop a transformer-based model that learns to classify five sleep stages from millions of multi-channel electroencephalogram (EEG) sleep epochs with 78% overall accuracy. Further, we conduct an in-depth analysis of the model performance based on patient demographics and EEG channels. The results point to the growing need for machine learning research on pediatric sleep.
Submitted 26 October, 2022; v1 submitted 30 June, 2022;
originally announced July 2022.
-
Distilled Non-Semantic Speech Embeddings with Binary Neural Networks for Low-Resource Devices
Authors:
Harlin Lee,
Aaqib Saeed
Abstract:
This work introduces BRILLsson, a novel binary neural network-based representation learning model for a broad range of non-semantic speech tasks. We train the model with knowledge distillation from a large and real-valued TRILLsson model with only a fraction of the dataset used to train TRILLsson. The resulting BRILLsson models are only 2MB in size with a latency less than 8ms, making them suitable for deployment in low-resource devices such as wearables. We evaluate BRILLsson on eight benchmark tasks (including but not limited to spoken language identification, emotion recognition, health condition diagnosis, and keyword spotting), and demonstrate that our proposed ultra-light and low-latency models perform as well as large-scale models.
Submitted 2 December, 2023; v1 submitted 12 July, 2022;
originally announced July 2022.
-
Airplane Type Identification Based on Mask RCNN and Drone Images
Authors:
W. T Alshaibani,
Mustafa Helvaci,
Ibraheem Shayea,
Sawsan A. Saad,
Azizul Azizan,
Fitri Yakub
Abstract:
For dealing with traffic bottlenecks at airports, aircraft object detection is insufficient. Every airport generally has a variety of planes with various physical and technological requirements as well as diverse service requirements. Detecting the presence of new planes will not address all traffic congestion issues. Identifying the type of airplane, on the other hand, will entirely fix the problem because it will offer important information about the plane's technical specifications (i.e., the time it needs to be served and its appropriate place in the airport). Several studies have provided various contributions to address airport traffic jams; however, their ultimate goal was to determine the existence of airplane objects. This paper provides a practical approach to identify the type of airplane in airports depending on the results provided by the airplane detection process using a mask region-based convolutional neural network (Mask R-CNN). The key feature employed to identify the type of airplane is the surface area calculated based on the results of airplane detection. The surface area is used to assess the estimated cabin length, which is considered an additional key feature for identifying the airplane type. The length of any detected plane may be calculated by measuring the distance between the detected plane's two furthest points. The suggested approach's performance is assessed using average accuracies and a confusion matrix. The findings show that this method is dependable. This method will greatly aid in the management of airport traffic congestion.
Submitted 29 August, 2021;
originally announced August 2021.
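The length measurement mentioned in the abstract above, the distance between a detected plane's two furthest points, can be sketched with a brute-force search over outline points. This is a minimal sketch under assumptions: the function name `max_pairwise_distance` is hypothetical, the points are assumed to be pixel coordinates taken from a detection mask outline, and converting pixels to metres would need a ground-sampling distance that the abstract does not give.

```python
import math
from itertools import combinations


def max_pairwise_distance(points):
    """Largest Euclidean distance between any two points (0.0 for fewer than two)."""
    if len(points) < 2:
        return 0.0
    # Brute force over all pairs; adequate for the few hundred points of a mask outline.
    return max(math.dist(p, q) for p, q in combinations(points, 2))


# Toy outline in pixel coordinates (hypothetical values).
outline = [(0, 0), (3, 4), (1, 1), (2, 0)]
length_px = max_pairwise_distance(outline)
print(length_px)
```

For large masks, computing the convex hull first and searching only its vertices would reduce the number of pairs considerably.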
-
Federated Self-Training for Semi-Supervised Audio Recognition
Authors:
Vasileios Tsouvalas,
Aaqib Saeed,
Tanir Ozcelebi
Abstract:
Federated Learning is a distributed machine learning paradigm dealing with decentralized and personal datasets. Since data reside on devices like smartphones and virtual assistants, labeling is entrusted to the clients, or labels are extracted in an automated way. Specifically, in the case of audio data, acquiring semantic annotations can be prohibitively expensive and time-consuming. As a result, an abundance of audio data remains unlabeled and unexploited on users' devices. Most existing federated learning approaches focus on supervised learning without harnessing the unlabeled data. In this work, we study the problem of semi-supervised learning of audio models via self-training in conjunction with federated learning. We propose FedSTAR to exploit large-scale on-device unlabeled data to improve the generalization of audio recognition models. We further demonstrate that self-supervised pre-trained models can accelerate the training of on-device models, significantly reducing the number of training rounds needed for convergence. We conduct experiments on diverse public audio classification datasets and investigate the performance of our models under varying percentages of labeled and unlabeled data. Notably, we show that with as little as 3% labeled data available, FedSTAR on average can improve the recognition rate by 13.28% compared to the fully supervised federated model.
Submitted 25 February, 2022; v1 submitted 14 July, 2021;
originally announced July 2021.
-
Throughput-Fairness Tradeoffs in Mobility Platforms
Authors:
Arjun Balasingam,
Karthik Gopalakrishnan,
Radhika Mittal,
Venkat Arun,
Ahmed Saeed,
Mohammad Alizadeh,
Hamsa Balakrishnan,
Hari Balakrishnan
Abstract:
This paper studies the problem of allocating tasks from different customers to vehicles in mobility platforms, which are used for applications like food and package delivery, ridesharing, and mobile sensing. A mobility platform should allocate tasks to vehicles and schedule them in order to optimize both throughput and fairness across customers. However, existing approaches to scheduling tasks in mobility platforms ignore fairness.
We introduce Mobius, a system that uses guided optimization to achieve both high throughput and fairness across customers. Mobius supports spatiotemporally diverse and dynamic customer demands. It provides a principled method to navigate inherent tradeoffs between fairness and throughput caused by shared mobility. Our evaluation demonstrates these properties, along with the versatility and scalability of Mobius, using traces gathered from ridesharing and aerial sensing applications. Our ridesharing case study shows that Mobius can schedule more than 16,000 tasks across 40 customers and 200 vehicles in an online manner.
Submitted 25 May, 2021;
originally announced May 2021.
-
NuCLS: A scalable crowdsourcing, deep learning approach and dataset for nucleus classification, localization and segmentation
Authors:
Mohamed Amgad,
Lamees A. Atteya,
Hagar Hussein,
Kareem Hosny Mohammed,
Ehab Hafiz,
Maha A. T. Elsebaie,
Ahmed M. Alhusseiny,
Mohamed Atef AlMoslemany,
Abdelmagid M. Elmatboly,
Philip A. Pappalardo,
Rokia Adel Sakr,
Pooya Mobadersany,
Ahmad Rachid,
Anas M. Saad,
Ahmad M. Alkashash,
Inas A. Ruhban,
Anas Alrefai,
Nada M. Elgazar,
Ali Abdulkarim,
Abo-Alela Farag,
Amira Etman,
Ahmed G. Elsaeed,
Yahya Alagha,
Yomna A. Amer,
Ahmed M. Raslan
, et al. (12 additional authors not shown)
Abstract:
High-resolution mapping of cells and tissue structures provides a foundation for developing interpretable machine-learning models for computational pathology. Deep learning algorithms can provide accurate mappings given large numbers of labeled instances for training and validation. Generating an adequate volume of quality labels has emerged as a critical barrier in computational pathology given the time and effort required from pathologists. In this paper we describe an approach for engaging crowds of medical students and pathologists that was used to produce a dataset of over 220,000 annotations of cell nuclei in breast cancers. We show how suggested annotations generated by a weak algorithm can improve the accuracy of annotations generated by non-experts and can yield useful data for training segmentation algorithms without laborious manual tracing. We systematically examine interrater agreement and describe modifications to the MaskRCNN model to improve cell mapping. We also describe a technique we call Decision Tree Approximation of Learned Embeddings (DTALE) that leverages nucleus segmentations and morphologic features to improve the transparency of nucleus classification models. The annotation data produced in this study are freely available for algorithm development and benchmarking at: https://sites.google.com/view/nucls.
Submitted 17 February, 2021;
originally announced February 2021.
-
A New Paradigm for Water Level Regulation using Three Pond Model with Fuzzy Inference System for Run of River Hydropower Plant
Authors:
Ahmad Saeed,
Ebrahim Shahzad,
Laeeq Aslam,
Ijaz Mansoor Qureshi,
Adnan Umar Khan,
Muhammad Iqbal
Abstract:
The energy generation of a run-of-river hydropower plant depends on the flow of the river, and variations in the water flow make the energy production unreliable. This problem is usually solved by constructing a small pond in front of the run-of-river hydropower plant. However, changes in the water level of the conventional single-pond model result in sags, surges and unpredictable power fluctuations. This work proposes a three-pond model instead of the traditional single-pond model. The total volume of water in the three ponds equals that of the traditional single pond, but the design reduces the dependency of the run-of-river power plant on the flow of the river. Moreover, the three-pond model absorbs water surges and disturbances more efficiently. The three-pond system, modeled as a non-linear hydraulic three-tank system, is controlled using a fuzzy inference system and standard PID-based methods for smooth and efficient level regulation. The fuzzy inference system outperforms the conventional PID controller across the board in terms of regulation and disturbance handling.
Submitted 26 November, 2020;
originally announced November 2020.
-
Learning from Heterogeneous EEG Signals with Differentiable Channel Reordering
Authors:
Aaqib Saeed,
David Grangier,
Olivier Pietquin,
Neil Zeghidour
Abstract:
We propose CHARM, a method for training a single neural network across inconsistent input channels. Our work is motivated by Electroencephalography (EEG), where data collection protocols from different headsets result in varying channel ordering and number, which limits the feasibility of transferring trained systems across datasets. Our approach builds upon attention mechanisms to estimate a latent reordering matrix from each input signal and map input channels to a canonical order. CHARM is differentiable and can be composed further with architectures expecting a consistent channel ordering to build end-to-end trainable classifiers. We perform experiments on four EEG classification datasets and demonstrate the efficacy of CHARM via simulated shuffling and masking of input channels. Moreover, our method improves the transfer of pre-trained representations between datasets collected with different protocols.
Submitted 21 October, 2020;
originally announced October 2020.
-
Context Aware 3D UNet for Brain Tumor Segmentation
Authors:
Parvez Ahmad,
Saqib Qamar,
Linlin Shen,
Adnan Saeed
Abstract:
Deep convolutional neural networks (CNNs) achieve remarkable performance for medical image analysis. UNet underpins the performance of 3D CNN architectures for medical imaging tasks, including brain tumor segmentation. The skip connection in the UNet architecture concatenates features from both encoder and decoder paths to extract multi-contextual information from image data. The multi-scaled features play an essential role in brain tumor segmentation. However, the limited use of features can degrade the performance of the UNet approach for segmentation. In this paper, we propose a modified UNet architecture for brain tumor segmentation. In the proposed architecture, we use densely connected blocks in both encoder and decoder paths to extract multi-contextual information through feature reuse. In addition, residual-inception blocks (RIB) are used to extract the local and global information by merging features of different kernel sizes. We validate the proposed architecture on the multi-modal brain tumor segmentation challenge (BRATS) 2020 testing dataset. The Dice (DSC) scores of the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) are 89.12%, 84.74%, and 79.12%, respectively.
Submitted 27 November, 2020; v1 submitted 25 October, 2020;
originally announced October 2020.
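The Dice (DSC) score reported in the abstract above measures the overlap between a predicted and a reference segmentation as twice the intersection divided by the sum of the two mask sizes. The sketch below computes it on flattened binary masks; the function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made here, and challenge evaluation tools may handle the empty case differently.

```python
def dice_score(pred, target):
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two flat binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Toy masks: one voxel overlaps, each mask marks two voxels.
predicted = [1, 1, 0, 0]
reference = [1, 0, 1, 0]
print(dice_score(predicted, reference))
```

For a multi-class task such as WT, TC, and ET, the score would be computed once per region after converting the label map to the corresponding binary mask.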
-
Contrastive Learning of General-Purpose Audio Representations
Authors:
Aaqib Saeed,
David Grangier,
Neil Zeghidour
Abstract:
We introduce COLA, a self-supervised pre-training approach for learning a general-purpose representation of audio. Our approach is based on contrastive learning: it learns a representation which assigns high similarity to audio segments extracted from the same recording while assigning lower similarity to segments from different recordings. We build on top of recent advances in contrastive learning for computer vision and reinforcement learning to design a lightweight, easy-to-implement self-supervised model of audio. We pre-train embeddings on the large-scale Audioset database and transfer these representations to 9 diverse classification tasks, including speech, music, animal sounds, and acoustic scenes. We show that despite its simplicity, our method significantly outperforms previous self-supervised systems. We furthermore conduct ablation studies to identify key design choices and release a library to pre-train and fine-tune COLA models.
Submitted 21 October, 2020;
originally announced October 2020.
-
Physical Action Categorization using Signal Analysis and Machine Learning
Authors:
Asad Mansoor Khan,
Ayesha Sadiq,
Sajid Gul Khawaja,
Norah Saleh Alghamdi,
Muhammad Usman Akram,
Ali Saeed
Abstract:
The daily lives of thousands of individuals around the globe are affected by physical or mental disabilities related to limb movement. The quality of life for such individuals can be improved through assistive applications and systems. In such scenarios, mapping physical actions from movement to a computer-aided application can lead the way to a solution. Surface Electromyography (sEMG) presents a non-invasive mechanism through which we can translate physical movement into signals for classification and use in applications. In this paper, we propose a machine learning based framework for classification of 4 physical actions. The framework examines features drawn from the time domain, the frequency domain, higher-order statistics, and inter-channel statistics. Next, we conduct a comparative analysis of k-NN, SVM and ELM classifiers using the feature set. The effect of different combinations of the feature set is also recorded. Finally, the SVM and 1-NN classifiers achieve accuracies of 95.21 and 95.83, respectively, on a subset of features. Additionally, we find that dimensionality reduction using PCA leads to only a minor accuracy drop of less than 5.55% while using only 9.22% of the original feature set. These findings help algorithm designers choose the best approach given the resources available for running the algorithm.
Submitted 1 February, 2022; v1 submitted 16 August, 2020;
originally announced August 2020.
-
Where is the Fake? Patch-Wise Supervised GANs for Texture Inpainting
Authors:
Ahmed Ben Saad,
Youssef Tamaazousti,
Josselin Kherroubi,
Alexis He
Abstract:
We tackle the problem of texture inpainting, where the input images are textures with missing values along with masks that indicate the zones that should be generated. Much work has been done on image inpainting with the aim of achieving global and local consistency, but these methods still suffer from limitations when dealing with textures. In fact, the local information in the image to be completed needs to be used in order to achieve local continuity and visually realistic texture inpainting. For this, we propose a new segmentor discriminator that performs patch-wise real/fake classification and is supervised by the input masks. During training, it aims to locate the fake regions and thus backpropagates a consistent signal to the generator. We tested our approach on the publicly available DTD dataset and showed that it achieves state-of-the-art performance and handles local consistency better than existing methods.
Submitted 9 March, 2020; v1 submitted 6 November, 2019;
originally announced November 2019.
-
Aerial Images Processing for Car Detection using Convolutional Neural Networks: Comparison between Faster R-CNN and YoloV3
Authors:
Adel Ammar,
Anis Koubaa,
Mohanned Ahmed,
Abdulrahman Saad,
Bilel Benjdira
Abstract:
In this paper, we address the problem of car detection from aerial images using Convolutional Neural Networks (CNN). This problem presents additional challenges as compared to car (or any object) detection from ground images because features of vehicles from aerial images are more difficult to discern. To investigate this issue, we assess the performance of two state-of-the-art CNN algorithms, namely Faster R-CNN, which is the most popular region-based algorithm, and YOLOv3, which is known to be the fastest detection algorithm. We analyze two datasets with different characteristics to check the impact of various factors, such as UAV's altitude, camera resolution, and object size. A total of 39 training experiments were conducted to account for the effect of different hyperparameter values. The objective of this work is to conduct the most robust and exhaustive comparison between these two cutting-edge algorithms on the specific domain of aerial images. By using a variety of metrics, we show that YOLOv3 yields better performance in most configurations, except that it exhibits a lower recall and less confident detections when object sizes and scales in the testing dataset differ largely from those in the training dataset.
Submitted 22 December, 2021; v1 submitted 16 October, 2019;
originally announced October 2019.
-
Smart Palm: An IoT Framework for Red Palm Weevil Early Detection
Authors:
Anis Koubaa,
Abdulrahman Aldawood,
Bassel Saeed,
Abdullatif Hadid,
Mohanned Ahmed,
Abdulrahman Saad,
Hesham Alkhouja,
Mohamed Alkanhal
Abstract:
Smart agriculture is an evolving trend in the agriculture industry, in which sensors are embedded into plants to collect vital data and support decision making, ensuring higher crop quality and preventing pests, disease, and other possible threats. In Saudi Arabia, growing palms is the most important agricultural activity, and there is an increasing need to leverage smart agriculture technology to improve the production of dates and prevent diseases. One of the most critical diseases of palms is the red palm weevil, an insect that causes a lot of damage and can devastate large areas of palm trees. The most challenging problem is that the effect of the weevil is not visible to humans until the palm reaches an advanced state of infestation. For this reason, there is a need to use advanced technology for early detection and for preventing the infestation from spreading. In this project, we have developed an IoT-based smart palm monitoring prototype as a proof of concept that (1) allows palms to be monitored remotely using smart agriculture sensors and (2) contributes to the early detection of red palm weevil. Users can use a web/mobile application to interact with their palm farms, which helps them detect possible infestations early. We used the Elm company IoT platform to interface between the sensor layer and the user layer. In addition, we collected data using accelerometer sensors and applied signal processing and statistical techniques to analyze the collected data and determine a fingerprint of the infestation.
Submitted 21 September, 2019;
originally announced October 2019.
-
Up and Away: A Cheap UAV Cyber-Physical Testbed (Work in Progress)
Authors:
Ahmed Saeed,
Azin Neishaboori,
Amr Mohamed,
Khaled Harras
Abstract:
Cyber-Physical Systems (CPS) have the promise of presenting the next evolution in computing with potential applications that include aerospace, transportation, robotics, and various automation systems. These applications motivate advances in the different sub-fields of CPS (e.g. mobile computing and communication, control, and vision). However, deploying and testing complete CPSs is known to be a complex and expensive task. In this paper, we present the design, implementation, and evaluation of Up and Away (UnA): a testbed for Cyber-Physical Systems that use UAVs as their physical component. UnA aims at abstracting the control of physical components of the system to reduce the complexity of UAV oriented Cyber-Physical Systems experiments. In addition, UnA provides an API to allow for converting CPS simulations into physical experiments using a few simple steps. We present a case study bringing a mobile-camera-based surveillance system simulation to life using UnA.
Submitted 8 May, 2014;
originally announced May 2014.
-
Paraglide: Interactive Parameter Space Partitioning for Computer Simulations
Authors:
Steven Bergner,
Michael Sedlmair,
Sareh Nabi,
Ahmed Saad,
Torsten Möller
Abstract:
In this paper we introduce paraglide, a visualization system designed for interactive exploration of parameter spaces of multi-variate simulation models. To get the right parameter configuration, model developers frequently have to go back and forth between setting parameters and qualitatively judging the outcomes of their model. During this process, they build up a grounded understanding of the parameter effects in order to pick the right setting. Current state-of-the-art tools and practices, however, fail to provide a systematic way of exploring these parameter spaces, making informed decisions about parameter settings a tedious and workload-intensive task. Paraglide endeavors to overcome this shortcoming by assisting the sampling of the parameter space and the discovery of qualitatively different model outcomes. This results in a decomposition of the model parameter space into regions of distinct behaviour. We developed paraglide in close collaboration with experts from three different domains, who all were involved in developing new models for their domain. We first analyzed current practices of six domain experts and derived a set of design requirements, then engaged in a longitudinal user-centered design process, and finally conducted three in-depth case studies underlining the usefulness of our approach.
Submitted 24 October, 2011;
originally announced October 2011.