Using Explainable AI for EEG-based Reduced Montage Neonatal Seizure Detection

Authors: Dinuka Sandun Udayantha, Kavindu Weerasinghe, Nima Wickramasinghe, Akila Abeyratne, Kithmin Wickremasinghe, Jithangi Wanigasinghe, Anjula De Silva, Chamira U. S. Edussooriya

Abstract: The neonatal period is the most vulnerable time for the development of seizures. Seizures in the immature brain lead to detrimental consequences, therefore require early diagnosis. The gold-standard for neonatal seizure detection currently relies on continuous video-EEG monitoring; which involves recording multi-channel electroencephalogram (EEG) alongside real-time video monitoring within a neonatal intensive care unit (NICU). However, video-EEG monitoring technology requires clinical expertise and is often limited to technologically advanced and resourceful settings. Cost-effective new techniques could help the medical fraternity make an accurate diagnosis and advocate treatment without delay. In this work, a novel explainable deep learning model to automate the neonatal seizure detection process with a reduced EEG montage is proposed, which employs convolutional nets, graph attention layers, and fully connected layers. Beyond its ability to detect seizures in real-time with a reduced montage, this model offers the unique advantage of real-time interpretability. By evaluating the performance on the Zenodo dataset with 10-fold cross-validation, the presented model achieves an absolute improvement of 8.31% and 42.86% in area under curve (AUC) and recall, respectively.

Submitted 14 August, 2024; v1 submitted 4 June, 2024; originally announced June 2024.

Comments: Paper is accepted to IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2024. Final Version

arXiv:2405.14323 [pdf, other]

SmartCS: Enabling the Creation of ML-Powered Computer Vision Mobile Apps for Citizen Science Applications without Coding

Authors: Fahim Hasan Khan, Akila de Silva, Gregory Dusek, James Davis, Alex Pang

Abstract: It is undeniable that citizen science contributes to the advancement of various fields of study. There are now software tools that facilitate the development of citizen science apps. However, apps developed with these tools rely on individual human skills to correctly collect useful data. Machine learning (ML)-aided apps provide on-field guidance to citizen scientists on data collection tasks. However, these apps rely on server-side ML support, and therefore need a reliable internet connection. Furthermore, the development of citizen science apps with ML support requires a significant investment of time and money. For some projects, this barrier may preclude the use of citizen science effectively. We present a platform that democratizes citizen science by making it accessible to a much broader audience of both researchers and participants. The SmartCS platform allows one to create citizen science apps with ML support quickly and without coding skills. Apps developed using SmartCS have client-side ML support, making them usable in the field, even when there is no internet connection. The client-side ML helps educate users to better recognize the subjects, thereby enabling high-quality data collection. We present several citizen science apps created using SmartCS, some of which were conceived and created by high school students.

Submitted 23 May, 2024; originally announced May 2024.

arXiv:2404.01352 [pdf, other]

VortexViz: Finding Vortex Boundaries by Learning from Particle Trajectories

Authors: Akila de Silva, Nicholas Tee, Omkar Ghanekar, Fahim Hasan Khan, Gregory Dusek, James Davis, Alex Pang

Abstract: Vortices are studied in various scientific disciplines, offering insights into fluid flow behavior. Visualizing the boundary of vortices is crucial for understanding flow phenomena and detecting flow irregularities. This paper addresses the challenge of accurately extracting vortex boundaries using deep learning techniques. While existing methods primarily train on velocity components, we propose a novel approach incorporating particle trajectories (streamlines or pathlines) into the learning process. By leveraging the regional/local characteristics of the flow field captured by streamlines or pathlines, our methodology aims to enhance the accuracy of vortex boundary extraction.

Submitted 1 April, 2024; originally announced April 2024.

Comments: Under review

arXiv:2403.03293 [pdf, other]

AI Insights: A Case Study on Utilizing ChatGPT Intelligence for Research Paper Analysis

Authors: Anjalee De Silva, Janaka L. Wijekoon, Rashini Liyanarachchi, Rrubaa Panchendrarajan, Weranga Rajapaksha

Abstract: This paper discusses the effectiveness of leveraging Chatbot: Generative Pre-trained Transformer (ChatGPT) versions 3.5 and 4 for analyzing research papers for effective writing of scientific literature surveys. The study selected the Application of Artificial Intelligence in Breast Cancer Treatment as the research topic. Research papers related to this topic were collected from three major publication databases Google Scholar, Pubmed, and Scopus. ChatGPT models were used to identify the category, scope, and relevant information from the research papers for automatic identification of relevant papers related to Breast Cancer Treatment (BCT), organization of papers according to scope, and identification of key information for survey paper writing. Evaluations performed using ground truth data annotated using subject experts reveal, that GPT-4 achieves 77.3% accuracy in identifying the research paper categories and 50% of the papers were correctly identified by GPT-4 for their scopes. Further, the results demonstrate that GPT-4 can generate reasons for its decisions with an average of 27% new words, and 67% of the reasons given by the model were completely agreeable to the subject experts.

Submitted 5 March, 2024; originally announced March 2024.

arXiv:2309.01288 [pdf, other]

How Crowd Worker Factors Influence Subjective Annotations: A Study of Tagging Misogynistic Hate Speech in Tweets

Authors: Danula Hettiachchi, Indigo Holcombe-James, Stephanie Livingstone, Anjalee de Silva, Matthew Lease, Flora D. Salim, Mark Sanderson

Abstract: Crowdsourced annotation is vital to both collecting labelled data to train and test automated content moderation systems and to support human-in-the-loop review of system decisions. However, annotation tasks such as judging hate speech are subjective and thus highly sensitive to biases stemming from annotator beliefs, characteristics and demographics. We conduct two crowdsourcing studies on Mechanical Turk to examine annotator bias in labelling sexist and misogynistic hate speech. Results from 109 annotators show that annotator political inclination, moral integrity, personality traits, and sexist attitudes significantly impact annotation accuracy and the tendency to tag content as hate speech. In addition, semi-structured interviews with nine crowd workers provide further insights regarding the influence of subjectivity on annotations. In exploring how workers interpret a task - shaped by complex negotiations between platform structures, task instructions, subjective motivations, and external contextual factors - we see annotations not only impacted by worker factors but also simultaneously shaped by the structures under which they labour.

Submitted 3 September, 2023; originally announced September 2023.

Comments: Accepted to the 11th AAAI Conference on Human Computation and Crowdsourcing (HCOMP 2023)

arXiv:2303.17829 [pdf]

Evaluation of Noise Reduction Methods for Sentence Recognition by Sinhala Speaking Listeners

Authors: Malitha Gunawardhana, Chathuki Navanjana, Dinithi Fernando, Nipuna Upeksha, Anjula De Silva

Abstract: Noise reduction is a crucial aspect of hearing aids, which researchers have been striving to address over the years. However, most existing noise reduction algorithms have primarily been evaluated using English. Considering the linguistic differences between English and Sinhala languages, including variation in syllable structures and vowel duration, it is very important to assess the performance of noise reduction tailored to the Sinhala language. This paper presents a comprehensive analysis between wavelet transformation and adaptive filters for noise reduction in Sinhala languages. We investigate the performance of ten wavelet families with soft and hard thresholding methods against adaptive filters with Normalized Least Mean Square, Least Mean Square Average Normalized Least Mean Square, Recursive Least Square, and Adaptive Filtering Averaging optimization algorithms along with cepstral and energy-based voice activity detection algorithms. The performance evaluation is done using objective metrics; Signal to Noise Ratio (SNR) and Perceptual Evaluation of Speech Quality (PESQ) and a subjective metric; Mean Opinion Score (MOS). A newly recorded Sinhala language audio dataset and the NOIZEUS database by the University of Texas, Dallas were used for the evaluation. Our code is available at https://github.com/ChathukiKet/Evaluation-of-Noise-Reduction-Methods

Submitted 27 June, 2023; v1 submitted 31 March, 2023; originally announced March 2023.

arXiv:2302.14186 [pdf, other]

Approximately optimal domain adaptation with Fisher's Linear Discriminant

Authors: Hayden S. Helm, Ashwin De Silva, Joshua T. Vogelstein, Carey E. Priebe, Weiwei Yang

Abstract: We propose a class of models based on Fisher's Linear Discriminant (FLD) in the context of domain adaptation. The class is the convex combination of two hypotheses: i) an average hypothesis representing previously seen source tasks and ii) a hypothesis trained on a new target task. For a particular generative setting we derive the optimal convex combination of the two models under 0-1 loss, propose a computable approximation, and study the effect of various parameter settings on the relative risks between the optimal hypothesis, hypothesis i), and hypothesis ii). We demonstrate the effectiveness of the proposed optimal classifier in the context of EEG- and ECG-based classification settings and argue that the optimal classifier can be computed without access to direct information from any of the individual source tasks. We conclude by discussing further applications, limitations, and possible future directions.

Submitted 1 March, 2024; v1 submitted 27 February, 2023; originally announced February 2023.

arXiv:2302.12983 [pdf, other]

doi 10.1109/TVCG.2023.3243834

RipViz: Finding Rip Currents by Learning Pathline Behavior

Authors: Akila de Silva, Mona Zhao, Donald Stewart, Fahim Hasan Khan, Gregory Dusek, James Davis, Alex Pang

Abstract: We present a hybrid machine learning and flow analysis feature detection method, RipViz, to extract rip currents from stationary videos. Rip currents are dangerous strong currents that can drag beachgoers out to sea. Most people are either unaware of them or do not know what they look like. In some instances, even trained personnel such as lifeguards have difficulty identifying them. RipViz produces a simple, easy to understand visualization of rip location overlaid on the source video. With RipViz, we first obtain an unsteady 2D vector field from the stationary video using optical flow. Movement at each pixel is analyzed over time. At each seed point, sequences of short pathlines, rather a single long pathline, are traced across the frames of the video to better capture the quasi-periodic flow behavior of wave activity. Because of the motion on the beach, the surf zone, and the surrounding areas, these pathlines may still appear very cluttered and incomprehensible. Furthermore, lay audiences are not familiar with pathlines and may not know how to interpret them. To address this, we treat rip currents as a flow anomaly in an otherwise normal flow. To learn about the normal flow behavior, we train an LSTM autoencoder with pathline sequences from normal ocean, foreground, and background movements. During test time, we use the trained LSTM autoencoder to detect anomalous pathlines (i.e., those in the rip zone). The origination points of such anomalous pathlines, over the course of the video, are then presented as points within the rip zone. RipViz is fully automated and does not require user input. Feedback from domain expert suggests that RipViz has the potential for wider use.

Submitted 24 February, 2023; originally announced February 2023.

Comments: This is the author's version of the article published in IEEE Transactions on Visualization and Computer Graphics, 2023

arXiv:2212.05411 [pdf, other]

doi 10.1145/3462204.3481743

Authoring Platform for Mobile Citizen Science Apps with Client-side ML

Authors: Fahim Hasan Khan, Akila de Silva, Gregory Dusek, James Davis, Alex Pang

Abstract: Data collection is an integral part of any citizen science project. Given the wide variety of projects, some level of expertise or, alternatively, some guidance for novice participants can greatly improve the quality of the collected data. A significant portion of citizen science projects depends on visual data, where photos or videos of different subjects are needed. Often these visual data are collected from all over the world, including remote locations. In this article, we introduce an authoring platform for easily creating mobile apps for citizen science projects that are empowered with client-side machine learning (ML) guidance. The apps created with our platform can help participants recognize the correct data and increase the efficiency of the data collection process. We demonstrate the application of our proposed platform with two use cases: a rip current detection app for a planned pilot study and a detection app for biodiversity-related projects.

Submitted 11 December, 2022; originally announced December 2022.

arXiv:2211.02638 [pdf, other]

A Knowledge Distillation Framework For Enhancing Ear-EEG Based Sleep Staging With Scalp-EEG Data

Authors: Mithunjha Anandakumar, Jathurshan Pradeepkumar, Simon L. Kappel, Chamira U. S. Edussooriya, Anjula C. De Silva

Abstract: Sleep plays a crucial role in the well-being of human lives. Traditional sleep studies using Polysomnography are associated with discomfort and often lower sleep quality caused by the acquisition setup. Previous works have focused on developing less obtrusive methods to conduct high-quality sleep studies, and ear-EEG is among popular alternatives. However, the performance of sleep staging based on ear-EEG is still inferior to scalp-EEG based sleep staging. In order to address the performance gap between scalp-EEG and ear-EEG based sleep staging, we propose a cross-modal knowledge distillation strategy, which is a domain adaptation approach. Our experiments and analysis validate the effectiveness of the proposed approach with existing architectures, where it enhances the accuracy of the ear-EEG based sleep staging by 3.46% and Cohen's kappa coefficient by a margin of 0.038.

Submitted 26 October, 2022; originally announced November 2022.

Comments: Code available at: https://github.com/Mithunjha/EarEEG_KnowledgeDistillation

arXiv:2208.10967 [pdf, other]

The Value of Out-of-Distribution Data

Authors: Ashwin De Silva, Rahul Ramesh, Carey E. Priebe, Pratik Chaudhari, Joshua T. Vogelstein

Abstract: We expect the generalization error to improve with more samples from a similar task, and to deteriorate with more samples from an out-of-distribution (OOD) task. In this work, we show a counter-intuitive phenomenon: the generalization error of a task can be a non-monotonic function of the number of OOD samples. As the number of OOD samples increases, the generalization error on the target task improves before deteriorating beyond a threshold. In other words, there is value in training on small amounts of OOD data. We use Fisher's Linear Discriminant on synthetic datasets and deep networks on computer vision benchmarks such as MNIST, CIFAR-10, CINIC-10, PACS and DomainNet to demonstrate and analyze this phenomenon. In the idealistic setting where we know which samples are OOD, we show that these non-monotonic trends can be exploited using an appropriately weighted objective of the target and OOD empirical risk. While its practical utility is limited, this does suggest that if we can detect OOD samples, then there may be ways to benefit from them. When we do not know which samples are OOD, we show how a number of go-to strategies such as data-augmentation, hyper-parameter optimization, and pre-training are not enough to ensure that the target generalization error does not deteriorate with the number of OOD samples in the dataset.

Submitted 13 July, 2023; v1 submitted 23 August, 2022; originally announced August 2022.

Comments: Previous versions of this work have been presented at the Out-of-Distribution Generalization in Computer Vision (OOD-CV) Workshop (ECCV 2022) and the Workshop on Distribution Shifts (NeurIPS 2022)

Journal ref: Proceedings of the 40th International Conference on Machine Learning, PMLR 202:7366-7389, 2023

arXiv:2208.06991 [pdf, other]

Towards Interpretable Sleep Stage Classification Using Cross-Modal Transformers

Authors: Jathurshan Pradeepkumar, Mithunjha Anandakumar, Vinith Kugathasan, Dhinesh Suntharalingham, Simon L. Kappel, Anjula C. De Silva, Chamira U. S. Edussooriya

Abstract: Accurate sleep stage classification is significant for sleep health assessment. In recent years, several machine-learning based sleep staging algorithms have been developed, and in particular, deep-learning based algorithms have achieved performance on par with human annotation. Despite improved performance, a limitation of most deep-learning based algorithms is their black-box behavior, which have limited their use in clinical settings. Here, we propose a cross-modal transformer, which is a transformer-based method for sleep stage classification. The proposed cross-modal transformer consists of a novel cross-modal transformer encoder architecture along with a multi-scale one-dimensional convolutional neural network for automatic representation learning. Our method outperforms the state-of-the-art methods and eliminates the black-box behavior of deep-learning models by utilizing the interpretability aspect of the attention modules. Furthermore, our method provides considerable reductions in the number of parameters and training time compared to the state-of-the-art methods. Our code is available at https://github.com/Jathurshan0330/Cross-Modal-Transformer. A demo of our work can be found at https://bit.ly/Cross_modal_transformer_demo.

Submitted 24 November, 2023; v1 submitted 14 August, 2022; originally announced August 2022.

Comments: 11 pages, 7 figures, 6 tables

arXiv:2201.13001 [pdf, other]

Deep Discriminative to Kernel Density Graph for In- and Out-of-distribution Calibrated Inference

Authors: Jayanta Dey, Haoyin Xu, Will LeVine, Ashwin De Silva, Tyler M. Tomita, Ali Geisa, Tiffany Chu, Jacob Desman, Joshua T. Vogelstein

Abstract: Deep discriminative approaches like random forests and deep neural networks have recently found applications in many important real-world scenarios. However, deploying these learning algorithms in safety-critical applications raises concerns, particularly when it comes to ensuring confidence calibration for both in-distribution and out-of-distribution data points. Many popular methods for in-distribution (ID) calibration, such as isotonic and Platt's sigmoidal regression, exhibit excellent ID calibration performance. However, these methods are not calibrated for the entire feature space, leading to overconfidence in the case of out-of-distribution (OOD) samples. On the other end of the spectrum, existing out-of-distribution (OOD) calibration methods generally exhibit poor in-distribution (ID) calibration. In this paper, we address ID and OOD calibration problems jointly. We leveraged the fact that deep models, including both random forests and deep-nets, learn internal representations which are unions of polytopes with affine activation functions to conceptualize them both as partitioning rules of the feature space. We replace the affine function in each polytope populated by the training data with a Gaussian kernel. Our experiments on both tabular and vision benchmarks show that the proposed approaches obtain well-calibrated posteriors while mostly preserving or improving the classification accuracy of the original algorithm for ID region, and extrapolate beyond the training data to handle OOD inputs appropriately.

Submitted 7 June, 2024; v1 submitted 31 January, 2022; originally announced January 2022.

arXiv:2201.07372 [pdf, other]

Prospective Learning: Principled Extrapolation to the Future

Authors: Ashwin De Silva, Rahul Ramesh, Lyle Ungar, Marshall Hussain Shuler, Noah J. Cowan, Michael Platt, Chen Li, Leyla Isik, Seung-Eon Roh, Adam Charles, Archana Venkataraman, Brian Caffo, Javier J. How, Justus M Kebschull, John W. Krakauer, Maxim Bichuch, Kaleab Alemayehu Kinfu, Eva Yezerets, Dinesh Jayaraman, Jong M. Shin, Soledad Villar, Ian Phillips, Carey E. Priebe, Thomas Hartung, Michael I. Miller, et al. (18 additional authors not shown)

Abstract: Learning is a process which can update decision rules, based on past experience, such that future performance improves. Traditionally, machine learning is often evaluated under the assumption that the future will be identical to the past in distribution or change adversarially. But these assumptions can be either too optimistic or pessimistic for many problems in the real world. Real world scenarios evolve over multiple spatiotemporal scales with partially predictable dynamics. Here we reformulate the learning problem to one that centers around this idea of dynamic futures that are partially learnable. We conjecture that certain sequences of tasks are not retrospectively learnable (in which the data distribution is fixed), but are prospectively learnable (in which distributions may be dynamic), suggesting that prospective learning is more difficult in kind than retrospective learning. We argue that prospective learning more accurately characterizes many real world problems that (1) currently stymie existing artificial intelligence solutions and/or (2) lack adequate explanations for how natural intelligences solve them. Thus, studying prospective learning will lead to deeper insights and solutions to currently vexing challenges in both natural and artificial intelligences.

Submitted 13 July, 2023; v1 submitted 18 January, 2022; originally announced January 2022.

Comments: Accepted at the 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023

arXiv:2110.03578 [pdf, other]

Towards Accurate Cross-Domain In-Bed Human Pose Estimation

Authors: Mohamed Afham, Udith Haputhanthri, Jathurshan Pradeepkumar, Mithunjha Anandakumar, Ashwin De Silva, Chamira Edussooriya

Abstract: Human behavioral monitoring during sleep is essential for various medical applications. Majority of the contactless human pose estimation algorithms are based on RGB modality, causing ineffectiveness in in-bed pose estimation due to occlusions by blankets and varying illumination conditions. Long-wavelength infrared (LWIR) modality based pose estimation algorithms overcome the aforementioned challenges; however, ground truth pose generations by a human annotator under such conditions are not feasible. A feasible solution to address this issue is to transfer the knowledge learned from images with pose labels and no occlusions, and adapt it towards real world conditions (occlusions due to blankets). In this paper, we propose a novel learning strategy comprises of two-fold data augmentation to reduce the cross-domain discrepancy and knowledge distillation to learn the distribution of unlabeled images in real world conditions. Our experiments and analysis show the effectiveness of our approach over multiple standard human pose estimation baselines.

Submitted 7 October, 2021; originally announced October 2021.

Comments: Code is available at https://github.com/MohamedAfham/CD_HPE

arXiv:2105.09489 [pdf, other]

Social Behaviour Understanding using Deep Neural Networks: Development of Social Intelligence Systems

Authors: Ethan Lim Ding Feng, Zhi-Wei Neo, Aaron William De Silva, Kellie Sim, Hong-Ray Tan, Thi-Thanh Nguyen, Karen Wei Ling Koh, Wenru Wang, Hoang D. Nguyen

Abstract: With the rapid development in artificial intelligence, social computing has evolved beyond social informatics toward the birth of social intelligence systems. This paper, therefore, takes initiatives to propose a social behaviour understanding framework with the use of deep neural networks for social and behavioural analysis. The integration of information fusion, person and object detection, social signal understanding, behaviour understanding, and context understanding plays a harmonious role to elicit social behaviours. Three systems, including depression detection, activity recognition and cognitive impairment screening, are developed to evidently demonstrate the importance of social intelligence. The study considerably contributes to the cumulative development of social computing and health informatics. It also provides a number of implications for academic bodies, healthcare practitioners, and developers of socially intelligent agents.

Submitted 19 May, 2021; originally announced May 2021.

arXiv:2102.02902 [pdf, other]

Automated Rip Current Detection with Region based Convolutional Neural Networks

Authors: Akila de Silva, Issei Mori, Gregory Dusek, James Davis, Alex Pang

Abstract: This paper presents a machine learning approach for the automatic identification of rip currents with breaking waves. Rip currents are dangerous fast moving currents of water that result in many deaths by sweeping people out to sea. Most people do not know how to recognize rip currents in order to avoid them. Furthermore, efforts to forecast rip currents are hindered by lack of observations to help train and validate hazard models. The presence of web cams and smart phones has made video and still imagery of the coast ubiquitous and provides a potential source of rip current observations. These same devices could aid public awareness of the presence of rip currents. What is lacking is a method to detect the presence or absence of rip currents from coastal imagery. This paper provides expert labeled training and test data sets for rip currents. We use Faster-RCNN and a custom temporal aggregation stage to make detections from still images or videos with higher measured accuracy than both humans and other methods of rip current detection previously reported in the literature.

Submitted 4 February, 2021; originally announced February 2021.

arXiv:2102.01728 [pdf, other]

A Novel Transfer Learning-Based Approach for Screening Pre-existing Heart Diseases Using Synchronized ECG Signals and Heart Sounds

Authors: Ramith Hettiarachchi, Udith Haputhanthri, Kithmini Herath, Hasindu Kariyawasam, Shehan Munasinghe, Kithmin Wickramasinghe, Duminda Samarasinghe, Anjula De Silva, Chamira U. S. Edussooriya

Abstract: Diagnosing pre-existing heart diseases early in life is important as it helps prevent complications such as pulmonary hypertension, heart rhythm problems, blood clots, heart failure and sudden cardiac arrest. To identify such diseases, phonocardiogram (PCG) and electrocardiogram (ECG) waveforms convey important information. Therefore, effectively using these two modalities of data has the potential to improve the disease screening process. We evaluate this hypothesis on a subset of the PhysioNet Challenge 2016 Dataset which contains simultaneously acquired PCG and ECG recordings. Our novel Dual-Convolutional Neural Network based approach uses transfer learning to tackle the problem of having limited amounts of simultaneous PCG and ECG data that is publicly available, while having the potential to adapt to larger datasets. In addition, we introduce two main evaluation frameworks named record-wise and sample-wise evaluation which lead to a rich performance evaluation for the transfer learning approach. Comparisons with methods which used single or dual modality data show that our method can lead to better performance. Furthermore, our results show that individually collected ECG or PCG waveforms are able to provide transferable features which could effectively help to make use of a limited number of synchronized PCG and ECG waveforms and still achieve significant classification performance.

Submitted 14 February, 2021; v1 submitted 2 February, 2021; originally announced February 2021.

Comments: Paper accepted to IEEE International Symposium on Circuits and Systems (ISCAS) 2021

arXiv:2010.13268 [pdf, ps, other]

A Joint Convolutional and Spatial Quad-Directional LSTM Network for Phase Unwrapping

Authors: Malsha V. Perera, Ashwin De Silva

Abstract: Phase unwrapping is a classical ill-posed problem which aims to recover the true phase from wrapped phase. In this paper, we introduce a novel Convolutional Neural Network (CNN) that incorporates a Spatial Quad-Directional Long Short Term Memory (SQD-LSTM) for phase unwrapping, by formulating it as a regression problem. Incorporating SQD-LSTM can circumvent the typical CNNs' inherent difficulty of learning global spatial dependencies which are vital when recovering the true phase. Furthermore, we employ a problem specific composite loss function to train this network. The proposed network is found to perform better than the existing methods under severe noise conditions (Normalized Root Mean Square Error of 1.3% at SNR = 0 dB) while requiring significantly less computational time (0.054 s). The network also does not require a large scale dataset during training, thus making it ideal for applications with limited data that require fast and accurate phase unwrapping.

Submitted 25 October, 2020; originally announced October 2020.

Comments: Under Review

arXiv:2010.06584 [pdf, other]

doi 10.1109/MASS50613.2020.00046

Jointly Optimizing Sensing Pipelines for Multimodal Mixed Reality Interaction

Authors: Darshana Rathnayake, Ashen de Silva, Dasun Puwakdandawa, Lakmal Meegahapola, Archan Misra, Indika Perera

Abstract: Natural human interactions for Mixed Reality Applications are overwhelmingly multimodal: humans communicate intent and instructions via a combination of visual, aural and gestural cues. However, supporting low-latency and accurate comprehension of such multimodal instructions (MMI), on resource-constrained wearable devices, remains an open challenge, especially as the state-of-the-art comprehension techniques for each individual modality increasingly utilize complex Deep Neural Network models. We demonstrate the possibility of overcoming the core limitation of the latency-vs.-accuracy tradeoff by exploiting cross-modal dependencies, i.e., by compensating for the inferior performance of one model with the increased accuracy of a more complex model of a different modality. We present a sensor fusion architecture that performs MMI comprehension in a quasi-synchronous fashion, by fusing visual, speech and gestural input. The architecture is reconfigurable and supports dynamic modification of the complexity of the data processing pipeline for each individual modality in response to contextual changes. Using a representative "classroom" context and a set of four common interaction primitives, we then demonstrate how the choices between low and high complexity models for each individual modality are coupled. In particular, we show that (a) a judicious combination of low and high complexity models across modalities can offer a dramatic 3-fold decrease in comprehension latency together with an increase of 10-15% in accuracy, and (b) the right collective choice of models is context dependent, with the performance of some model combinations being significantly more sensitive to changes in scene context or choice of interaction.

Submitted 18 December, 2020; v1 submitted 13 October, 2020; originally announced October 2020.

Comments: 17th IEEE International Conference on Mobile Ad-Hoc and Sensor Systems (MASS) - 2020

arXiv:2009.02575 [pdf, ps, other]

doi 10.1109/SMC42975.2020.9283285

Low-cost Active Dry-Contact Surface EMG Sensor for Bionic Arms

Authors: Asma M. Naim, Kithmin Wickramasinghe, Ashwin De Silva, Malsha V. Perera, Thilina Dulantha Lalitharatne, Simon L. Kappel

Abstract: Surface electromyography (sEMG) is a popular bio-signal used for controlling prostheses and finger gesture recognition mechanisms. Myoelectric prostheses are costly, and most commercially available sEMG acquisition systems are not suitable for real-time gesture recognition. In this paper, a method of acquiring sEMG signals using novel low-cost, active, dry-contact, flexible sensors has been proposed. Since the active sEMG sensor was developed to be used along with a bionic arm, the sensor was tested for its ability to acquire sEMG signals that could be used for real-time classification of five selected gestures. In a study of 4 subjects, the average classification accuracy for real-time gesture classification using the active sEMG sensor system was 85%. The common-mode rejection ratio of the sensor was measured to be 59 dB, and thus the sensor's performance was not substantially limited by its active circuitry. The proposed sensors can be interfaced with a variety of amplifiers to perform fully wearable sEMG acquisition. This satisfies the need for a low-cost sEMG acquisition system for prostheses.

Submitted 9 September, 2020; v1 submitted 5 September, 2020; originally announced September 2020.

Comments: Paper accepted to IEEE International Conference on Systems, Man, and Cybernetics (SMC) 2020

arXiv:1901.03329 [pdf, other]

BrailleBand: Blind Support Haptic Wearable Band for Communication using Braille Language

Authors: H. P. Savindu, K. A. Iroshan, C. D. Panangala, W. L. D. W. P. Perera, A. C De Silva

Abstract: Visually impaired people are neglected from many modern communication and interaction procedures. Assistive technologies such as text-to-speech and braille displays are the most commonly used means of connecting such visually impaired people with mobile phones and other smart devices. Both these solutions face usability issues; thus, this study focused on developing a user friendly wearable solution called the "BrailleBand" with haptic technology while preserving affordability. The "BrailleBand" enables passive reading using the Braille language. Connectivity between the BrailleBand and the smart device (phone) is established using the Bluetooth protocol. It consists of six nodes in three bands worn on the arm to map the braille alphabet, which are actuated to give the sense of touch corresponding to the characters. Three mobile applications were developed for training the visually impaired and to integrate existing smart mobile applications such as navigation and short message service (SMS) with the BrailleBand device. The adaptability, usability and efficiency of reading were tested on a sample of blind users, which reflected progressive results. Even though the reading accuracy depends on the time duration between the characters (character gap), an average Character Transfer Rate of 0.4375 characters per second can be achieved with a character gap of 1000 ms.

Submitted 10 January, 2019; originally announced January 2019.

Comments: 6 pages, 4 figures, In proceedings of 2017 IEEE International Conference on Systems, Man, and Cybernetics (SMC), pp. 1381-1386. Banff, Canada

arXiv:1901.02868 [pdf]

doi 10.5769/J201801001

Fuzzy neural networks to create an expert system for detecting attacks by SQL Injection

Authors: Lucas Oliveira Batista, Gabriel Adriano de Silva, Vanessa Souza Araújo, Vinícius Jonathan Silva Araújo, Thiago Silva Rezende, Augusto Junio Guimarães, Paulo Vitor de Campos Souza

Abstract: Constant technological evolution characterizes the contemporary world, and every day processes that were once manual become computerized. Data are stored in cyberspace, and as a consequence, concern for the security of this environment must increase. Cyber-attacks are growing on a worldwide scale and are characterized as one of the significant challenges of the century. This article aims to propose a computational system based on intelligent hybrid models, which through fuzzy rules allows the construction of expert systems in cybernetic data attacks, focusing on the SQL Injection attack. The tests were performed with real bases of SQL Injection attacks on government computers, using fuzzy neural networks. According to the results obtained, it is feasible to construct a system based on fuzzy rules whose classification accuracy for cybernetic invasions is within the margin of the standard deviation of the state-of-the-art model for this type of problem. The model helps countries prepare to protect their data networks and information systems, as well as create opportunities for expert systems to automate the identification of attacks in cyberspace.

Submitted 9 January, 2019; originally announced January 2019.

Journal ref: The International Journal of Forensic Computer Science, Volume 13, Number 1, pages 8-21, 2018

Showing 1–23 of 23 results for author: De Silva, A