Search | arXiv e-print repository

Deep Learning for Plant Identification and Disease Classification from Leaf Images: Multi-prediction Approaches

Authors: Jianping Yao, Son N. Tran, Saurabh Garg, Samantha Sawyer

Abstract: Deep learning plays an important role in modern agriculture, especially in plant pathology using leaf images where convolutional neural networks (CNN) are attracting a lot of attention. While numerous reviews have explored the applications of deep learning within this research domain, there remains a notable absence of an empirical study to offer insightful comparisons due to the employment of var… ▽ More Deep learning plays an important role in modern agriculture, especially in plant pathology using leaf images where convolutional neural networks (CNN) are attracting a lot of attention. While numerous reviews have explored the applications of deep learning within this research domain, there remains a notable absence of an empirical study to offer insightful comparisons due to the employment of varied datasets in the evaluation. Furthermore, a majority of these approaches tend to address the problem as a singular prediction task, overlooking the multifaceted nature of predicting various aspects of plant species and disease types. Lastly, there is an evident need for a more profound consideration of the semantic relationships that underlie plant species and disease types. In this paper, we start our study by surveying current deep learning approaches for plant identification and disease classification. We categorise the approaches into multi-model, multi-label, multi-output, and multi-task, in which different backbone CNNs can be employed. Furthermore, based on the survey of existing approaches in plant pathology and the study of available approaches in machine learning, we propose a new model named Generalised Stacking Multi-output CNN (GSMo-CNN). To investigate the effectiveness of different backbone CNNs and learning approaches, we conduct an intensive experiment on three benchmark datasets Plant Village, Plant Leaves, and PlantDoc. The experimental results demonstrate that InceptionV3 can be a good choice for a backbone CNN as its performance is better than AlexNet, VGG16, ResNet101, EfficientNet, MobileNet, and a custom CNN developed by us. Interestingly, empirical results support the hypothesis that using a single model can be comparable or better than using two models. Finally, we show that the proposed GSMo-CNN achieves state-of-the-art performance on three benchmark datasets. △ Less

Submitted 24 October, 2023; originally announced October 2023.

Comments: Jianping and Son are joint first authors (equal contribution)

arXiv:2310.12509 [pdf, other]

doi 10.1007/s10462-023-10610-4

Machine Learning for Leaf Disease Classification: Data, Techniques and Applications

Authors: Jianping Yao, Son N. Tran, Samantha Sawyer, Saurabh Garg

Abstract: The growing demand for sustainable development brings a series of information technologies to help agriculture production. Especially, the emergence of machine learning applications, a branch of artificial intelligence, has shown multiple breakthroughs which can enhance and revolutionize plant pathology approaches. In recent years, machine learning has been adopted for leaf disease classification… ▽ More The growing demand for sustainable development brings a series of information technologies to help agriculture production. Especially, the emergence of machine learning applications, a branch of artificial intelligence, has shown multiple breakthroughs which can enhance and revolutionize plant pathology approaches. In recent years, machine learning has been adopted for leaf disease classification in both academic research and industrial applications. Therefore, it is enormously beneficial for researchers, engineers, managers, and entrepreneurs to have a comprehensive view about the recent development of machine learning technologies and applications for leaf disease detection. This study will provide a survey in different aspects of the topic including data, techniques, and applications. The paper will start with publicly available datasets. After that, we summarize common machine learning techniques, including traditional (shallow) learning, deep learning, and augmented learning. Finally, we discuss related applications. This paper would provide useful resources for future study and application of machine learning for smart agriculture in general and leaf disease classification in particular. △ Less

Submitted 19 October, 2023; originally announced October 2023.

Journal ref: Artificial Intelligence Review 2023

arXiv:2302.08505 [pdf, other]

Rapid-Motion-Track: Markerless Tracking of Fast Human Motion with Deeper Learning

Authors: Renjie Li, Chun Yu Lao, Rebecca St. George, Katherine Lawler, Saurabh Garg, Son N. Tran, Quan Bai, Jane Alty

Abstract: Objective The coordination of human movement directly reflects function of the central nervous system. Small deficits in movement are often the first sign of an underlying neurological problem. The objective of this research is to develop a new end-to-end, deep learning-based system, Rapid-Motion-Track (RMT) that can track the fastest human movement accurately when webcams or laptop cameras are us… ▽ More Objective The coordination of human movement directly reflects function of the central nervous system. Small deficits in movement are often the first sign of an underlying neurological problem. The objective of this research is to develop a new end-to-end, deep learning-based system, Rapid-Motion-Track (RMT) that can track the fastest human movement accurately when webcams or laptop cameras are used. Materials and Methods We applied RMT to finger tapping, a well-validated test of motor control that is one of the most challenging human motions to track with computer vision due to the small keypoints of digits and the high velocities that are generated. We recorded 160 finger tapping assessments simultaneously with a standard 2D laptop camera (30 frames/sec) and a high-speed wearable sensor-based 3D motion tracking system (250 frames/sec). RMT and a range of DLC models were applied to the video data with tapping frequencies up to 8Hz to extract movement features. Results The movement features (e.g. speed, rhythm, variance) identified with the new RMT system exhibited very high concurrent validity with the gold-standard measurements (97.3\% of RMT measures were within +/-0.5Hz of the Optotrak measures), and outperformed DLC and other advanced computer vision tools (around 88.2\% of DLC measures were within +/-0.5Hz of the Optotrak measures). RMT also accurately tracked a range of other rapid human movements such as foot tapping, head turning and sit-to -stand movements. Conclusion: With the ubiquity of video technology in smart devices, the RMT method holds potential to transform access and accuracy of human movement assessment. △ Less

Submitted 18 January, 2023; originally announced February 2023.

arXiv:2207.02376 [pdf, other]

A Comprehensive Review on Deep Supervision: Theories and Applications

Authors: Renjie Li, Xinyi Wang, Guan Huang, Wenli Yang, Kaining Zhang, Xiaotong Gu, Son N. Tran, Saurabh Garg, Jane Alty, Quan Bai

Abstract: Deep supervision, or known as 'intermediate supervision' or 'auxiliary supervision', is to add supervision at hidden layers of a neural network. This technique has been increasingly applied in deep neural network learning systems for various computer vision applications recently. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishi… ▽ More Deep supervision, or known as 'intermediate supervision' or 'auxiliary supervision', is to add supervision at hidden layers of a neural network. This technique has been increasingly applied in deep neural network learning systems for various computer vision applications recently. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishing problem, as one of the many strengths of deep supervision. Besides, in different computer vision applications, deep supervision can be applied in different ways. How to make the most use of deep supervision to improve network performance in different applications has not been thoroughly investigated. In this paper, we provide a comprehensive in-depth review of deep supervision in both theories and applications. We propose a new classification of different deep supervision networks, and discuss advantages and limitations of current deep supervision networks in computer vision applications. △ Less

Submitted 5 July, 2022; originally announced July 2022.

arXiv:2112.05841 [pdf, other]

Logical Boltzmann Machines

Authors: Son N. Tran, Artur d'Avila Garcez

Abstract: The idea of representing symbolic knowledge in connectionist systems has been a long-standing endeavour which has attracted much attention recently with the objective of combining machine learning and scalable sound reasoning. Early work has shown a correspondence between propositional logic and symmetrical neural networks which nevertheless did not scale well with the number of variables and whos… ▽ More The idea of representing symbolic knowledge in connectionist systems has been a long-standing endeavour which has attracted much attention recently with the objective of combining machine learning and scalable sound reasoning. Early work has shown a correspondence between propositional logic and symmetrical neural networks which nevertheless did not scale well with the number of variables and whose training regime was inefficient. In this paper, we introduce Logical Boltzmann Machines (LBM), a neurosymbolic system that can represent any propositional logic formula in strict disjunctive normal form. We prove equivalence between energy minimization in LBM and logical satisfiability thus showing that LBM is capable of sound reasoning. We evaluate reasoning empirically to show that LBM is capable of finding all satisfying assignments of a class of logical formulae by searching fewer than 0.75% of the possible (approximately 1 billion) assignments. We compare learning in LBM with a symbolic inductive logic programming system, a state-of-the-art neurosymbolic system and a purely neural network-based system, achieving better learning performance in five out of seven data sets. △ Less

Submitted 10 December, 2021; originally announced December 2021.

Comments: 15 pages, 5 figures, 2 tables

MSC Class: 68T01 ACM Class: I.2.4; I.2.6; I.2.11

arXiv:2110.14461 [pdf, other]

doi 10.1007/s00521-022-08090-8

Hand gesture detection in tests performed by older adults

Authors: Guan Huang, Son N. Tran, Quan Bai, Jane Alty

Abstract: Our team are developing a new online test that analyses hand movement features associated with ageing that can be completed remotely from the research centre. To obtain hand movement features, participants will be asked to perform a variety of hand gestures using their own computer cameras. However, it is challenging to collect high quality hand movement video data, especially for older participan… ▽ More Our team are developing a new online test that analyses hand movement features associated with ageing that can be completed remotely from the research centre. To obtain hand movement features, participants will be asked to perform a variety of hand gestures using their own computer cameras. However, it is challenging to collect high quality hand movement video data, especially for older participants, many of whom have no IT background. During the data collection process, one of the key steps is to detect whether the participants are following the test instructions correctly and also to detect similar gestures from different devices. Furthermore, we need this process to be automated and accurate as we expect many thousands of participants to complete the test. We have implemented a hand gesture detector to detect the gestures in the hand movement tests and our detection mAP is 0.782 which is better than the state-of-the-art. In this research, we have processed 20,000 images collected from hand movement tests and labelled 6,450 images to detect different hand gestures in the hand movement tests. This paper has the following three contributions. Firstly, we compared and analysed the performance of different network structures for hand gesture detection. Secondly, we have made many attempts to improve the accuracy of the model and have succeeded in improving the classification accuracy for similar gestures by implementing attention layers. Thirdly, we have created two datasets and included 20 percent of blurred images in the dataset to investigate how different network structures were impacted by noisy data, our experiments have also shown our network has better performance on the noisy dataset. △ Less

Submitted 28 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

Journal ref: Neural Comput & Applic (2022)

arXiv:2105.04356 [pdf, other]

doi 10.1049/cvi2.12028

Coconut trees detection and segmentation in aerial imagery using mask region-based convolution neural network

Authors: Muhammad Shakaib Iqbal, Hazrat Ali, Son N. Tran, Talha Iqbal

Abstract: Food resources face severe damages under extraordinary situations of catastrophes such as earthquakes, cyclones, and tsunamis. Under such scenarios, speedy assessment of food resources from agricultural land is critical as it supports aid activity in the disaster hit areas. In this article, a deep learning approach is presented for the detection and segmentation of coconut tress in aerial imagery… ▽ More Food resources face severe damages under extraordinary situations of catastrophes such as earthquakes, cyclones, and tsunamis. Under such scenarios, speedy assessment of food resources from agricultural land is critical as it supports aid activity in the disaster hit areas. In this article, a deep learning approach is presented for the detection and segmentation of coconut tress in aerial imagery provided through the AI competition organized by the World Bank in collaboration with OpenAerialMap and WeRobotics. Maked Region-based Convolutional Neural Network approach was used identification and segmentation of coconut trees. For the segmentation task, Mask R-CNN model with ResNet50 and ResNet1010 based architectures was used. Several experiments with different configuration parameters were performed and the best configuration for the detection of coconut trees with more than 90% confidence factor was reported. For the purpose of evaluation, Microsoft COCO dataset evaluation metric namely mean average precision (mAP) was used. An overall 91% mean average precision for coconut trees detection was achieved. △ Less

Submitted 10 May, 2021; originally announced May 2021.

Comments: Published in IET Computer Vision, 09 April 2021

arXiv:2004.13236 [pdf, other]

Deep Auto-Encoders with Sequential Learning for Multimodal Dimensional Emotion Recognition

Authors: Dung Nguyen, Duc Thanh Nguyen, Rui Zeng, Thanh Thi Nguyen, Son N. Tran, Thin Nguyen, Sridha Sridharan, Clinton Fookes

Abstract: Multimodal dimensional emotion recognition has drawn a great attention from the affective computing community and numerous schemes have been extensively investigated, making a significant progress in this area. However, several questions still remain unanswered for most of existing approaches including: (i) how to simultaneously learn compact yet representative features from multimodal data, (ii)… ▽ More Multimodal dimensional emotion recognition has drawn a great attention from the affective computing community and numerous schemes have been extensively investigated, making a significant progress in this area. However, several questions still remain unanswered for most of existing approaches including: (i) how to simultaneously learn compact yet representative features from multimodal data, (ii) how to effectively capture complementary features from multimodal streams, and (iii) how to perform all the tasks in an end-to-end manner. To address these challenges, in this paper, we propose a novel deep neural network architecture consisting of a two-stream auto-encoder and a long short term memory for effectively integrating visual and audio signal streams for emotion recognition. To validate the robustness of our proposed architecture, we carry out extensive experiments on the multimodal emotion in the wild dataset: RECOLA. Experimental results show that the proposed method achieves state-of-the-art recognition performance and surpasses existing schemes by a significant margin. △ Less

Submitted 27 April, 2020; originally announced April 2020.

Comments: Under Review on Transaction on Multimedia

arXiv:2003.11136 [pdf, other]

Joint Deep Cross-Domain Transfer Learning for Emotion Recognition

Authors: Dung Nguyen, Sridha Sridharan, Duc Thanh Nguyen, Simon Denman, Son N. Tran, Rui Zeng, Clinton Fookes

Abstract: Deep learning has been applied to achieve significant progress in emotion recognition. Despite such substantial progress, existing approaches are still hindered by insufficient training data, and the resulting models do not generalize well under mismatched conditions. To address this challenge, we propose a learning strategy which jointly transfers the knowledge learned from rich datasets to sourc… ▽ More Deep learning has been applied to achieve significant progress in emotion recognition. Despite such substantial progress, existing approaches are still hindered by insufficient training data, and the resulting models do not generalize well under mismatched conditions. To address this challenge, we propose a learning strategy which jointly transfers the knowledge learned from rich datasets to source-poor datasets. Our method is also able to learn cross-domain features which lead to improved recognition performance. To demonstrate the robustness of our proposed framework, we conducted experiments on three benchmark emotion datasets including eNTERFACE, SAVEE, and EMODB. Experimental results show that the proposed method surpassed state-of-the-art transfer learning schemes by a significant margin. △ Less

Submitted 24 March, 2020; originally announced March 2020.

arXiv:1905.06088 [pdf, other]

Neural-Symbolic Computing: An Effective Methodology for Principled Integration of Machine Learning and Reasoning

Authors: Artur d'Avila Garcez, Marco Gori, Luis C. Lamb, Luciano Serafini, Michael Spranger, Son N. Tran

Abstract: Current advances in Artificial Intelligence and machine learning in general, and deep learning in particular have reached unprecedented impact not only across research communities, but also over popular media channels. However, concerns about interpretability and accountability of AI have been raised by influential thinkers. In spite of the recent impact of AI, several works have identified the ne… ▽ More Current advances in Artificial Intelligence and machine learning in general, and deep learning in particular have reached unprecedented impact not only across research communities, but also over popular media channels. However, concerns about interpretability and accountability of AI have been raised by influential thinkers. In spite of the recent impact of AI, several works have identified the need for principled knowledge representation and reasoning mechanisms integrated with deep learning-based systems to provide sound and explainable models for such systems. Neural-symbolic computing aims at integrating, as foreseen by Valiant, two most fundamental cognitive abilities: the ability to learn from the environment, and the ability to reason from what has been learned. Neural-symbolic computing has been an active topic of research for many years, reconciling the advantages of robust learning in neural networks and reasoning and interpretability of symbolic representation. In this paper, we survey recent accomplishments of neural-symbolic computing as a principled methodology for integrated machine learning and reasoning. We illustrate the effectiveness of the approach by outlining the main characteristics of the methodology: principled integration of neural learning with symbolic knowledge representation and reasoning allowing for the construction of explainable AI systems. The insights provided by neural-symbolic computing shed new light on the increasingly prominent need for interpretable and accountable AI systems. △ Less

Submitted 15 May, 2019; originally announced May 2019.

arXiv:1903.10453 [pdf, other]

dpUGC: Learn Differentially Private Representation for User Generated Contents

Authors: Xuan-Son Vu, Son N. Tran, Lili Jiang

Abstract: This paper firstly proposes a simple yet efficient generalized approach to apply differential privacy to text representation (i.e., word embedding). Based on it, we propose a user-level approach to learn personalized differentially private word embedding model on user generated contents (UGC). To our best knowledge, this is the first work of learning user-level differentially private word embeddin… ▽ More This paper firstly proposes a simple yet efficient generalized approach to apply differential privacy to text representation (i.e., word embedding). Based on it, we propose a user-level approach to learn personalized differentially private word embedding model on user generated contents (UGC). To our best knowledge, this is the first work of learning user-level differentially private word embedding model from text for sharing. The proposed approaches protect the privacy of the individual from re-identification, especially provide better trade-off of privacy and data utility on UGC data for sharing. The experimental results show that the trained embedding models are applicable for the classic text analysis tasks (e.g., regression). Moreover, the proposed approaches of learning differentially private embedding models are both framework- and data- independent, which facilitates the deployment and sharing. The source code is available at https://github.com/sonvx/dpText. △ Less

Submitted 25 March, 2019; originally announced March 2019.

Journal ref: Proceedings of the 20th International Conference on Computational Linguistics and Intelligent Text Processing, La Rochelle, France, 2019

arXiv:1903.04433 [pdf, other]

ETNLP: a visual-aided systematic approach to select pre-trained embeddings for a downstream task

Authors: Xuan-Son Vu, Thanh Vu, Son N. Tran, Lili Jiang

Abstract: Given many recent advanced embedding models, selecting pre-trained word embedding (a.k.a., word representation) models best fit for a specific downstream task is non-trivial. In this paper, we propose a systematic approach, called ETNLP, for extracting, evaluating, and visualizing multiple sets of pre-trained word embeddings to determine which embeddings should be used in a downstream task. For ex… ▽ More Given many recent advanced embedding models, selecting pre-trained word embedding (a.k.a., word representation) models best fit for a specific downstream task is non-trivial. In this paper, we propose a systematic approach, called ETNLP, for extracting, evaluating, and visualizing multiple sets of pre-trained word embeddings to determine which embeddings should be used in a downstream task. For extraction, we provide a method to extract subsets of the embeddings to be used in the downstream task. For evaluation, we analyse the quality of pre-trained embeddings using an input word analogy list. Finally, we visualize the word representations in the embedding space to explore the embedded words interactively. We demonstrate the effectiveness of the proposed approach on our pre-trained word embedding models in Vietnamese to select which models are suitable for a named entity recognition (NER) task. Specifically, we create a large Vietnamese word analogy list to evaluate and select the pre-trained embedding models for the task. We then utilize the selected embeddings for the NER task and achieve the new state-of-the-art results on the task benchmark dataset. We also apply the approach to another downstream task of privacy-guaranteed embedding selection, and show that it helps users quickly select the most suitable embeddings. In addition, we create an open-source system using the proposed systematic approach to facilitate similar studies on other NLP tasks. The source code and data are available at https://github.com/vietnlp/etnlp. △ Less

Submitted 3 August, 2019; v1 submitted 11 March, 2019; originally announced March 2019.

Comments: 10 pages

Journal ref: Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP), 2019

arXiv:1806.06611 [pdf, other]

On Multi-resident Activity Recognition in Ambient Smart-Homes

Authors: Son N. Tran, Qing Zhang, Mohan Karunanithi

Abstract: Increasing attention to the research on activity monitoring in smart homes has motivated the employment of ambient intelligence to reduce the deployment cost and solve the privacy issue. Several approaches have been proposed for multi-resident activity recognition, however, there still lacks a comprehensive benchmark for future research and practical selection of models. In this paper we study dif… ▽ More Increasing attention to the research on activity monitoring in smart homes has motivated the employment of ambient intelligence to reduce the deployment cost and solve the privacy issue. Several approaches have been proposed for multi-resident activity recognition, however, there still lacks a comprehensive benchmark for future research and practical selection of models. In this paper we study different methods for multi-resident activity recognition and evaluate them on same sets of data. The experimental results show that recurrent neural network with gated recurrent units is better than other models and also considerably efficient, and that using combined activities as single labels is more effective than represent them as separate labels. △ Less

Submitted 18 June, 2018; originally announced June 2018.

arXiv:1710.02245 [pdf, other]

Linear-Time Sequence Classification using Restricted Boltzmann Machines

Authors: Son N. Tran, Srikanth Cherla, Artur Garcez, Tillman Weyde

Abstract: Classification of sequence data is the topic of interest for dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter have a capability of learning representations. Several attempts have been made to improve performance by combining these two approaches or increasing the processing capability o… ▽ More Classification of sequence data is the topic of interest for dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter have a capability of learning representations. Several attempts have been made to improve performance by combining these two approaches or increasing the processing capability of the hidden units in RNNs. This often results in complex models with a large number of learning parameters. In this paper, a compact model is proposed which offers both representation learning and temporal inference of class variables by rolling Restricted Boltzmann Machines (RBMs) and class variables over time. We address the key issue of intractability in this variant of RBMs by optimising a conditional distribution, instead of a joint distribution. Experiments reported in the paper on melody modelling and optical character recognition show that the proposed model can outperform the state-of-the-art. Also, the experimental results on optical character recognition, part-of-speech tagging and text chunking demonstrate that our model is comparable to recurrent neural networks with complex memory gates while requiring far fewer parameters. △ Less

Submitted 8 March, 2018; v1 submitted 5 October, 2017; originally announced October 2017.

arXiv:1706.01991 [pdf, other]

Unsupervised Neural-Symbolic Integration

Authors: Son N. Tran

Abstract: Symbolic has been long considered as a language of human intelligence while neural networks have advantages of robust computation and dealing with noisy data. The integration of neural-symbolic can offer better learning and reasoning while providing a means for interpretability through the representation of symbolic knowledge. Although previous works focus intensively on supervised feedforward neu… ▽ More Symbolic has been long considered as a language of human intelligence while neural networks have advantages of robust computation and dealing with noisy data. The integration of neural-symbolic can offer better learning and reasoning while providing a means for interpretability through the representation of symbolic knowledge. Although previous works focus intensively on supervised feedforward neural networks, little has been done for the unsupervised counterparts. In this paper we show how to integrate symbolic knowledge into unsupervised neural networks. We exemplify our approach with knowledge in different forms, including propositional logic for DNA promoter prediction and first-order logic for understanding family relationship. △ Less

Submitted 22 June, 2017; v1 submitted 6 June, 2017; originally announced June 2017.

arXiv:1705.10899 [pdf, other]

Propositional Knowledge Representation and Reasoning in Restricted Boltzmann Machines

Authors: Son N. Tran

Abstract: While knowledge representation and reasoning are considered the keys for human-level artificial intelligence, connectionist networks have been shown successful in a broad range of applications due to their capacity for robust learning and flexible inference under uncertainty. The idea of representing symbolic knowledge in connectionist networks has been well-received and attracted much attention f… ▽ More While knowledge representation and reasoning are considered the keys for human-level artificial intelligence, connectionist networks have been shown successful in a broad range of applications due to their capacity for robust learning and flexible inference under uncertainty. The idea of representing symbolic knowledge in connectionist networks has been well-received and attracted much attention from research community as this can establish a foundation for integration of scalable learning and sound reasoning. In previous work, there exist a number of approaches that map logical inference rules with feed-forward propagation of artificial neural networks (ANN). However, the discriminative structure of an ANN requires the separation of input/output variables which makes it difficult for general reasoning where any variables should be inferable. Other approaches address this issue by employing generative models such as symmetric connectionist networks, however, they are difficult and convoluted. In this paper we propose a novel method to represent propositional formulas in restricted Boltzmann machines which is less complex, especially in the cases of logical implications and Horn clauses. An integration system is then developed and evaluated in real datasets which shows promising results. △ Less

Submitted 29 May, 2018; v1 submitted 30 May, 2017; originally announced May 2017.

arXiv:1604.01806 [pdf, ps, other]

Generalising the Discriminative Restricted Boltzmann Machine

Authors: Srikanth Cherla, Son N Tran, Tillman Weyde, Artur d'Avila Garcez

Abstract: We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While originally the DRBM was defined assuming the {0, 1}-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This… ▽ More We present a novel theoretical result that generalises the Discriminative Restricted Boltzmann Machine (DRBM). While originally the DRBM was defined assuming the {0, 1}-Bernoulli distribution in each of its hidden units, this result makes it possible to derive cost functions for variants of the DRBM that utilise other distributions, including some that are often encountered in the literature. This is illustrated with the Binomial and {-1, +1}-Bernoulli distributions here. We evaluate these two DRBM variants and compare them with the original one on three benchmark datasets, namely the MNIST and USPS digit classification datasets, and the 20 Newsgroups document classification dataset. Results show that each of the three compared models outperforms the remaining two in one of the three datasets, thus indicating that the proposed theoretical generalisation of the DRBM may be valuable in practice. △ Less

Submitted 6 April, 2016; originally announced April 2016.

Comments: Submitted to ECML 2016 conference track

arXiv:1312.6190 [pdf, other]

Adaptive Feature Ranking for Unsupervised Transfer Learning

Authors: Son N. Tran, Artur d'Avila Garcez

Abstract: Transfer Learning is concerned with the application of knowledge gained from solving a problem to a different but related problem domain. In this paper, we propose a method and efficient algorithm for ranking and selecting representations from a Restricted Boltzmann Machine trained on a source domain to be transferred onto a target domain. Experiments carried out using the MNIST, ICDAR and TiCC im… ▽ More Transfer Learning is concerned with the application of knowledge gained from solving a problem to a different but related problem domain. In this paper, we propose a method and efficient algorithm for ranking and selecting representations from a Restricted Boltzmann Machine trained on a source domain to be transferred onto a target domain. Experiments carried out using the MNIST, ICDAR and TiCC image datasets show that the proposed adaptive feature ranking and transfer learning method offers statistically significant improvements on the training of RBMs. Our method is general in that the knowledge chosen by the ranking function does not depend on its relation to any specific target domain, and it works with unsupervised learning and knowledge-based transfer. △ Less

Submitted 28 May, 2014; v1 submitted 20 December, 2013; originally announced December 2013.

Comments: 9 pages 7 figures, new experimental results on ranking and transfer have been added, typo fixed

Showing 1–18 of 18 results for author: Tran, S N