Convolutional and recurrent neural network for human activity recognition: Application on American sign language

PLoS One. 2020 Feb 19;15(2):e0228869. doi: 10.1371/journal.pone.0228869. eCollection 2020.

Abstract

Human activity recognition is an important and difficult topic to study because of the high variability between repetitions of a task by the same subject and between subjects. This work is motivated by the need for time-series signal classification with robust validation and test approaches. This study proposes to classify 60 signs from American Sign Language using data from the LeapMotion sensor and several conventional machine learning and deep learning models, including DeepConvLSTM, a model that combines convolutional layers with recurrent Long Short-Term Memory (LSTM) layers. A kinematic model of the right and left forearms, hands, fingers, and thumbs is proposed, together with a simple data augmentation technique to improve the generalization of the neural networks. DeepConvLSTM and the convolutional neural network achieved the highest accuracies, 91.1 (3.8)% and 89.3 (4.0)% respectively, outperforming the recurrent neural network and the multi-layer perceptron. Integrating convolutional layers in a deep learning model appears to be an appropriate solution for sign language recognition from depth-sensor data.
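To illustrate the kind of architecture the abstract describes, below is a minimal PyTorch sketch of a DeepConvLSTM-style classifier: 1-D convolutions over the time axis feed stacked LSTM layers, whose final state is mapped to the 60 sign classes. The layer counts, filter sizes, hidden dimension, and the 30-feature input are illustrative assumptions, not the paper's reported configuration.

# Minimal DeepConvLSTM-style sketch in PyTorch. Hyperparameters
# (n_filters, lstm_hidden, kernel sizes, 30 input features) are
# assumptions for illustration, not the paper's settings.
import torch
import torch.nn as nn

class DeepConvLSTM(nn.Module):
    def __init__(self, n_features: int, n_classes: int = 60,
                 n_filters: int = 64, lstm_hidden: int = 128):
        super().__init__()
        # 1-D convolutions along the time axis extract local motion
        # patterns from the kinematic feature channels.
        self.conv = nn.Sequential(
            nn.Conv1d(n_features, n_filters, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(n_filters, n_filters, kernel_size=5, padding=2),
            nn.ReLU(),
        )
        # Two stacked LSTM layers model the temporal evolution of the
        # convolutional feature maps over the whole sign.
        self.lstm = nn.LSTM(n_filters, lstm_hidden, num_layers=2,
                            batch_first=True)
        self.fc = nn.Linear(lstm_hidden, n_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, features) -> (batch, features, time) for Conv1d
        h = self.conv(x.transpose(1, 2)).transpose(1, 2)
        out, _ = self.lstm(h)          # (batch, time, lstm_hidden)
        return self.fc(out[:, -1, :])  # classify from the last time step

# Example: a batch of 8 windows, 100 frames, 30 kinematic features.
model = DeepConvLSTM(n_features=30)
logits = model(torch.randn(8, 100, 30))  # -> (8, 60) class scores

The design choice this sketch captures is the one the abstract credits for the best results: convolutional layers summarize short-range structure in the sensor signals before the recurrent layers integrate it over time, which is why the combined model outperforms a purely recurrent network on the same data.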

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Biomechanical Phenomena
  • Deep Learning
  • Gestures
  • Hand
  • Humans
  • Machine Learning
  • Male
  • Movement
  • Neural Networks, Computer*
  • Sign Language*

Grants and funding

This study was funded by the Institute of Global Innovation Research, Tokyo University of Agriculture and Technology. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.