Adversarial Autoencoder and Multi-Armed Bandit for Dynamic Difficulty Adjustment in Immersive Virtual Reality for Rehabilitation: Application to Hand Movement

Sensors (Basel). 2022 Jun 14;22(12):4499. doi: 10.3390/s22124499.

Abstract

Motor rehabilitation aims to improve motor control skills and, in turn, the patient's quality of life. Regular adjustments based on the effect of therapy are necessary, but making them can be time-consuming for the clinician. This study proposes a deep learning approach to dimensionality reduction for high-dimensional hand movement data recorded with the wireless remote control of the Oculus Rift S. The resulting latent space serves both as a visualization tool and as the input to a reinforcement learning (RL) algorithm that provides a decision-making framework. The collected data consist of motions drawn with the wireless remote control in an immersive VR environment for six different motions called "Cube", "Cylinder", "Heart", "Infinity", "Sphere", and "Triangle". From these data, several artificial databases were created to simulate variations of the data. The latent space representation is created using an adversarial autoencoder (AAE) with unsupervised (UAAE) and semi-supervised (SSAAE) training. Each test point is then represented by a distance metric that serves as the reward for two classes of Multi-Armed Bandit (MAB) algorithms, namely Boltzmann exploration and Sibling Kalman filters. The results showed that AAE models can represent high-dimensional data in a two-dimensional latent space and that MAB agents can efficiently and quickly learn the distance evolution in the latent space. Sibling Kalman filter exploration outperformed Boltzmann exploration, with an average cumulative weighted probability error of 7.9 versus 19.9 using the UAAE latent space representation and 8.0 versus 20.0 using SSAAE. In conclusion, this approach provides an effective way to visualize and track current motor control capabilities relative to a target, so that VR games can reflect the patient's abilities in the context of dynamic difficulty adjustment (DDA).
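To make the bandit step more concrete, the following is a minimal sketch of Boltzmann (softmax) exploration in which the reward is derived from a latent-space distance. It is an illustration under stated assumptions, not the authors' implementation: the three arms, their distances (`ARM_DISTANCES`), the noise level, the temperature, and the helper names `boltzmann_probs`, `simulated_reward`, and `run_boltzmann_bandit` are all hypothetical, and the reward is taken to be the negative of a noisy distance to a target point. The Sibling Kalman filter variant is not shown.

```python
import math
import random

# Hypothetical mean latent-space distance to the target for each arm
# (for example, each arm could stand for one difficulty setting).
ARM_DISTANCES = [0.9, 0.5, 0.2]


def boltzmann_probs(q_values, temperature):
    """Softmax over value estimates; a higher temperature explores more."""
    m = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def simulated_reward(arm, rng):
    """Assumed reward: negative noisy distance, so closer is better."""
    return -(ARM_DISTANCES[arm] + rng.gauss(0.0, 0.05))


def run_boltzmann_bandit(reward_fn, n_arms, n_steps, temperature=0.1, seed=0):
    """Sample arms from the softmax and keep a running mean reward per arm."""
    rng = random.Random(seed)
    q = [0.0] * n_arms      # estimated value of each arm
    counts = [0] * n_arms   # number of times each arm was pulled
    for _ in range(n_steps):
        probs = boltzmann_probs(q, temperature)
        arm = rng.choices(range(n_arms), weights=probs)[0]
        reward = reward_fn(arm, rng)
        counts[arm] += 1
        q[arm] += (reward - q[arm]) / counts[arm]  # incremental mean
    return q, counts


if __name__ == "__main__":
    q, counts = run_boltzmann_bandit(simulated_reward, n_arms=3, n_steps=2000)
    print("value estimates:", q)
    print("pull counts:", counts)
```

In this toy setting the agent should end up pulling the arm with the smallest assumed distance most often. The temperature controls how sharply the selection probabilities concentrate on the arm with the best current estimate.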

Keywords: dynamic difficulty adjustment; end effector; immersive virtual reality; machine learning; multi-armed bandit; reinforcement learning.

MeSH terms

  • Hand
  • Humans
  • Movement
  • Quality of Life*
  • Upper Extremity
  • Virtual Reality*

Grants and funding

This work was financially supported by SurfClean Inc., Sagamihara, Kanagawa, Japan.