Search | arXiv e-print repository

Of Mice and Pose: 2D Mouse Pose Estimation from Unlabelled Data and Synthetic Prior

Authors: Jose Sosa, Sharn Perry, Jane Alty, David Hogg

Abstract: Numerous fields, such as ecology, biology, and neuroscience, use animal recordings to track and measure animal behaviour. Over time, a significant volume of such data has been produced, but some computer vision techniques cannot explore it due to the lack of annotations. To address this, we propose an approach for estimating 2D mouse body pose from unlabelled images using a synthetically generated… ▽ More Numerous fields, such as ecology, biology, and neuroscience, use animal recordings to track and measure animal behaviour. Over time, a significant volume of such data has been produced, but some computer vision techniques cannot explore it due to the lack of annotations. To address this, we propose an approach for estimating 2D mouse body pose from unlabelled images using a synthetically generated empirical pose prior. Our proposal is based on a recent self-supervised method for estimating 2D human pose that uses single images and a set of unpaired typical 2D poses within a GAN framework. We adapt this method to the limb structure of the mouse and generate the empirical prior of 2D poses from a synthetic 3D mouse model, thereby avoiding manual annotation. In experiments on a new mouse video dataset, we evaluate the performance of the approach by comparing pose predictions to a manually obtained ground truth. We also compare predictions with those from a supervised state-of-the-art method for animal pose estimation. The latter evaluation indicates promising results despite the lack of paired training data. Finally, qualitative results using a dataset of horse images show the potential of the setting to adapt to other animal species. △ Less

Submitted 25 July, 2023; originally announced July 2023.

Comments: Accepted at the International Conference on Computer Vision Systems 2023

arXiv:2304.14937 [pdf, other]

Contactless hand tremor amplitude measurement using smartphones: development and pilot evaluation

Authors: James Bungay, Osasenaga Emokpae, Samuel D. Relton, Jane Alty, Stefan Williams, Hui Fang, David C. Wong

Abstract: Background: Physiological tremor is defined as an involuntary and rhythmic shaking. Tremor of the hand is a key symptom of multiple neurological diseases, and its frequency and amplitude differs according to both disease type and disease progression. In routine clinical practice, tremor frequency and amplitude are assessed by expert rating using a 0 to 4 integer scale. Such ratings are subjective… ▽ More Background: Physiological tremor is defined as an involuntary and rhythmic shaking. Tremor of the hand is a key symptom of multiple neurological diseases, and its frequency and amplitude differs according to both disease type and disease progression. In routine clinical practice, tremor frequency and amplitude are assessed by expert rating using a 0 to 4 integer scale. Such ratings are subjective and have poor inter-rater reliability. There is thus a clinical need for a practical and accurate method for objectively assessing hand tremor. Objective: to develop a proof of principle method to measure hand tremor amplitude from smartphone videos. Methods: We created a computer vision pipeline that automatically extracts salient points on the hand and produces a 1-D time series of movement due to tremor, in pixels. Using the smartphones' depth measurement, we convert this measure into real distance units. We assessed the accuracy of the method using 60 videos of simulated tremor of different amplitudes from two healthy adults. Videos were taken at distances of 50, 75 and 100 cm between hand and camera. The participants had skin tone II and VI on the Fitzpatrick scale. We compared our method to a gold-standard measurement from a slide rule. Bland-Altman methods agreement analysis indicated a bias of 0.04 cm and 95% limits of agreement from -1.27 to 1.20 cm. Furthermore, we qualitatively observed that the method was robust to differences in skin tone and limited occlusion, such as a band-aid affixed to the participant's hand. Clinical relevance: We have demonstrated how tremor amplitude can be measured from smartphone videos. In conjunction with tremor frequency, this approach could be used to help diagnose and monitor neurological diseases △ Less

Submitted 28 April, 2023; originally announced April 2023.

Comments: Accepted to IEEE EMBC 2023, Sydney (pre-refereed version)

arXiv:2302.08505 [pdf, other]

Rapid-Motion-Track: Markerless Tracking of Fast Human Motion with Deeper Learning

Authors: Renjie Li, Chun Yu Lao, Rebecca St. George, Katherine Lawler, Saurabh Garg, Son N. Tran, Quan Bai, Jane Alty

Abstract: Objective The coordination of human movement directly reflects function of the central nervous system. Small deficits in movement are often the first sign of an underlying neurological problem. The objective of this research is to develop a new end-to-end, deep learning-based system, Rapid-Motion-Track (RMT) that can track the fastest human movement accurately when webcams or laptop cameras are us… ▽ More Objective The coordination of human movement directly reflects function of the central nervous system. Small deficits in movement are often the first sign of an underlying neurological problem. The objective of this research is to develop a new end-to-end, deep learning-based system, Rapid-Motion-Track (RMT) that can track the fastest human movement accurately when webcams or laptop cameras are used. Materials and Methods We applied RMT to finger tapping, a well-validated test of motor control that is one of the most challenging human motions to track with computer vision due to the small keypoints of digits and the high velocities that are generated. We recorded 160 finger tapping assessments simultaneously with a standard 2D laptop camera (30 frames/sec) and a high-speed wearable sensor-based 3D motion tracking system (250 frames/sec). RMT and a range of DLC models were applied to the video data with tapping frequencies up to 8Hz to extract movement features. Results The movement features (e.g. speed, rhythm, variance) identified with the new RMT system exhibited very high concurrent validity with the gold-standard measurements (97.3\% of RMT measures were within +/-0.5Hz of the Optotrak measures), and outperformed DLC and other advanced computer vision tools (around 88.2\% of DLC measures were within +/-0.5Hz of the Optotrak measures). RMT also accurately tracked a range of other rapid human movements such as foot tapping, head turning and sit-to -stand movements. Conclusion: With the ubiquity of video technology in smart devices, the RMT method holds potential to transform access and accuracy of human movement assessment. △ Less

Submitted 18 January, 2023; originally announced February 2023.

arXiv:2207.02376 [pdf, other]

A Comprehensive Review on Deep Supervision: Theories and Applications

Authors: Renjie Li, Xinyi Wang, Guan Huang, Wenli Yang, Kaining Zhang, Xiaotong Gu, Son N. Tran, Saurabh Garg, Jane Alty, Quan Bai

Abstract: Deep supervision, or known as 'intermediate supervision' or 'auxiliary supervision', is to add supervision at hidden layers of a neural network. This technique has been increasingly applied in deep neural network learning systems for various computer vision applications recently. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishi… ▽ More Deep supervision, or known as 'intermediate supervision' or 'auxiliary supervision', is to add supervision at hidden layers of a neural network. This technique has been increasingly applied in deep neural network learning systems for various computer vision applications recently. There is a consensus that deep supervision helps improve neural network performance by alleviating the gradient vanishing problem, as one of the many strengths of deep supervision. Besides, in different computer vision applications, deep supervision can be applied in different ways. How to make the most use of deep supervision to improve network performance in different applications has not been thoroughly investigated. In this paper, we provide a comprehensive in-depth review of deep supervision in both theories and applications. We propose a new classification of different deep supervision networks, and discuss advantages and limitations of current deep supervision networks in computer vision applications. △ Less

Submitted 5 July, 2022; originally announced July 2022.

arXiv:2112.10275 [pdf, other]

Parallel Multi-Scale Networks with Deep Supervision for Hand Keypoint Detection

Authors: Renjie Li, Son Tran, Saurabh Garg, Katherine Lawler, Jane Alty, Quan Bai

Abstract: Keypoint detection plays an important role in a wide range of applications. However, predicting keypoints of small objects such as human hands is a challenging problem. Recent works fuse feature maps of deep Convolutional Neural Networks (CNNs), either via multi-level feature integration or multi-resolution aggregation. Despite achieving some success, the feature fusion approaches increase the com… ▽ More Keypoint detection plays an important role in a wide range of applications. However, predicting keypoints of small objects such as human hands is a challenging problem. Recent works fuse feature maps of deep Convolutional Neural Networks (CNNs), either via multi-level feature integration or multi-resolution aggregation. Despite achieving some success, the feature fusion approaches increase the complexity and the opacity of CNNs. To address this issue, we propose a novel CNN model named Multi-Scale Deep Supervision Network (P-MSDSNet) that learns feature maps at different scales with deep supervisions to produce attention maps for adaptive feature propagation from layers to layers. P-MSDSNet has a multi-stage architecture which makes it scalable while its deep supervision with spatial attention improves transparency to the feature learning at each stage. We show that P-MSDSNet outperforms the state-of-the-art approaches on benchmark datasets while requiring fewer number of parameters. We also show the application of P-MSDSNet to quantify finger tapping hand movements in a neuroscience study. △ Less

Submitted 19 December, 2021; originally announced December 2021.

arXiv:2110.14461 [pdf, other]

doi 10.1007/s00521-022-08090-8

Hand gesture detection in tests performed by older adults

Authors: Guan Huang, Son N. Tran, Quan Bai, Jane Alty

Abstract: Our team are developing a new online test that analyses hand movement features associated with ageing that can be completed remotely from the research centre. To obtain hand movement features, participants will be asked to perform a variety of hand gestures using their own computer cameras. However, it is challenging to collect high quality hand movement video data, especially for older participan… ▽ More Our team are developing a new online test that analyses hand movement features associated with ageing that can be completed remotely from the research centre. To obtain hand movement features, participants will be asked to perform a variety of hand gestures using their own computer cameras. However, it is challenging to collect high quality hand movement video data, especially for older participants, many of whom have no IT background. During the data collection process, one of the key steps is to detect whether the participants are following the test instructions correctly and also to detect similar gestures from different devices. Furthermore, we need this process to be automated and accurate as we expect many thousands of participants to complete the test. We have implemented a hand gesture detector to detect the gestures in the hand movement tests and our detection mAP is 0.782 which is better than the state-of-the-art. In this research, we have processed 20,000 images collected from hand movement tests and labelled 6,450 images to detect different hand gestures in the hand movement tests. This paper has the following three contributions. Firstly, we compared and analysed the performance of different network structures for hand gesture detection. Secondly, we have made many attempts to improve the accuracy of the model and have succeeded in improving the classification accuracy for similar gestures by implementing attention layers. Thirdly, we have created two datasets and included 20 percent of blurred images in the dataset to investigate how different network structures were impacted by noisy data, our experiments have also shown our network has better performance on the noisy dataset. △ Less

Submitted 28 October, 2021; v1 submitted 27 October, 2021; originally announced October 2021.

Journal ref: Neural Comput & Applic (2022)

arXiv:2104.14073 [pdf, ps, other]

Applications of Artificial Intelligence to aid detection of dementia: a narrative review on current capabilities and future directions

Authors: Renjie Li, Xinyi Wang, Katherine Lawler, Saurabh Garg, Quan Bai, Jane Alty

Abstract: With populations ageing, the number of people with dementia worldwide is expected to triple to 152 million by 2050. Seventy percent of cases are due to Alzheimer's disease (AD) pathology and there is a 10-20 year 'pre-clinical' period before significant cognitive decline occurs. We urgently need, cost effective, objective methods to detect AD, and other dementias, at an early stage. Risk factor mo… ▽ More With populations ageing, the number of people with dementia worldwide is expected to triple to 152 million by 2050. Seventy percent of cases are due to Alzheimer's disease (AD) pathology and there is a 10-20 year 'pre-clinical' period before significant cognitive decline occurs. We urgently need, cost effective, objective methods to detect AD, and other dementias, at an early stage. Risk factor modification could prevent 40% of cases and drug trials would have greater chances of success if participants are recruited at an earlier stage. Currently, detection of dementia is largely by pen and paper cognitive tests but these are time consuming and insensitive to pre-clinical phases. Specialist brain scans and body fluid biomarkers can detect the earliest stages of dementia but are too invasive or expensive for widespread use. With the advancement of technology, Artificial Intelligence (AI) shows promising results in assisting with detection of early-stage dementia. Existing AI-aided methods and potential future research directions are reviewed and discussed. △ Less

Submitted 28 April, 2021; originally announced April 2021.

Comments: 11 pages

arXiv:1311.5434 [pdf]

The Well-tempered Compiler? The Aesthetics of Program Auralization

Authors: Paul Vickers, James L Alty

Abstract: In this chapter we are concerned with external auditory representations of programs, also known as program auralization. As program auralization systems tend to use musical representations they are necessarily affected by artistic and aesthetic considerations. Therefore, it is instructive to explore program auralization in the light of aesthetic computing principles. In this chapter we are concerned with external auditory representations of programs, also known as program auralization. As program auralization systems tend to use musical representations they are necessarily affected by artistic and aesthetic considerations. Therefore, it is instructive to explore program auralization in the light of aesthetic computing principles. △ Less

Submitted 21 November, 2013; originally announced November 2013.

Comments: in Aesthetic Computing (P. A. Fishwick, ed.), ch. 17, pp. 335-354, Boston, MA: MIT Press, 2006

arXiv:1311.4349 [pdf]

The CAITLIN Auralization System: Hierarchical Leitmotif Design as a Clue to Program Comprehension

Authors: James L. Alty, Paul Vickers

Abstract: Early experiments have suggested that program auralization can convey information about program structure [8]. Languages like Pascal contain classes of construct that are similar in nature allowing hierarchical classification of their features. This taxonomy can be reflected in the design of musical signatures which are used within the CAITLIN program auralization system. Experiments using these h… ▽ More Early experiments have suggested that program auralization can convey information about program structure [8]. Languages like Pascal contain classes of construct that are similar in nature allowing hierarchical classification of their features. This taxonomy can be reflected in the design of musical signatures which are used within the CAITLIN program auralization system. Experiments using these hierarchical leitmotifs indicate whether or not their similarities can be put to good use in communicating information about program structure and state. (Note, at time of going to press experimental results could not be included. These will be presented at the conference and included later.) △ Less

Submitted 18 November, 2013; originally announced November 2013.

Comments: In ICAD '97 Fourth International Conference on Auditory Display (J. A. Ballas and E. D. Mynatt, eds.), (Palo Alto), pp. 89-96, Xerox PARC, Palo Alto, CA 94304, 1997, 7 pages, 10 figures

ACM Class: H.5.1; H.5.2; H.5.5; D.2.5; D.2.6

Showing 1–9 of 9 results for author: Alty, J