Robustness of the Coarse-grained Yoga Datasets verified in Contrastive Learning classification and Yoga Pose Estimation

Conference: CAIBDA 2022 - 2nd International Conference on Artificial Intelligence, Big Data and Algorithms
17 June 2022 - 19 June 2022 in Nanjing, China

Proceedings: CAIBDA 2022

Pages: 8; Language: English; Type: PDF

Authors:
Shi, Mingyu (University of California, Irvine, USA)
Wei, Zihao (North University of China, Taiyuan, China)

Abstract:
Yoga pose analysis is an active problem in computer vision. In recent years in particular, the impact of COVID-19 has led people to spend more time at home, where moderate indoor yoga can help keep them healthy. Motion detection developed for this purpose can be used for posture recognition to help people perform the movements correctly. However, the lack of data samples and the influence of non-standard real-time motion capture make the recognition of specific actions a persistent challenge. To address this problem, we constructed the Coarse-grained yoga dataset, which contains at least 5500 images of 5 different asanas obtained from Kaggle. We use two deep learning methodologies for robustness testing, a convolutional neural network (CNN) and supervised contrastive learning, and demonstrate that these techniques achieve high recognition rates on the proposed datasets. In CNN classification, we compare supervised contrastive learning with cross-entropy loss, and the best result, obtained with ResNet-32, reaches an accuracy of 94.357%. In addition, we use MediaPipe, a pose estimation module based on BlazePose and developed by Google, to further improve on these results. The module can adapt to a new environment by fine-tuning only a few parameters, which greatly simplifies the training system. BlazePose extracts the coordinates of the human body's landmarks, which then serve as features for various machine learning classification models. We tested the coordinate data with different machine learning classification models and achieved a maximum accuracy of 97%.
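
As an illustration of the supervised contrastive objective mentioned in the abstract, the following is a minimal sketch in plain Python. It is not the authors' implementation: it assumes the commonly used formulation in which each anchor is pulled toward the other samples sharing its label and pushed away from the rest of the batch. The function names `l2_normalize` and `supcon_loss`, the temperature value of 0.1, and the toy 2-D embeddings are all hypothetical choices made for this example.

```python
import math


def l2_normalize(vector):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive loss averaged over anchors that have a positive.

    embeddings: list of equal-length, non-zero feature vectors (assumed input)
    labels:     list of class labels, one per embedding
    temperature: softmax temperature (0.1 is an illustrative default)
    """
    z = [l2_normalize(v) for v in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        # Positives: other samples in the batch with the same label.
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled cosine similarity between the anchor and every other sample.
        logits = {
            a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
            for a in range(n)
            if a != i
        }
        # Numerically stable log-sum-exp over all non-anchor samples.
        peak = max(logits.values())
        log_denominator = peak + math.log(
            sum(math.exp(v - peak) for v in logits.values())
        )
        # Mean negative log-probability of the positives for this anchor.
        total += -sum(logits[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy batch: two tight clusters in 2-D (made-up values, not yoga data).
toy_embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supcon_loss(toy_embeddings, [0, 0, 1, 1])    # labels follow clusters
loss_mismatched = supcon_loss(toy_embeddings, [0, 1, 0, 1])  # labels cut across clusters
```

In this toy batch the loss is small when the labels agree with the geometric clusters and large when they do not, which is the behaviour such an objective is meant to reward during training.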