Face Mask Detection with Vision Transformer

Conference: CAIBDA 2022 - 2nd International Conference on Artificial Intelligence, Big Data and Algorithms
17.06.2022 - 19.06.2022 in Nanjing, China

Proceedings: CAIBDA 2022

Pages: 6
Language: English
Type: PDF

Authors:
Guo, Jinpei (School of Electronics and Information, Shanghai Jiao Tong University, China)

Abstract:
The outbreak of COVID-19 in 2019 had a great impact on human production and daily life, and as of 2022 the epidemic was still spreading around the world. Wearing a face mask is an effective way to limit the spread, yet only 23.1% of people use face masks correctly, so technology that detects whether people are wearing masks is needed. This work implements detection with a vision transformer architecture on the Face Mask Detection 12K Images Dataset. Four loss functions were compared: binary cross-entropy, categorical cross-entropy, categorical hinge, and Huber. The vision transformer with the categorical cross-entropy loss performed best, reaching an accuracy of 0.9274 on the Face Mask Detection 12K Images Dataset. In this paper, we first use the vision transformer to capture global deep semantic features for the face mask detection task and apply data augmentation with the aim of further improving performance. Because data augmentation risks introducing noise, which may reduce the accuracy of the model, its impact was also evaluated. The model trained without data augmentation performed better on the dataset, reaching an accuracy of 0.9798.
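To make the four compared loss functions concrete, the following is a minimal NumPy sketch of their commonly used definitions. The abstract does not state which framework or exact formulations the author used, so the function names, the `delta` and `eps` parameters, the two-class one-hot label layout, and the sample values are all assumptions for illustration only.

```python
import numpy as np

# Illustrative sketch: standard textbook definitions, not the paper's code.
# Labels are assumed to be one-hot over two classes (mask / no mask).

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    # Treats every output unit as an independent Bernoulli prediction.
    p = np.clip(y_pred, eps, 1.0 - eps)
    return float(np.mean(-(y_true * np.log(p) + (1.0 - y_true) * np.log(1.0 - p))))

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    # Negative log-probability assigned to the true class, averaged over samples.
    p = np.clip(y_pred, eps, 1.0)
    return float(np.mean(-np.sum(y_true * np.log(p), axis=-1)))

def categorical_hinge(y_true, y_pred):
    # Margin between the true-class score and the best wrong-class score.
    pos = np.sum(y_true * y_pred, axis=-1)
    neg = np.max((1.0 - y_true) * y_pred, axis=-1)
    return float(np.mean(np.maximum(0.0, neg - pos + 1.0)))

def huber(y_true, y_pred, delta=1.0):
    # Quadratic for small errors, linear for errors larger than delta.
    err = np.abs(y_true - y_pred)
    quadratic = 0.5 * err ** 2
    linear = delta * (err - 0.5 * delta)
    return float(np.mean(np.where(err <= delta, quadratic, linear)))

# Hypothetical predictions for two images: one with a mask, one without.
y_true = np.array([[1.0, 0.0], [0.0, 1.0]])
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])

for name, fn in [
    ("binary cross-entropy", binary_cross_entropy),
    ("categorical cross-entropy", categorical_cross_entropy),
    ("categorical hinge", categorical_hinge),
    ("Huber", huber),
]:
    print(f"{name}: {fn(y_true, y_pred):.4f}")
```

With two classes whose predicted probabilities sum to one, the binary and categorical cross-entropy values coincide on this toy input; differences between the losses in practice would come from how their gradients shape training, which this sketch does not model.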