The Study of NER Methods Based on Bi-LSTM+CRF Model

Conference: CIBDA 2022 - 3rd International Conference on Computer Information and Big Data Applications
03/25/2022 - 03/27/2022 at Wuhan, China

Proceedings: CIBDA 2022

Pages: 8
Language: English
Type: PDF

Authors:
Xu, Aiping (School of Computer Science, Wuhan Donghu University, Wuhan, China & School of Computer, Wuhan University, Wuhan, China)
Wang, Chao (The State Key Laboratory of Information Engineering in Surveying Mapping and Remote Sensing, Wuhan University, Wuhan, China)

Abstract:
Named entity recognition (NER) is one of the key technologies of natural language processing. Early NER systems mostly relied on manually constructed rules, whereas current work centers on neural networks, machine learning, and deep learning. When an LSTM models a sentence, it cannot encode information from the end of the sentence back toward the beginning, and LSTM-based approaches to NER make little use of probabilistic annotation. To address these issues, this article extends the LSTM to a bidirectional LSTM (Bi-LSTM) model and adds a Conditional Random Field (CRF) layer after the Bi-LSTM. The CRF layer allows the model to learn constraints on the label sequence, which reduces the number of low-probability labels in its output. This article also changes the input features from word-level to character-level features, which overcomes the limitations of traditional word-level features. The experiments show that the Bi-LSTM+CRF model combined with character-level features can greatly improve performance in named entity recognition. Finally, active learning methods are proposed to achieve better performance when less labeled training data is available.
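
As a rough illustration of how a CRF layer's label-sequence constraints can act at decoding time, the Python sketch below runs Viterbi decoding over per-token scores of the kind a Bi-LSTM might emit. It is not the authors' implementation. The label set, the example scores, the hand-set transition penalties (a trained CRF would learn these values), and the names `build_transitions` and `viterbi` are all assumptions made for the example.

```python
# Illustrative sketch only: labels, scores, and function names are assumptions,
# not taken from the paper. A trained CRF learns transition scores from data;
# here they are set by hand to make the effect visible.
NEG = -1e4  # large penalty standing in for a "forbidden" transition
LABELS = ["O", "B-PER", "I-PER"]


def build_transitions(labels):
    """Penalize moving into I-X unless the previous label is B-X or I-X."""
    trans = [[0.0] * len(labels) for _ in labels]
    for i, prev in enumerate(labels):
        for j, curr in enumerate(labels):
            if curr.startswith("I-"):
                entity = curr[2:]
                if prev not in ("B-" + entity, "I-" + entity):
                    trans[i][j] = NEG
    return trans


def viterbi(emissions, trans, start):
    """Return the highest-scoring sequence of label indices.

    emissions[t][j]: score of label j at token t (e.g. Bi-LSTM output)
    trans[i][j]:     score for moving from label i to label j
    start[j]:        score for beginning the sequence with label j
    """
    n = len(start)
    score = [start[j] + emissions[0][j] for j in range(n)]
    backptrs = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n):
            best_i = max(range(n), key=lambda i: score[i] + trans[i][j])
            new_score.append(score[best_i] + trans[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptrs.append(ptr)
    # Follow the back-pointers from the best final label.
    path = [max(range(n), key=lambda j: score[j])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Hypothetical per-token scores for a three-token sentence.
emissions = [
    [2.0, 1.5, 0.0],  # token 1: "O" scores slightly above "B-PER"
    [0.5, 0.0, 2.0],  # token 2: "I-PER" scores highest
    [2.0, 0.0, 0.0],  # token 3: "O" scores highest
]
start = [0.0, 0.0, NEG]  # a sequence may not start with I-PER

# Picking the best label per token independently ignores the constraints.
greedy = [LABELS[max(range(len(LABELS)), key=lambda j: row[j])] for row in emissions]
# Viterbi decoding takes the transition scores into account.
decoded = [LABELS[k] for k in viterbi(emissions, build_transitions(LABELS), start)]

print("greedy :", greedy)
print("viterbi:", decoded)
```

With these made-up scores, choosing each label independently yields `O, I-PER, O`, which is not a valid tag sequence because `I-PER` follows `O`. Decoding with the transition scores instead selects `B-PER, I-PER, O`, since that path has the highest total score among sequences that respect the constraint.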