Research on Chinese Name Recognition Using BERT-BiLSTM-CRF Model Incorporating Coreference Resolution
Conference: CAIBDA 2022 - 2nd International Conference on Artificial Intelligence, Big Data and Algorithms
06/17/2022 - 06/19/2022 in Nanjing, China
Proceedings: CAIBDA 2022
Pages: 9
Language: English
Type: PDF
Authors:
Jing, Guanjun; Dong, Zhiming; Zheng, Ran; Zhang, Chaoying (China Telecom Research Institute, Beijing, China)
Chen, Yu (China Telecom Research Institute, Beijing, China & School of Software, Nankai University, Tianjin, China)
Wang, Zhaohui; Long, Xianjun (China Telecom Research Institute, Guangzhou, China)
Abstract:
Named entity recognition is a basic task in natural language processing. Entities include person names, place names and organization names. Compared with other entities, person names are distinctive because they may co-occur with positions, jobs, skills, or personal pronouns. Incomplete names are a particular difficulty when recognizing person-name entities. Based on these observations, this paper proposes a sequence labeling method combined with coreference resolution to improve name recognition. The method effectively alleviates the problem of incomplete name recognition and resolves unclear pronoun references without incurring high labor cost. An early warning system for internet public opinion refers to the measures taken to resolve and handle a crisis in the period between its early signs and the point at which perceptible loss has been caused. Specifically, data augmentation is first carried out using public-opinion-related information, which effectively addresses the problem of insufficient annotated data in practical applications. Second, to better learn contextual features, the paper uses BERT, a pre-trained language model, together with a bidirectional long short-term memory network (BiLSTM). Then, a conditional random field (CRF) is used to model the dependencies between labels in the sequence. Finally, a coreference resolution algorithm is applied to personal pronouns in the text to improve person-name recognition. Experimental results on public data sets and on the chosen data sets show the effectiveness of the proposed method.
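The final step described in the abstract, linking personal pronouns back to recognized names, can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes the BERT-BiLSTM-CRF tagger has already produced BIO labels, and it uses a simple nearest-preceding-name heuristic. The tag names (`B-PER`, `I-PER`, `O`), the pronoun list, and the function names `extract_spans` and `resolve_pronouns` are all hypothetical, chosen only for illustration.

```python
from typing import List, Tuple

# Hypothetical pronoun list for illustration; a real system would use a fuller set.
PRONOUNS = {"他", "她"}


def extract_spans(tokens: List[str], tags: List[str]) -> List[Tuple[int, int, str]]:
    """Collect (start, end, text) person-name spans from BIO tags (end is exclusive)."""
    spans = []
    start = None
    for i, tag in enumerate(tags):
        if tag == "B-PER":
            # A new name begins; close any name that is still open.
            if start is not None:
                spans.append((start, i, "".join(tokens[start:i])))
            start = i
        elif tag == "I-PER" and start is not None:
            # The current name continues.
            continue
        else:
            # Any other tag (or a stray I-PER) ends the open name, if there is one.
            if start is not None:
                spans.append((start, i, "".join(tokens[start:i])))
                start = None
    if start is not None:
        spans.append((start, len(tokens), "".join(tokens[start:])))
    return spans


def resolve_pronouns(tokens: List[str], tags: List[str]) -> List[Tuple[int, str]]:
    """Link each pronoun to the nearest preceding name (assumed heuristic)."""
    spans = extract_spans(tokens, tags)
    links = []
    for i, tok in enumerate(tokens):
        if tok in PRONOUNS:
            preceding = [s for s in spans if s[1] <= i]
            if preceding:
                links.append((i, preceding[-1][2]))
    return links


# Example: "王小明说他很忙" ("Wang Xiaoming says he is busy"), tagged per character.
tokens = list("王小明说他很忙")
tags = ["B-PER", "I-PER", "I-PER", "O", "O", "O", "O"]
links = resolve_pronouns(tokens, tags)
```

Here `links` pairs the position of each pronoun with the name it is assumed to refer to. A nearest-antecedent rule is only a baseline; the paper's coreference component may use a different, learned approach.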