Knowledge distillation based on channel attention
Conference: CIBDA 2022 - 3rd International Conference on Computer Information and Big Data Applications
25-27 March 2022, Wuhan, China
Proceedings: CIBDA 2022
Pages: 7 | Language: English | Type: PDF
Authors:
Meng, Xianfa; Liu, Fang (National Key Laboratory of Science and Technology on Automatic Target Recognition, National Defense University of Science and Technology, Changsha, Hunan, China)
Abstract:
With the development of convolutional neural networks (CNNs), network depth and width keep increasing, which places ever greater demands on computing resources and storage space. Knowledge distillation transfers knowledge extracted from a teacher network as an additional supervisory signal to guide the training of a lightweight student network. As an effective network compression method, it has seen substantial research progress across many types of tasks. To address the problem of information redundancy in feature maps, we introduce a novel approach, dubbed Channel Attention Knowledge Distillation (CAKD). By extracting channel-weight knowledge, the student learns how the teacher network distributes importance across channels and uses this to correct the channel information of its own feature maps. Extensive experiments on multiple datasets show that the proposed method significantly improves the performance of student networks compared with other knowledge distillation methods of the same type.
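The abstract does not give the exact formulation of CAKD, but the idea of matching per-channel weights between teacher and student can be illustrated with a short sketch. The PyTorch code below is a minimal, hypothetical example, assuming that channel attention is obtained by global average pooling followed by a softmax over channels, that teacher and student feature maps share the same channel count, and that the mismatch is penalised with a KL divergence; the function names (channel_attention, cakd_loss) and these choices are illustrative, not the authors' implementation.

```python
# Minimal sketch of a channel-attention distillation loss (assumed formulation).
import torch
import torch.nn.functional as F


def channel_attention(feat: torch.Tensor, temperature: float = 1.0) -> torch.Tensor:
    """Turn a feature map (N, C, H, W) into a per-channel attention
    distribution (N, C) via global average pooling + softmax."""
    pooled = feat.mean(dim=(2, 3))                 # (N, C) channel statistics
    return F.softmax(pooled / temperature, dim=1)  # normalised channel weights


def cakd_loss(student_feat: torch.Tensor,
              teacher_feat: torch.Tensor,
              temperature: float = 4.0) -> torch.Tensor:
    """KL divergence between the student's and the teacher's channel-attention
    distributions; the teacher is treated as a fixed target."""
    s_att = channel_attention(student_feat, temperature)
    with torch.no_grad():
        t_att = channel_attention(teacher_feat, temperature)
    # batchmean KL, scaled by T^2 as is conventional in distillation losses
    return F.kl_div(s_att.log(), t_att, reduction="batchmean") * temperature ** 2


if __name__ == "__main__":
    s = torch.randn(8, 64, 32, 32, requires_grad=True)  # student feature map
    t = torch.randn(8, 64, 32, 32)                       # teacher feature map
    loss = cakd_loss(s, t)
    loss.backward()
    print(f"channel-attention distillation loss: {loss.item():.4f}")
```

In training, such a term would typically be added to the student's ordinary task loss (e.g. cross-entropy) with a weighting factor, so the student is supervised by both the ground-truth labels and the teacher's channel weighting.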