Value analysis of user consumption behavior based on XGBoost model and K-Means model

Conference: ISCTT 2022 - 7th International Conference on Information Science, Computer Technology and Transportation
05/27/2022 - 05/29/2022 at Xishuangbanna, China

Proceedings: ISCTT 2022

Pages: 6
Language: English
Type: PDF

Authors:
Chen, Liying; Lin, Yijun; Zhang, Chunfu; Bai, Jing (School of Disciplinary Basics and Applied Statistics, Zhuhai College of Science and Technology, Zhuhai, China)

Abstract:
In recent years, with the continued development of the Internet, digital online learning has become a new way for people to receive education. This paper takes user attribute and behavioral data from an online education platform as its sample. Because the original data set contains a large amount of noise, the data are first pre-processed. Principal component analysis is then used to determine the optimal number of features, and correlation analysis is used to select mutually independent features, yielding a final feature set of 33 features. Based on this feature set, an XGBoost model is used to predict users' final purchase intention; judged by accuracy, F1 score, and AUC, its predictive performance is good. Finally, key RFM features are constructed by combining the RFM model with the feature importance scores from the XGBoost model, and a K-Means clustering algorithm is used to build a user value assessment model that divides users into 3 categories. The characteristics of each user group are then analyzed to compare customer value across groups and to develop corresponding marketing strategies.
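
The final segmentation step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not list the actual RFM key features, so the sketch assumes the textbook definitions (recency, frequency, monetary value), uses a small synthetic transaction log, and uses a deterministic farthest-first initialization for K-Means, which is an assumption made only to keep the example reproducible. All function and variable names are hypothetical.

```python
# Sketch only: textbook RFM features + plain K-Means (k = 3) on synthetic data.
# Nothing here is taken from the paper beyond "RFM features, K-Means, 3 groups".

def rfm_table(transactions, today):
    """Build {user: (recency, frequency, monetary)} from (user, day, amount) rows."""
    acc = {}
    for user, day, amount in transactions:
        last, freq, total = acc.get(user, (day, 0, 0.0))
        acc[user] = (max(last, day), freq + 1, total + amount)
    return {u: (today - last, f, m) for u, (last, f, m) in acc.items()}

def standardize(rows):
    """Z-score each column so that no single RFM dimension dominates the distance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    stds = [(sum((x - m) ** 2 for x in c) / len(c)) ** 0.5 or 1.0
            for c, m in zip(cols, means)]
    return [tuple((x - m) / s for x, m, s in zip(r, means, stds)) for r in rows]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k=3, iters=50):
    """Lloyd's algorithm with farthest-first initialization (assumed, for determinism)."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            new_centers.append(tuple(sum(c) / len(members) for c in zip(*members))
                               if members else centers[j])
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers

# Synthetic log: three users each of a frequent/recent, a moderate, and a lapsed type.
transactions = []
for i in range(3):
    transactions += [(f"high{i}", 90 + d, 50.0 + i) for d in range(10)]
    transactions += [(f"mid{i}", 40 + 5 * d + i, 20.0) for d in range(4)]
    transactions += [(f"low{i}", 5 + i, 5.0)]

rfm = rfm_table(transactions, today=100)
users = list(rfm)
labels, centers = kmeans(standardize([rfm[u] for u in users]), k=3)
segments = dict(zip(users, labels))
```

Here `segments` maps each synthetic user to one of three cluster labels; comparing the mean recency, frequency, and monetary value per cluster is the kind of group-level feature analysis the abstract describes. In practice a library implementation such as scikit-learn's `KMeans` would normally replace the hand-written loop.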