CN116501979A - Information recommendation method, information recommendation device, computer equipment and computer readable storage medium - Google Patents
Information recommendation method, information recommendation device, computer equipment and computer readable storage medium
- Publication number
- CN116501979A (application number CN202310787586.XA)
- Authority
- CN
- China
- Prior art keywords
- user
- machine learning
- target
- feature
- contribution
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The application discloses an information recommendation method, an information recommendation device, computer equipment and a computer readable storage medium, relates to the technical field of data processing, and aims to screen out common features with high contribution degree, realize accurate positioning of a user and improve information recommendation precision. The method comprises the following steps: determining a target user to be subjected to information recommendation, acquiring test data and a plurality of user characteristics associated with the target user, and dividing a data set consisting of the test data and the plurality of user characteristics into a training set and a verification set; training a plurality of machine learning models by adopting a training set, analyzing contribution degrees based on the plurality of machine learning models and a verification set to obtain contribution degrees of each user feature to a modeling target of each machine learning model, and determining common features in the plurality of user features; constructing a user positioning model, inputting the common characteristics into the user positioning model to obtain a user positioning result of the target user, and recommending information to the target user according to the user positioning result.
Description
Technical Field
The present disclosure relates to the field of data processing technologies, and in particular, to an information recommendation method, an information recommendation device, a computer device, and a computer readable storage medium.
Background
In recent years, with the rapid development of the internet, modern society has become information-driven and digital: data is everywhere, and information explosion has become the norm. Faced with such large volumes of data, however, users make use of a shrinking share of the available information, which is the problem of information overload. The recommendation system is one of the key technologies for effectively addressing this problem.
In the related art, a recommendation system acquires historical behavior data of a user, such as whether the user clicks, purchases, or forms a record, constructs a response model from the historical behavior data, and analyzes the user's preferences, habits, and the like through the response model, so as to position the user and make personalized recommendations to the user.
In implementing the related art, the applicant has found that the related art has at least the following problem:
When the constructed response model is used to analyze user features, all of the user's features must be input into the response model. In practice, however, different user features contribute to the modeling target to different degrees when the response model is constructed, and user features with a low contribution degree can easily distort the result output by the response model, so that the user is positioned inaccurately and information recommendation precision is low.
Disclosure of Invention
In view of this, the present application provides an information recommendation method, apparatus, computer device and computer readable storage medium, and mainly aims to solve the problem that the user characteristics with low contribution degree at present easily affect the result output by the response model, resulting in inaccurate positioning of the user and low accuracy of information recommendation.
According to a first aspect of the present application, there is provided an information recommendation method, including:
determining a target user to be subjected to information recommendation, acquiring test data and a plurality of user characteristics associated with the target user, and dividing a data set consisting of the test data and the plurality of user characteristics into a training set and a verification set;
training a plurality of machine learning models by adopting the training set, performing contribution degree analysis based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to a modeling target of each machine learning model, and determining common features in the plurality of user features, wherein the common features are those user features, among the plurality of user features, whose contribution degree to the modeling target of every machine learning model is greater than a contribution degree threshold;
Constructing a user positioning model, inputting the common features into the user positioning model to obtain a user positioning result of the target user, and recommending information to the target user according to the user positioning result.
Optionally, the acquiring the test data and the plurality of user features associated with the target user, and dividing a data set composed of the test data and the plurality of user features into a training set and a verification set, includes:
determining a plurality of initial user characteristics based on historical behavior data of the target user and user basic data, wherein the historical behavior data is data generated by historical behaviors of the target user and/or data generated by interactions with the target user in a historical period;
determining a feature type of each initial user feature in the plurality of initial user features, and processing the plurality of initial user features according to the feature type to obtain the plurality of user features, wherein the feature type is any one of a numerical feature or a category feature;
acquiring test data associated with the target user, and forming the data set by utilizing the test data and the plurality of user characteristics, wherein the test data is obtained after the target user is tested aiming at a specified target;
And determining a preset grouping proportion, and grouping the data set according to the preset grouping proportion to obtain the training set and the verification set.
Optionally, after determining a plurality of initial user features based on the historical behavior data of the target user and the user basic data, the method further includes:
identifying an initial user feature with a null value from the plurality of initial user features as a missing user feature, and identifying a feature type of the missing user feature;
and obtaining a type identifier corresponding to the feature type, and filling the value of the missing user feature by adopting the type identifier.
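The missing-value step above can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: the identifier values (`-1.0` for numerical features, `"UNKNOWN"` for category features), the `schema` dictionary, and the function name `fill_missing` are all hypothetical, since the source does not specify them.

```python
from typing import Any, Dict

# Hypothetical type identifiers; the source does not state their values.
TYPE_IDENTIFIERS: Dict[str, Any] = {"numeric": -1.0, "category": "UNKNOWN"}


def fill_missing(features: Dict[str, Any], schema: Dict[str, str]) -> Dict[str, Any]:
    """Fill every null feature with the identifier of its feature type."""
    filled = {}
    for name, value in features.items():
        if value is None:
            # Look up the feature type, then the identifier for that type.
            filled[name] = TYPE_IDENTIFIERS[schema[name]]
        else:
            filled[name] = value
    return filled


# Illustrative data only.
schema = {"age": "numeric", "occupation": "category", "donation_count": "numeric"}
user = {"age": None, "occupation": None, "donation_count": 3}
filled_user = fill_missing(user, schema)
```

Using one identifier per feature type keeps a filled value distinguishable from observed values, provided the identifier lies outside the observed range.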
Optionally, the determining the feature type of each initial user feature in the plurality of initial user features separately, and processing the plurality of initial user features according to the feature type, to obtain the plurality of user features, includes:
identifying the feature types of the plurality of initial user features, and determining at least one first initial user feature and at least one second initial user feature, wherein the feature type of the first initial user feature is a numerical type feature, and the feature type of the second initial user feature is a category type feature;
performing standardization and binning (bucketing) on the at least one first initial user feature, performing label encoding on the at least one second initial user feature, and taking the processed at least one first initial user feature and the processed at least one second initial user feature as the plurality of user features.
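The processing of numerical and category features described above might look like the following sketch. The z-score form of standardization, the bucket edges, and the alphabetical label order are assumptions made for illustration; the source names the operations but not their parameters.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Z-score standardization (assumed form): subtract the mean, divide by the std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def bucketize(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Bucket index of each value = number of edges the value is at or above."""
    return [sum(1 for edge in edges if v >= edge) for v in values]


def label_encode(values: Sequence[str]) -> List[int]:
    """Map each distinct category to an integer, in sorted order of the labels."""
    mapping = {label: index for index, label in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values]


# Illustrative data only: a numerical feature and a category feature.
ages = [20.0, 30.0, 40.0]
age_buckets = bucketize(standardize(ages), edges=[-0.5, 0.5])
occupation_codes = label_encode(["teacher", "nurse", "teacher"])
```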
Optionally, the training a plurality of machine learning models by using the training set, performing contribution analysis based on the plurality of machine learning models and the verification set to obtain contribution of each user feature to a modeling target of each machine learning model, including:
determining a preset machine learning algorithm, and performing model training operation on the training set for a plurality of times according to the preset machine learning algorithm to obtain a plurality of machine learning models;
acquiring a contribution analysis tool, selecting any machine learning model from the plurality of machine learning models as a target machine learning model, and inputting the target machine learning model and the verification set into the contribution analysis tool to obtain a marginal contribution value of each user feature as the contribution of each user feature to a modeling target of the target machine learning model;
And continuously selecting any machine learning model from the rest machine learning models and inputting the machine learning model and the verification set into the contribution analysis tool until traversing the plurality of machine learning models to obtain the contribution of each user feature to the modeling target of each machine learning model, wherein the rest machine learning models are machine learning models except the target machine learning model in the plurality of machine learning models.
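The traversal above, in which each model is fed to a contribution analysis tool together with the verification set, can be sketched in pure Python. As a stand-in for the tool, the sketch computes exact Shapley values (one common definition of a feature's marginal contribution) against an assumed all-zero baseline, and averages their absolute values over the verification set. The two toy linear models, the baseline, and all names are hypothetical; exact enumeration is only practical for a handful of features.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, List, Sequence

Model = Callable[[Sequence[float]], float]


def shapley_values(model: Model, x: Sequence[float], baseline: Sequence[float]) -> List[float]:
    """Exact Shapley value of each feature of x, relative to a baseline input."""
    n = len(x)

    def value(present: set) -> float:
        # Features in `present` keep their real value; the rest use the baseline.
        return model([x[i] if i in present else baseline[i] for i in range(n)])

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                present = set(subset)
                phi[i] += weight * (value(present | {i}) - value(present))
    return phi


def contributions_per_model(
    models: Dict[str, Model],
    verification_set: Sequence[Sequence[float]],
    baseline: Sequence[float],
) -> Dict[str, List[float]]:
    """For every model, the mean absolute Shapley value of each feature."""
    result = {}
    for name, model in models.items():
        totals = [0.0] * len(baseline)
        for row in verification_set:
            for i, phi in enumerate(shapley_values(model, row, baseline)):
                totals[i] += abs(phi)
        result[name] = [total / len(verification_set) for total in totals]
    return result


# Toy stand-ins for trained models; a real system would pass trained predictors.
models = {
    "model_a": lambda row: 2.0 * row[0] + 1.0 * row[2],
    "model_b": lambda row: 1.0 * row[0] - 3.0 * row[1],
}
verification_set = [[1.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
contributions = contributions_per_model(models, verification_set, baseline=[0.0, 0.0, 0.0])
```

For a linear model with a zero baseline, the Shapley value of a feature reduces to its weight times its value, which makes the toy result easy to check by hand.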
Optionally, the constructing the user positioning model includes:
acquiring test data associated with the target user and acquiring a preset control data set;
counting the data difference of the experimental data group and the control data group on the appointed target in the test data, wherein the test data is obtained after the target user is tested aiming at the appointed target;
determining a causal model algorithm, and performing model training on the data difference by adopting the causal model algorithm to obtain the user positioning model.
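The "data difference" between the experimental data group and the control data group can be illustrated with a simple per-segment difference in mean outcome, which is the quantity an uplift-style causal model would typically be trained on. This is only a sketch of the statistic: the record layout `(segment, group, outcome)`, the group names, and the segment labels are assumptions, and the source does not name the causal model algorithm that is trained afterwards.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Record = Tuple[str, str, int]  # (segment, group, outcome on the specified target)


def group_difference(records: Iterable[Record]) -> Dict[str, float]:
    """Mean outcome of the experimental group minus that of the control group, per segment."""
    sums = defaultdict(lambda: {"experiment": [0, 0], "control": [0, 0]})
    for segment, group, outcome in records:
        sums[segment][group][0] += outcome  # running total of outcomes
        sums[segment][group][1] += 1        # number of records
    difference = {}
    for segment, groups in sums.items():
        (e_sum, e_n), (c_sum, c_n) = groups["experiment"], groups["control"]
        if e_n and c_n:  # skip segments missing either group
            difference[segment] = e_sum / e_n - c_sum / c_n
    return difference


# Illustrative A/B records only.
records = [
    ("donated_before", "experiment", 1), ("donated_before", "experiment", 1),
    ("donated_before", "control", 1), ("donated_before", "control", 0),
    ("new_user", "experiment", 0), ("new_user", "experiment", 1),
    ("new_user", "control", 1), ("new_user", "control", 0),
]
uplift = group_difference(records)
```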
Optionally, the recommending information to the target user according to the user positioning result includes:
extracting at least one user positioning label from the user positioning result, and respectively inquiring recommended content associated with each user positioning label in the at least one user positioning label to obtain at least one recommended content;
counting the number of occurrences of each recommended content in the at least one recommended content, and sorting the at least one recommended content in descending order of the number of occurrences to obtain a content ranking result;
and selecting the one or more top-ranked recommended contents from the content ranking result as information to be recommended, and recommending the information to be recommended to the target user.
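The label lookup, occurrence counting, and ranking described above can be sketched as follows. The label-to-content mapping, the `top_k` parameter, and the alphabetical tie-break are assumptions added for illustration; the source only specifies ordering by descending number of occurrences.

```python
from collections import Counter
from typing import Dict, List, Sequence


def rank_recommendations(
    labels: Sequence[str],
    content_by_label: Dict[str, List[str]],
    top_k: int = 1,
) -> List[str]:
    """Count how often each content appears across the labels and return the top_k."""
    counts: Counter = Counter()
    for label in labels:
        counts.update(content_by_label.get(label, []))
    # Descending by count; ties broken alphabetically so the order is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [content for content, _ in ranked[:top_k]]


# Illustrative labels and content identifiers only.
content_by_label = {
    "charity": ["donation_case_A", "newsletter"],
    "health": ["donation_case_A", "insurance_guide"],
}
to_recommend = rank_recommendations(["charity", "health"], content_by_label, top_k=2)
```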
According to a second aspect of the present application, there is provided an information recommendation apparatus, comprising:
the acquisition module is used for determining a target user to be subjected to information recommendation, acquiring test data and a plurality of user characteristics associated with the target user, and dividing a data set consisting of the test data and the plurality of user characteristics into a training set and a verification set;
the analysis module is used for training a plurality of machine learning models by adopting the training set, performing contribution degree analysis based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to a modeling target of each machine learning model, and determining common features in the plurality of user features, wherein the common features are those user features, among the plurality of user features, whose contribution degree to the modeling target of every machine learning model is greater than a contribution degree threshold;
And the recommending module is used for constructing a user positioning model, inputting the common characteristics into the user positioning model, obtaining a user positioning result of the target user, and recommending information to the target user according to the user positioning result.
Optionally, the acquiring module is configured to determine a plurality of initial user features based on historical behavior data of the target user and user basic data, where the historical behavior data is data generated by historical behaviors of the target user and/or data generated by interactions with the target user in a historical period; determining a feature type of each initial user feature in the plurality of initial user features, and processing the plurality of initial user features according to the feature type to obtain the plurality of user features, wherein the feature type is any one of a numerical feature or a category feature; acquiring test data associated with the target user, and forming the data set by utilizing the test data and the plurality of user characteristics, wherein the test data is obtained after the target user is tested aiming at a specified target; and determining a preset grouping proportion, and grouping the data set according to the preset grouping proportion to obtain the training set and the verification set.
Optionally, the acquiring module is further configured to identify an initial user feature with a null value from the plurality of initial user features as a missing user feature, and identify a feature type of the missing user feature; and obtaining a type identifier corresponding to the feature type, and filling the value of the missing user feature by adopting the type identifier.
Optionally, the acquiring module is configured to identify the feature types of the plurality of initial user features, and determine at least one first initial user feature and at least one second initial user feature, where the feature type of the first initial user feature is a numerical feature, and the feature type of the second initial user feature is a category feature; and to perform standardization and binning (bucketing) on the at least one first initial user feature, perform label encoding on the at least one second initial user feature, and take the processed at least one first initial user feature and the processed at least one second initial user feature as the plurality of user features.
Optionally, the analysis module is configured to determine a preset machine learning algorithm, and perform model training operations on the training set for multiple times according to the preset machine learning algorithm to obtain the multiple machine learning models; acquiring a contribution analysis tool, selecting any machine learning model from the plurality of machine learning models as a target machine learning model, and inputting the target machine learning model and the verification set into the contribution analysis tool to obtain a marginal contribution value of each user feature as the contribution of each user feature to a modeling target of the target machine learning model; and continuously selecting any machine learning model from the rest machine learning models and inputting the machine learning model and the verification set into the contribution analysis tool until traversing the plurality of machine learning models to obtain the contribution of each user feature to the modeling target of each machine learning model, wherein the rest machine learning models are machine learning models except the target machine learning model in the plurality of machine learning models.
Optionally, the recommendation module is configured to obtain test data associated with the target user, and obtain a preset control data set; counting the data difference of the experimental data group and the control data group on the appointed target in the test data, wherein the test data is obtained after the target user is tested aiming at the appointed target; determining a causal model algorithm, and performing model training on the data difference by adopting the causal model algorithm to obtain the user positioning model.
Optionally, the recommendation module is configured to extract at least one user positioning label from the user positioning result, and query the recommended content associated with each user positioning label in the at least one user positioning label to obtain at least one recommended content; count the number of occurrences of each recommended content in the at least one recommended content, and sort the at least one recommended content in descending order of the number of occurrences to obtain a content ranking result; and select the one or more top-ranked recommended contents from the content ranking result as information to be recommended, and recommend the information to be recommended to the target user.
According to a third aspect of the present application there is provided a computer device comprising a memory storing a computer program and a processor implementing the steps of the method of any of the first aspects described above when the computer program is executed by the processor.
According to a fourth aspect of the present application there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of any of the first aspects described above.
By means of the above technical solution, the information recommendation method, the information recommendation device, the computer equipment and the computer readable storage medium provided by the present application determine a target user to be subjected to information recommendation, acquire test data and a plurality of user features associated with the target user, and divide a data set consisting of the test data and the plurality of user features into a training set and a verification set. A plurality of machine learning models are trained with the training set, and contribution degree analysis is performed based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to the modeling target of each machine learning model. Common features, namely the user features whose contribution degree to the modeling target of every machine learning model is greater than a contribution degree threshold, are determined in the plurality of user features. A user positioning model is then constructed, the common features are input into the user positioning model to obtain a user positioning result of the target user, and information is recommended to the target user according to the user positioning result. By identifying contribution degrees, the common features with a high contribution degree to the modeling target of each machine learning model are screened out and the user is positioned around these common features, which prevents user features with a low contribution degree from affecting the positioning result, achieves accurate positioning of the user, and improves information recommendation precision.
The foregoing is only an overview of the technical solutions of the present application. So that the technical means of the present application can be understood more clearly and implemented in accordance with the content of the specification, and so that the above and other objects, features and advantages of the present application are more readily apparent, specific embodiments of the present application are set forth below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
fig. 1 shows a flow chart of an information recommendation method provided in an embodiment of the present application;
fig. 2A is a schematic flow chart of another information recommendation method according to an embodiment of the present application;
fig. 2B illustrates an architecture diagram of an information recommendation system according to an embodiment of the present application;
fig. 2C is a schematic flow chart of an information recommendation method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an information recommendation device according to an embodiment of the present application;
Fig. 4 shows a schematic device structure of a computer device according to an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present application are shown in the drawings, it should be understood that the present application may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
The embodiment of the application provides an information recommendation method, as shown in fig. 1, which comprises the following steps:
101. determining a target user to be subjected to information recommendation, acquiring test data and a plurality of user characteristics associated with the target user, and dividing a data set consisting of the test data and the plurality of user characteristics into a training set and a verification set.
The embodiment of the application can be applied to an information recommendation system. The information recommendation system can be carried in an APP (Application) for shopping, video, chat, and the like, so as to position a user and recommend content that may interest the user according to the user's preferences. In order to position the user accurately and then recommend information according to that positioning, in the embodiment of the application, after a target user to be subjected to information recommendation is determined, test data and a plurality of user features associated with the target user are acquired. The test data may be A/B experiment data, obtained by testing the target user against a specified target: in practical application, a specific commercial target can be set, an A/B test is run against it, and whichever of A and B performs better is determined, which yields the test data. The plurality of user features may be determined based on historical behavior data of the target user and user basic data. Specifically, the historical behavior data may include outbound call records of the target user, historical donation records, and the like, where the outbound call records mainly include the historical number of connected calls, the number of transfers, the connection time, the average waiting time, and the like, and the historical donation records mainly include the target user's past number of donations, the number of visits to donation cases, whether a donation record exists in the past 30 days, and the like. The user basic data is personal attribute data of the user, and may specifically include the target user's age, occupation, family income, whether social security is paid, and the like.
After the test data and the plurality of user features are obtained, they need to be grouped, because the embodiment of the application subsequently analyzes the contribution degree of each user feature to the modeling target of the model: one part is used to construct the model, and the other part is input, together with the constructed model, into a contribution degree analysis tool to analyze the contribution degree of the user features. The part used to construct the model may be set as the training set, and the part input into the contribution degree analysis tool together with the constructed model may be set as the verification set. The test data and the plurality of user features may be grouped according to a preset grouping proportion. For example, if the preset grouping proportion is 7:3, the data set consisting of the test data and the plurality of user features is split 7 to 3, with the 7 parts used as the training set and the 3 parts used as the verification set. The specific value of the preset grouping proportion is not limited.
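A 7:3 grouping of the kind described above can be sketched as follows. Shuffling before the split and the fixed seed are assumptions added for illustration; the source only specifies the preset grouping proportion. Integer parts are used rather than a float ratio so that the cut point is exact.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

Row = TypeVar("Row")


def split_dataset(
    rows: Sequence[Row],
    train_parts: int = 7,
    verification_parts: int = 3,
    seed: int = 0,
) -> Tuple[List[Row], List[Row]]:
    """Split rows into a training set and a verification set by a preset proportion."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # assumption: rows are shuffled first
    cut = len(shuffled) * train_parts // (train_parts + verification_parts)
    return shuffled[:cut], shuffled[cut:]


# Ten illustrative rows stand in for (user features, test data) records.
training_set, verification_set = split_dataset(list(range(10)))
```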
102. And training a plurality of machine learning models by adopting a training set, analyzing the contribution degree based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user characteristic to the modeling target of each machine learning model, and determining common characteristics in the plurality of user characteristics.
After the division into the training set and the verification set is completed, the training set may be used to train a plurality of machine learning models. Specifically, the constructed machine learning models may be XGBoost (eXtreme Gradient Boosting), CatBoost (gradient boosting for categorical features), LightGBM (Light Gradient Boosting Machine), and the like, which is not specifically limited in this application. Each machine learning model is constructed for a modeling target. For example, when a user positioning model is constructed, the modeling target is to position the user and determine the user's preferences. In practice, the user features that participate in constructing the model contribute to the modeling target to different degrees: user features related to user preference, such as historical purchase records and historical browsing records, have a higher contribution degree, whereas user features not closely related to user preference, such as addresses and user names, have a lower contribution degree. Therefore, the information recommendation system can perform contribution degree analysis based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to each machine learning model, so that user features with a higher contribution degree can subsequently be selected as the key to positioning the user, and accurate positioning of the user is achieved.
Specifically, a contribution degree analysis tool such as SHAP (SHapley Additive exPlanations) may be used for the contribution degree analysis: each machine learning model obtained through training is input into the tool together with the verification set, and the contribution degree of each user feature to the modeling target is obtained after the analysis.
Then, after the contribution degree of each user feature to the modeling target of each machine learning model is obtained, the information recommendation system determines common features in the plurality of user features. The user features with a higher contribution degree are in fact the main factors that affect which information content is recommended to the user, so the set of such features should mainly guide the recommendation. The common features are the user features whose contribution degree to the modeling target of every machine learning model is greater than the contribution degree threshold, and information is subsequently recommended to the user based on them. In this embodiment, the user features whose contribution degree to every machine learning model is greater than 0 may be taken as the common features.
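The selection of common features can be read as a set intersection: a feature is kept only if its contribution degree exceeds the threshold in every model. The sketch below follows that reading with the threshold of 0 mentioned in this embodiment. The model names, feature names, and contribution values are made up for illustration and are not results from the application.

```python
from typing import Dict, List, Optional, Set


def common_features(
    contributions: Dict[str, Dict[str, float]],
    threshold: float = 0.0,
) -> List[str]:
    """Features whose contribution degree exceeds the threshold in every model."""
    selected: Optional[Set[str]] = None
    for per_feature in contributions.values():
        above = {name for name, score in per_feature.items() if score > threshold}
        # Intersect with the features that passed for all earlier models.
        selected = above if selected is None else selected & above
    return sorted(selected or set())


# Illustrative contribution degrees only.
contributions = {
    "xgboost": {"donation_count": 0.42, "age": 0.10, "user_name": 0.0},
    "catboost": {"donation_count": 0.35, "age": 0.0, "user_name": 0.0},
    "lightgbm": {"donation_count": 0.50, "age": 0.20, "user_name": 0.01},
}
shared = common_features(contributions)
```

Here `age` is dropped because one model assigns it no contribution, which shows why the intersection is stricter than a per-model filter.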
103. Constructing a user positioning model, inputting the common characteristics into the user positioning model to obtain a user positioning result of the target user, and recommending information to the target user according to the user positioning result.
After the common features are determined, because the common features are user features whose contribution degrees to the modeling target of every machine learning model are greater than the contribution degree threshold, positioning the user and recommending information around the common features is more accurate. Therefore, in the embodiment of the present application, the information recommendation system may construct a user positioning model and input the common features into the user positioning model, so that the user positioning model positions the user using the common features and determines which information the user is interested in, thereby obtaining a user positioning result of the target user, and information is recommended to the target user according to the user positioning result. Specifically, the user positioning model may be constructed using a causal model algorithm, and the user positioning result may be a label-type result, that is, the preferences of the user are represented by labels. When information is recommended to the target user according to the user positioning result, the information associated with the labels may be recommended to the user as the information to be recommended for the user to browse.
According to the method provided by the embodiment of the present application, a target user to whom information is to be recommended is determined, test data and a plurality of user features associated with the target user are acquired, and a data set formed by the test data and the plurality of user features is divided into a training set and a verification set. A plurality of machine learning models are trained with the training set, and contribution degree analysis is performed based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to the modeling target of each machine learning model. Common features, whose contribution degrees to the modeling target of every machine learning model are greater than a contribution degree threshold, are determined among the plurality of user features. A user positioning model is constructed, the common features are input into the user positioning model to obtain a user positioning result of the target user, and information is recommended to the target user according to the user positioning result. By identifying contribution degrees, the common features that contribute more to the modeling target of each machine learning model are screened out and the user is positioned around them, so that user features with low contribution degrees do not affect the positioning result, the user is positioned accurately, and the accuracy of information recommendation is improved.
Further, as a refinement and extension of the foregoing embodiment, in order to fully describe a specific implementation procedure of the embodiment, the embodiment of the present application provides another information recommendation method, as shown in fig. 2A, where the method includes:
201. Determining a plurality of initial user features based on the historical behavior data and the user basic data of the target user.
The applicant recognizes that, in personalized user recommendation tasks, a common way of predicting user behavior such as clicking and purchasing is to construct a response model. The drawback of this approach is that a response model cannot distinguish the marketing-sensitive population from the natural-conversion population, so the modeling target does not fully match the business target; moreover, its feature importance is computed over the whole data set and therefore does not meet the need for per-user personalized recommendation. How to accurately position the marketing-sensitive population while keeping the modeling interpretable is thus one of the problems to be solved in order to achieve personalized user recommendation. Therefore, the present application provides an information recommendation method in which the contribution degree of each user feature to the modeling target of each machine learning model is determined through contribution degree analysis, the common features that contribute more to the modeling target of each machine learning model are screened out, and the user is positioned around the common features, so that user features with low contribution degrees do not affect the positioning result, the user is positioned accurately, and the accuracy of information recommendation is improved. In addition, in the embodiment of the present application, a large amount of A/B experimental data is used as part of the data for training the machine learning models, and the user positioning model is constructed using a causal model algorithm, so that not only can the marketing-sensitive population be accurately positioned, but a single case can also be analyzed to determine why a predicted value differs from the actual value and how much each user feature contributes to the modeling target, thereby meeting the need for personalized user recommendation.
The embodiment of the present application can be applied to an information recommendation system. The information recommendation system may be embedded in an APP (application) for shopping, video, chat and the like, so as to position the user and recommend content that may interest the user according to the user's preferences. The system architecture of the information recommendation system is introduced first:
As shown in fig. 2B, the information recommendation system includes a database layer, a data layer, a model layer, and a result layer. The database layer is used for storing the test data, historical behavior data and user basic data associated with the user. A data link exists between the data layer and the database layer, and the data layer is used for determining the features of the user according to the test data, the historical behavior data and the user basic data provided by the database layer. Because the data provided by the database layer includes both attribute data of the user and interaction data generated when the user interacts with the APP, the user-related features determined by the data layer comprise user attribute features and user interaction features. A data link exists between the model layer and the data layer, and the model layer is used for analyzing the contribution degree of the user features and constructing the user positioning model. A data link exists between the result layer and the model layer, and the result layer is used for positioning the user using the user positioning model constructed by the model layer and the contribution degree analysis result, thereby achieving personalized information recommendation for the user.
In the embodiment of the present application, because the database layer stores the test data, the historical behavior data and the user basic data associated with the user, the information recommendation system determines the target user to whom information is to be recommended, determines a plurality of initial user features based on the historical behavior data and the user basic data of the target user, and acquires the test data associated with the target user. The historical behavior data is data generated by the historical behavior of the target user and/or data generated by interaction with the target user within a historical period. The test data is obtained by testing the target user against a specified target. Specifically, the test data may be A/B experimental data: in practical applications, a specific business target may be set, an A/B test is carried out against that business target, and which of A and B performs better is determined, thereby obtaining the test data. The historical behavior data may include outbound call records and historical donation records of the target user, where the outbound call records mainly include the historical number of connected calls, the number of transfers, the connection time, the average waiting time and the like, and the historical donation records mainly include the number of past donations of the target user, the number of visits to donation cases, whether there is a donation record in the past 30 days, and the like. The user basic data is personal attribute data of the user, and may specifically include the age, occupation and family income of the target user, whether social security is paid, and the like.
The test data, the historical behavior data and the user basic data associated with the target user are acquired, and the acquired data can be directly used as a plurality of initial user characteristics so as to process the plurality of initial user characteristics later and determine a plurality of user characteristics finally used for contribution evaluation.
202. Determining the feature type of each initial user feature in the plurality of initial user features, and processing the plurality of initial user features according to the feature types to obtain a plurality of user features.
In the embodiment of the present application, each initial user feature corresponds to a feature type, and the feature type is either a numerical feature or a category feature. A numerical feature is an initial user feature whose value is a number, such as age or the historical number of connected calls; a category feature is an initial user feature whose value is one of a set of categories, such as gender, which can be divided into a male category and a female category. When processing the initial user features, the information recommendation system applies different processing to different feature types so as to unify the formats of the initial user features and facilitate the subsequent unified contribution degree analysis. The process of processing the plurality of initial user features by feature type is described below:
Some users may have missing features because they have not filled in all of their information. For example, if a user has not filled in an age, the value of the age feature obtained by the information recommendation system for that user is null; similarly, if a user has not filled in a gender, the value of the gender feature obtained for that user is null. To avoid additional special handling of null values in the subsequent modeling and contribution degree analysis, the information recommendation system identifies missing initial user features before processing the plurality of initial user features and performs missing-value processing on them. Specifically, the information recommendation system identifies an initial user feature whose value is null among the plurality of initial user features as a missing user feature, identifies the feature type of the missing user feature, acquires the type identifier corresponding to the feature type, and fills the value of the missing user feature with the type identifier. For example, assuming that the type identifier corresponding to numerical features is 0 and the type identifier corresponding to category features is -1, when the missing user feature is age, it is a numerical feature and the value of the user's age is set to 0; when the missing user feature is gender, it is a category feature and the value of the user's gender is set to -1.
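The missing-value step described above can be sketched as follows. This is a minimal illustration, not the implementation of the application: the feature names, the `FEATURE_TYPES` schema and the `fill_missing` helper are hypothetical, and only the two type identifiers (0 and -1) come from the example in the text.

```python
# Type identifiers taken from the example in the text:
# 0 for numerical features, -1 for category features.
TYPE_IDENTIFIER = {"numerical": 0, "category": -1}

# Hypothetical schema mapping each initial user feature to its feature type.
FEATURE_TYPES = {
    "age": "numerical",
    "connected_call_count": "numerical",
    "gender": "category",
}


def fill_missing(features, feature_types=FEATURE_TYPES):
    """Return a copy in which every null value is replaced by the
    type identifier of that feature's type."""
    filled = {}
    for name, value in features.items():
        if value is None:
            # Missing user feature: fill with the identifier of its feature type.
            filled[name] = TYPE_IDENTIFIER[feature_types[name]]
        else:
            filled[name] = value
    return filled


# A user who filled in neither age nor gender.
user = {"age": None, "connected_call_count": 3, "gender": None}
filled_user = fill_missing(user)
```

Here `filled_user` carries 0 for the missing age and -1 for the missing gender, while the recorded call count is left unchanged.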
After the missing values have been processed, the plurality of initial user features can be processed according to their feature types to obtain the plurality of user features. The information recommendation system may identify the feature types of the plurality of initial user features and determine at least one first initial user feature and at least one second initial user feature, where the feature type of the first initial user feature is a numerical feature and the feature type of the second initial user feature is a category feature. Then, standardization and binning (bucketing) are performed on the at least one first initial user feature, label encoding is performed on the at least one second initial user feature, and the processed at least one first initial user feature and the processed at least one second initial user feature are taken as the plurality of user features. Label encoding maps each category of the second initial user feature to a label value, so that the formats of initial user features with different feature types are unified, the plurality of user features are obtained, and the subsequent model training and contribution degree analysis are facilitated.
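The three processing operations named above can be illustrated with a short sketch. The text does not specify the standardization formula, the binning scheme or the label order, so the choices below (z-score standardization, equal-width bins, labels assigned in sorted category order) and all function names are assumptions made only for illustration.

```python
from statistics import mean, pstdev


def standardize(values):
    """Z-score standardization: subtract the mean, divide by the standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def equal_width_bins(values, n_bins):
    """Assign each value to one of n_bins equal-width buckets (0 .. n_bins - 1)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0 for _ in values]
    width = (hi - lo) / n_bins
    # The maximum value would land in bucket n_bins, so clamp it to the last bucket.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


def label_encode(values):
    """Map each category to an integer label, in sorted category order."""
    mapping = {category: index for index, category in enumerate(sorted(set(values)))}
    return [mapping[v] for v in values], mapping


ages = [20, 30, 40, 50, 60]  # numerical feature
genders = ["female", "male", "female", "male", "male"]  # category feature

age_standardized = standardize(ages)
age_bins = equal_width_bins(ages, n_bins=4)
gender_codes, gender_mapping = label_encode(genders)
```

After this processing every feature is numeric, which is what allows the same training and contribution degree analysis pipeline to be applied to both feature types.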
203. Acquiring test data associated with the target user, and forming a data set using the test data and the plurality of user features.
The test data may be A/B experimental data obtained by testing the target user against a specified target. In practical applications, a specific business target may be set, an A/B test is carried out against that business target, and which of A and B performs better is determined, thereby obtaining the test data. In the embodiment of the present application, a large amount of A/B experimental data is used as part of the data for training the plurality of machine learning models, and the user positioning model is constructed using a causal model algorithm, so that not only can the marketing-sensitive population be accurately positioned, but a single case can also be analyzed to determine why a predicted value differs from the actual value and how much each user feature contributes to the modeling target, thereby meeting the need for personalized user recommendation.
Thus, the embodiment of the present application may acquire the test data associated with the target user and form a data set from the test data and the plurality of user features, so that the plurality of machine learning models can subsequently be trained based on the data set.
204. Determining a preset grouping proportion, and grouping the data set according to the preset grouping proportion to obtain a training set and a verification set.
In the embodiment of the present application, after the test data and the plurality of user features are obtained, because the contribution degree of each user feature to model construction needs to be analyzed subsequently, the data set formed by the test data and the plurality of user features needs to be grouped: one part of the data set is used to construct the models, and the other part is input, together with the constructed models, into the contribution degree analysis tool for contribution degree analysis of the user features. The part used to construct the models is taken as the training set, and the part input into the contribution degree analysis tool together with the constructed models is taken as the verification set. The data set may be grouped according to a preset grouping proportion. For example, if the preset grouping proportion is 7:3, the data set is split in a 7:3 ratio, with the 7 parts used as the training set and the 3 parts used as the verification set. The specific value of the preset grouping proportion is not limited in this application.
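A grouping by a preset proportion such as 7:3 can be sketched as below. The text does not say whether the data set is shuffled before grouping; the seeded shuffle, the `split_dataset` name and the placeholder rows are assumptions for illustration.

```python
import random


def split_dataset(rows, train_parts=7, verification_parts=3, seed=0):
    """Split rows into a training set and a verification set
    in the ratio train_parts : verification_parts."""
    if train_parts <= 0 or verification_parts <= 0:
        raise ValueError("both parts of the grouping proportion must be positive")
    shuffled = list(rows)
    # Assumption: shuffle with a fixed seed so the grouping is reproducible.
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + verification_parts)
    return shuffled[:cut], shuffled[cut:]


# Placeholder data set: ten rows identified only by a hypothetical user_id.
dataset = [{"user_id": i} for i in range(10)]
training_set, verification_set = split_dataset(dataset, train_parts=7, verification_parts=3)
```

With ten rows and a 7:3 proportion, seven rows go to the training set and three to the verification set, and no row appears in both.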
205. Training a plurality of machine learning models using the training set, performing contribution degree analysis based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to the modeling target of each machine learning model, and determining common features among the plurality of user features.
In the embodiment of the present application, after the training set and the verification set have been divided, the training set may be used to train a plurality of machine learning models. Specifically, the machine learning models may be XGBoost, CatBoost, LightGBM models and the like, which is not specifically limited in this application. After the plurality of machine learning models are constructed, the information recommendation system performs contribution degree analysis based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to the modeling target of each machine learning model. Specifically, the contribution degree analysis may be performed with a contribution degree analysis tool such as the SHAP (SHapley Additive exPlanations, a machine learning interpretability tool) analysis tool: the plurality of trained machine learning models and the verification set are taken as inputs of the contribution degree analysis tool, and the contribution degree of each user feature to modeling is obtained after the contribution degree analysis processing. The SHAP analysis tool uses game theory to explain the output of a machine learning model; applied to a machine learning model, it explains how much each feature contributes to the corresponding predicted value. The core concept of the SHAP analysis tool is the Shapley value, which is calculated as the marginal contribution of each feature, namely the difference between the contribution after the feature is added and the sum of the contributions before the feature is added. A part of the contribution is called the base value, which is calculated from the distribution of the target variable: for a regression problem, the base value is the mean of the target variable; for a classification problem, the base value is the mean of the encoded target variable.
Each feature is calculated to obtain a Shapley value, and the final prediction result is the base value plus the marginal contribution of all the features. The process of training a plurality of machine learning models and performing contribution analysis based on the plurality of machine learning models and the validation set is described below:
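The relationship stated above, that the prediction equals the base value plus the marginal contributions of all features, can be checked on a toy example. The sketch below computes exact Shapley values by averaging marginal contributions over every feature ordering; it is not the SHAP analysis tool itself. As a simplification, the base value is the model output at a single hypothetical reference point rather than an average over a data set, and `toy_model`, the feature names and all numbers are invented for illustration.

```python
from itertools import permutations
from math import factorial


def shapley_values(predict, sample, background):
    """Exact Shapley values of predict at sample.

    Features not yet "added" take their value from background; the value of
    each feature is its marginal contribution averaged over all orderings.
    """
    names = list(sample)
    totals = {name: 0.0 for name in names}
    for order in permutations(names):
        current = dict(background)
        previous = predict(current)
        for name in order:
            current[name] = sample[name]       # add this feature
            value = predict(current)
            totals[name] += value - previous   # marginal contribution in this ordering
            previous = value
    n_orders = factorial(len(names))
    return {name: total / n_orders for name, total in totals.items()}


def toy_model(x):
    # Hypothetical scoring function with one interaction term.
    return 2.0 * x["donation_count"] + 0.5 * x["age"] + x["donation_count"] * x["case_visits"]


background = {"age": 40.0, "donation_count": 1.0, "case_visits": 2.0}  # reference point
sample = {"age": 30.0, "donation_count": 3.0, "case_visits": 4.0}      # user being explained

base_value = toy_model(background)
contributions = shapley_values(toy_model, sample, background)
reconstructed = base_value + sum(contributions.values())
```

In this example `reconstructed` matches `toy_model(sample)`, and the negative value for `age` shows that a contribution can lower the prediction as well as raise it. Enumerating all orderings is only feasible for a handful of features; practical tools rely on model-specific or sampling-based approximations.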
First, the information recommendation system determines a preset machine learning algorithm and performs multiple model training operations on the training set according to the preset machine learning algorithm to obtain the plurality of machine learning models, where the preset machine learning algorithm may be the XGBoost, CatBoost or LightGBM algorithm.
Then, the information recommendation system acquires a contribution analysis tool, selects any machine learning model from the plurality of machine learning models as a target machine learning model, and inputs the target machine learning model and the verification set into the contribution analysis tool to obtain a marginal contribution value of each user feature as the contribution of each user feature to a modeling target of the target machine learning model. And then, continuously selecting any machine learning model from the rest machine learning models and inputting the machine learning model and the verification set into a contribution degree analysis tool until a plurality of machine learning models are traversed to obtain the contribution degree of each user characteristic to the modeling target of each machine learning model, wherein the rest machine learning models are machine learning models except for the target machine learning model in the plurality of machine learning models. 
That is, the information recommendation system selects one machine learning model at a time, inputs the machine learning model and the verification set into the contribution analysis tool, and then obtains the contribution of each user feature participating in building the machine learning model to the model modeling target; and then, continuously selecting other machine learning models to execute the same operation, and inputting the machine learning models and the verification set into a contribution analysis tool, so that a plurality of contribution degrees of each user characteristic to modeling targets of a plurality of machine learning models are obtained, and then, referring to the corresponding plurality of contribution degrees of each user characteristic, selecting the user characteristics with the corresponding plurality of contribution degrees larger than a contribution degree threshold value from the user characteristics as common characteristics to position the user.
Then, after the contribution degree of each user feature to the modeling target of each machine learning model is obtained, because user features with higher contribution degrees are in fact the main factors affecting the information content recommended to the user, information should be recommended mainly on the basis of those user features. The information recommendation system may therefore determine common features among the plurality of user features, where a common feature is a user feature whose contribution degree to the modeling target of every machine learning model is greater than a contribution degree threshold, and information is then recommended to the user based on the common features. In this embodiment, the contribution degree threshold may be 0, that is, each user feature whose contribution degree to the modeling target of every machine learning model is greater than 0 may be taken as one of the at least one common feature. For example, assume that 3 machine learning models, model_1, model_2 and model_3, are constructed and that the user feature is age. If the contribution degree of age to the modeling target of model_1 is +5.79, to the modeling target of model_2 is +6.08, and to the modeling target of model_3 is +4.57, the user feature age can be taken as a common feature.
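The selection rule can be sketched as a filter over the per-model contribution degrees. The `age` values below are the ones from the example in the text; the `user_name` and `donation_count` values, and the `select_common_features` helper, are hypothetical and included only to show a feature that is rejected and another that is kept.

```python
def select_common_features(contributions_by_model, threshold=0.0):
    """Return the features whose contribution degree exceeds the threshold
    for every machine learning model."""
    per_model = list(contributions_by_model.values())
    if not per_model:
        return []
    candidates = list(per_model[0])
    return [
        feature
        for feature in candidates
        # A feature that a model did not report is treated as failing the threshold.
        if all(model.get(feature, float("-inf")) > threshold for model in per_model)
    ]


contributions_by_model = {
    "model_1": {"age": 5.79, "user_name": -0.12, "donation_count": 3.40},
    "model_2": {"age": 6.08, "user_name": 0.03, "donation_count": 2.95},
    "model_3": {"age": 4.57, "user_name": 0.00, "donation_count": 3.10},
}

common_features = select_common_features(contributions_by_model, threshold=0.0)
```

Here `user_name` is dropped because its contribution degree does not exceed 0 for model_1 and model_3, even though it does for model_2; requiring the threshold to hold for every model is what makes a feature "common".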
In practical applications, if the SHAP analysis tool is used for the contribution degree analysis, the SHAP analysis tool can output a beeswarm plot, which shows the contribution degree of each user feature to the modeling target.
206. Constructing a user positioning model, and inputting the common features into the user positioning model to obtain a user positioning result of the target user.
After the common features are determined, because the common features are user features with contribution degrees larger than a contribution degree threshold value to modeling targets of each machine learning model, the accuracy of locating users around the common features and recommending information is higher, and therefore in the embodiment of the application, an information recommendation system can construct a user locating model, input the common features into the user locating model, enable the user locating model to locate users by utilizing the common features, judge which information is interested by the users, and obtain a user locating result of the target users. The user positioning result can be a label type result, namely, the preference of the user is expressed by the label, so that when information is recommended to the target user according to the user positioning result, the information associated with the label can be recommended to the user as information to be recommended for the user to browse.
It should be noted that the user positioning model is constructed using a causal model algorithm, and the specific construction process is as follows. First, the information recommendation system acquires the test data associated with the target user and acquires a preset control data set. Then, the data difference, on the specified target, between the experimental data set in the test data and the control data set is calculated, a causal model algorithm is determined, and model training is performed on the data difference using the causal model algorithm to obtain the user positioning model. In other words, determining the data difference between the experimental data set and the control data set on the specified target amounts to determining the uplift increment, and modeling the uplift increment with the causal model algorithm yields a user positioning model capable of accurately positioning the user.
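The data-difference step can be illustrated with a small sketch that compares conversion rates between the experimental group and the control group within each user segment. This covers only the computation of the difference, not the causal model algorithm trained on it, which the text does not specify. The segment names, the record layout and the `uplift_by_segment` helper are hypothetical.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Compute, per segment, the conversion rate of the experimental group
    minus the conversion rate of the control group.

    records: iterable of (segment, group, converted), group being
    "experiment" or "control".
    """
    # For each segment and group, store [number of conversions, number of users].
    totals = defaultdict(lambda: {"experiment": [0, 0], "control": [0, 0]})
    for segment, group, converted in records:
        counts = totals[segment][group]
        counts[0] += int(converted)
        counts[1] += 1

    uplift = {}
    for segment, groups in totals.items():
        exp_conversions, exp_users = groups["experiment"]
        ctl_conversions, ctl_users = groups["control"]
        if exp_users == 0 or ctl_users == 0:
            continue  # the difference is undefined without both groups
        uplift[segment] = exp_conversions / exp_users - ctl_conversions / ctl_users
    return uplift


records = [
    ("donated_recently", "experiment", True),
    ("donated_recently", "experiment", True),
    ("donated_recently", "control", True),
    ("donated_recently", "control", True),
    ("never_donated", "experiment", True),
    ("never_donated", "experiment", False),
    ("never_donated", "control", False),
    ("never_donated", "control", False),
]

uplift = uplift_by_segment(records)
```

In this invented data the `donated_recently` segment converts equally with or without the intervention (a difference of 0, resembling natural conversion), whereas the `never_donated` segment converts only in the experimental group (a positive difference, resembling a marketing-sensitive population). A response model trained on conversions alone would score the first segment highest, which is the mismatch the causal approach is meant to avoid.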
207. Recommending information to the target user according to the user positioning result.
In the embodiment of the present application, after the user positioning result is determined, information can be recommended to the target user according to the user positioning result. The information recommendation system extracts at least one user positioning label from the user positioning result and queries the recommended content associated with each of the at least one user positioning label to obtain at least one recommended content. Then, the number of occurrences of each recommended content is counted, and the at least one recommended content is sorted in descending order of the number of occurrences to obtain a content ranking result. Finally, the one or more top-ranked recommended contents in the content ranking result are selected as the information to be recommended, and the information to be recommended is recommended to the target user. For example, assume that the user positioning labels are label 1, label 2 and label 3, where the recommended content acquired through label 1 is A, B, C, the recommended content acquired through label 2 is A, C, D, and the recommended content acquired through label 3 is A, E, F. A occurs three times, C occurs twice, and B, D, E and F each occur once, so the ranking result obtained by sorting the recommended content by number of occurrences is A, C, B, D, E, F, and A, or A and C, can be recommended to the user as the information to be recommended for the user to browse.
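The counting and ranking step can be sketched with the three-label example above. The text does not say how contents with the same number of occurrences are ordered; keeping them in first-seen order is an assumption of this sketch, as are the function and variable names.

```python
from collections import Counter


def rank_recommended_content(label_to_content, user_labels):
    """Rank recommended content by how many of the user's labels it is associated with."""
    counts = Counter()
    for label in user_labels:
        counts.update(label_to_content.get(label, []))
    # sorted() is stable, so contents with equal counts keep their first-seen order.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [content for content, _ in ranked]


label_to_content = {
    "label_1": ["A", "B", "C"],
    "label_2": ["A", "C", "D"],
    "label_3": ["A", "E", "F"],
}

ranking = rank_recommended_content(label_to_content, ["label_1", "label_2", "label_3"])
information_to_recommend = ranking[:2]  # recommend the top-ranked contents
```

A is associated with all three labels and C with two, so they occupy the first two positions and are the contents selected for recommendation.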
In summary, the logic of the technical scheme of the present application is as follows. As shown in fig. 2C, a plurality of initial user features of the target user are obtained, which may specifically include user attribute features and user interaction features. Then, the feature type of each initial user feature is identified to determine which are numerical features and which are category features; standardization and binning are performed on the numerical features, and label encoding is performed on the category features, to obtain a plurality of user features. Next, a data set consisting of the test data associated with the target user and the plurality of user features is divided into a training set and a verification set, a plurality of machine learning models are trained using the training set, contribution degree analysis is performed based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to the modeling target of each machine learning model, and common features are determined among the plurality of user features. Finally, the common features are input into a user positioning model constructed using a causal model algorithm, so as to position the user and recommend information to the user.
Therefore, the method and the device replace the traditional response model by using causal inference and contribution degree analysis, and solve the problems that the response model cannot distinguish natural conversion crowd from marketing sensitive crowd, cannot analyze single cases, cannot identify the contribution degree of each feature to a modeling target and cannot meet the requirement of personalized recommendation; on the other hand, compared with a response model, the method can bring more remarkable benefits while reducing the operation cost.
According to the method provided by the embodiment of the present application, a target user to whom information is to be recommended is determined, test data and a plurality of user features associated with the target user are acquired, and a data set formed by the test data and the plurality of user features is divided into a training set and a verification set. A plurality of machine learning models are trained with the training set, and contribution degree analysis is performed based on the plurality of machine learning models and the verification set to obtain the contribution degree of each user feature to the modeling target of each machine learning model. Common features, whose contribution degrees to the modeling target of every machine learning model are greater than a contribution degree threshold, are determined among the plurality of user features. A user positioning model is constructed, the common features are input into the user positioning model to obtain a user positioning result of the target user, and information is recommended to the target user according to the user positioning result. By identifying contribution degrees, the common features that contribute more to the modeling target of each machine learning model are screened out and the user is positioned around them, so that user features with low contribution degrees do not affect the positioning result, the user is positioned accurately, and the accuracy of information recommendation is improved.
Further, as a specific implementation of the method shown in fig. 1, an embodiment of the present application provides an information recommendation apparatus, as shown in fig. 3, where the apparatus includes: an acquisition module 301, an analysis module 302 and a recommendation module 303.
The acquisition module 301 is configured to determine a target user to whom information is to be recommended, acquire test data and a plurality of user features associated with the target user, and divide a data set formed by the test data and the plurality of user features into a training set and a verification set;
the analysis module 302 is configured to train a plurality of machine learning models by using the training set, perform contribution analysis based on the plurality of machine learning models and the verification set, obtain contribution of each user feature to a modeling target of each machine learning model, and determine a common feature in the plurality of user features, where the common feature is a user feature in the plurality of user features, and the contribution of the user feature to building each machine learning model is greater than a contribution threshold;
the recommending module 303 is configured to construct a user positioning model, input the common features into the user positioning model, obtain a user positioning result of the target user, and recommend information to the target user according to the user positioning result.
In a specific application scenario, the acquisition module 301 is configured to determine a plurality of initial user features based on the historical behavior data and the user basic data of the target user, where the historical behavior data is data generated by the historical behavior of the target user and/or data generated by interaction with the target user within a historical period; determine the feature type of each initial user feature in the plurality of initial user features, and process the plurality of initial user features according to the feature types to obtain the plurality of user features, where the feature type is either a numerical feature or a category feature; acquire test data associated with the target user, and form the data set using the test data and the plurality of user features, where the test data is obtained by testing the target user against a specified target; and determine a preset grouping proportion, and group the data set according to the preset grouping proportion to obtain the training set and the verification set.
In a specific application scenario, the acquiring module 301 is further configured to identify, from the plurality of initial user features, an initial user feature whose value is null as a missing user feature, and identify the feature type of the missing user feature; and obtain a type identifier corresponding to the feature type, and fill the value of the missing user feature with the type identifier.
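The filling of missing user features can be illustrated with the following sketch. The type identifiers `-1.0` and `"UNKNOWN"`, the type names `"numeric"` and `"category"`, and the feature names are placeholders chosen for this example; the present application does not state which identifiers are actually used.

```python
from typing import Any, Dict, Optional

# Assumed placeholder identifiers, one per feature type (not from the source).
TYPE_IDENTIFIERS: Dict[str, Any] = {"numeric": -1.0, "category": "UNKNOWN"}


def fill_missing(
    features: Dict[str, Optional[Any]], feature_types: Dict[str, str]
) -> Dict[str, Any]:
    """Replace each null feature value with the identifier of its feature type."""
    filled: Dict[str, Any] = {}
    for name, value in features.items():
        if value is None:
            # Missing user feature: fill with the identifier for its type.
            filled[name] = TYPE_IDENTIFIERS[feature_types[name]]
        else:
            filled[name] = value
    return filled


# Hypothetical initial user features, two of which are missing.
initial_features = {"age": None, "city": None, "visits": 12.0}
feature_types = {"age": "numeric", "city": "category", "visits": "numeric"}
filled_features = fill_missing(initial_features, feature_types)
print(filled_features)
```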
In a specific application scenario, the acquiring module 301 is configured to identify the feature types of the plurality of initial user features, and determine at least one first initial user feature and at least one second initial user feature, where the feature type of the first initial user feature is a numerical feature and the feature type of the second initial user feature is a category feature; and perform standardization and bucketing (binning) on the at least one first initial user feature, perform label encoding on the at least one second initial user feature, and take the processed at least one first initial user feature and the processed at least one second initial user feature as the plurality of user features.
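One possible reading of this preprocessing is sketched below using only the Python standard library. The z-score form of standardization, the bucket edges `[-0.5, 0.5]`, the alphabetical ordering of label codes, and the sample values are assumptions for illustration; the present application does not fix any of these choices.

```python
from bisect import bisect_right
from statistics import mean, pstdev
from typing import Dict, List, Sequence


def standardize(values: Sequence[float]) -> List[float]:
    """Z-score standardization: subtract the mean, divide by the population std."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def bucketize(values: Sequence[float], edges: Sequence[float]) -> List[int]:
    """Map each value to a bucket index; `edges` must be sorted ascending."""
    return [bisect_right(edges, v) for v in values]


def label_encode(values: Sequence[str]) -> List[int]:
    """Assign an integer code to each distinct category (alphabetical order)."""
    codes: Dict[str, int] = {c: i for i, c in enumerate(sorted(set(values)))}
    return [codes[v] for v in values]


# Hypothetical numerical feature (first initial user feature).
ages = [20.0, 30.0, 40.0]
age_buckets = bucketize(standardize(ages), edges=[-0.5, 0.5])

# Hypothetical category feature (second initial user feature).
regions = ["north", "south", "north"]
region_codes = label_encode(regions)

print(age_buckets, region_codes)
```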
In a specific application scenario, the analysis module 302 is configured to determine a preset machine learning algorithm, and perform a plurality of model training operations on the training set according to the preset machine learning algorithm to obtain the plurality of machine learning models; acquire a contribution analysis tool, select any machine learning model from the plurality of machine learning models as a target machine learning model, and input the target machine learning model and the verification set into the contribution analysis tool to obtain a marginal contribution value of each user feature as the contribution of that user feature to the modeling target of the target machine learning model; and continue to select a machine learning model from the remaining machine learning models and input it together with the verification set into the contribution analysis tool until all of the plurality of machine learning models have been traversed, so as to obtain the contribution of each user feature to the modeling target of each machine learning model, where the remaining machine learning models are the machine learning models other than the target machine learning model in the plurality of machine learning models.
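The present application does not name the contribution analysis tool. As one hedged illustration of a marginal contribution value, the sketch below computes exact Shapley values against a mean baseline and averages their absolute values over the verification set, once per model. The hand-written functions `model_a` and `model_b` stand in for trained machine learning models, and the feature names and verification rows are hypothetical. Exact enumeration is only practical for a small number of features.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict, List, Sequence

Model = Callable[[Sequence[float]], float]


def shapley_values(model: Model, row: Sequence[float], baseline: Sequence[float]) -> List[float]:
    """Exact Shapley value of each feature of `row`, relative to `baseline`."""
    n = len(row)

    def value(subset: frozenset) -> float:
        # Features in `subset` take the row's value; the rest take the baseline.
        mixed = [row[j] if j in subset else baseline[j] for j in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                members = frozenset(subset)
                # Marginal contribution of feature i when added to `members`.
                phi[i] += weight * (value(members | {i}) - value(members))
    return phi


def mean_abs_contribution(model: Model, rows: Sequence[Sequence[float]], names: Sequence[str]) -> Dict[str, float]:
    """Average absolute Shapley value per feature over the verification rows."""
    n = len(names)
    baseline = [sum(r[j] for r in rows) / len(rows) for j in range(n)]
    totals = [0.0] * n
    for row in rows:
        for j, phi in enumerate(shapley_values(model, row, baseline)):
            totals[j] += abs(phi)
    return {name: totals[j] / len(rows) for j, name in enumerate(names)}


# Hypothetical stand-ins for two trained models.
def model_a(x: Sequence[float]) -> float:
    return 2.0 * x[0] + 1.0 * x[2]


def model_b(x: Sequence[float]) -> float:
    return 1.5 * x[0] + 0.1 * x[1]


feature_names = ["x0", "x1", "x2"]
verification_rows = [[1.0, 0.0, 2.0], [3.0, 1.0, 0.0], [2.0, 2.0, 1.0]]

# Traverse every model, as described above, and collect its contributions.
contributions = {
    name: mean_abs_contribution(model, verification_rows, feature_names)
    for name, model in {"model_a": model_a, "model_b": model_b}.items()
}
print(contributions)
```

The resulting per-model mappings have the shape needed to compare each feature's contribution against a contribution threshold across all models.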
In a specific application scenario, the recommending module 303 is configured to acquire test data associated with the target user, and acquire a preset control data set; compute the data difference, on the specified target, between the experimental data group and the control data group in the test data, where the test data is obtained after the target user is tested against the specified target; and determine a causal model algorithm, and perform model training on the data difference by using the causal model algorithm to obtain the user positioning model.
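The causal model algorithm is not specified in the present application. As a deliberately simple stand-in, the sketch below computes, for each user segment, the difference between the mean outcome of the experimental data group and that of the control data group on the specified target. The segment names, the record layout, and the function name `uplift_by_segment` are hypothetical, and a real causal model would typically go beyond a raw difference of means.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# (segment, is_experimental, outcome on the specified target)
Record = Tuple[str, bool, float]


def uplift_by_segment(records: Iterable[Record]) -> Dict[str, float]:
    """Mean outcome of the experimental group minus that of the control group."""
    # Per segment: [experimental sum, experimental count, control sum, control count]
    sums: Dict[str, List[float]] = defaultdict(lambda: [0.0, 0, 0.0, 0])
    for segment, is_experimental, outcome in records:
        acc = sums[segment]
        if is_experimental:
            acc[0] += outcome
            acc[1] += 1
        else:
            acc[2] += outcome
            acc[3] += 1
    uplift: Dict[str, float] = {}
    for segment, (exp_sum, exp_n, ctl_sum, ctl_n) in sums.items():
        # Segments missing either group cannot be compared and are skipped.
        if exp_n and ctl_n:
            uplift[segment] = exp_sum / exp_n - ctl_sum / ctl_n
    return uplift


# Hypothetical test data: outcome 1.0 means the specified target was reached.
test_records: List[Record] = [
    ("young", True, 1.0),
    ("young", True, 1.0),
    ("young", False, 0.0),
    ("young", False, 1.0),
    ("senior", True, 0.0),
    ("senior", False, 0.0),
    ("new", True, 1.0),  # no control observations, so this segment is skipped
]
segment_uplift = uplift_by_segment(test_records)
print(segment_uplift)
```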
In a specific application scenario, the recommending module 303 is configured to extract at least one user positioning tag from the user positioning result, and query the recommended content associated with each user positioning tag in the at least one user positioning tag to obtain at least one recommended content; count the number of occurrences of each recommended content in the at least one recommended content, and sort the at least one recommended content in descending order of the number of occurrences to obtain a content ranking result; and select the one or more top-ranked recommended contents from the content ranking result as information to be recommended, and recommend the information to be recommended to the target user.
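The counting and ranking described above can be sketched as follows. The tag names, the tag-to-content mapping, the function name `rank_recommendations`, and the alphabetical tie-break are illustrative assumptions; the present application only requires sorting by number of occurrences in descending order.

```python
from collections import Counter
from typing import Dict, List, Sequence


def rank_recommendations(
    tags: Sequence[str],
    tag_to_content: Dict[str, List[str]],
    top_k: int = 1,
) -> List[str]:
    """Return the `top_k` contents that occur most often across the given tags."""
    counts: Counter = Counter()
    for tag in tags:
        # Query the recommended content associated with this positioning tag.
        counts.update(tag_to_content.get(tag, []))
    # Sort by occurrence count, highest first; ties are broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [content for content, _ in ranked[:top_k]]


# Hypothetical user positioning tags and their associated content.
user_tags = ["price_sensitive", "frequent_buyer"]
content_index = {
    "price_sensitive": ["coupon", "flash_sale"],
    "frequent_buyer": ["coupon", "membership"],
}
to_recommend = rank_recommendations(user_tags, content_index, top_k=2)
print(to_recommend)
```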
According to the device provided by the embodiments of the present application, a target user to whom information is to be recommended is determined, test data and a plurality of user features associated with the target user are acquired, and the data set formed by the test data and the plurality of user features is divided into a training set and a verification set. A plurality of machine learning models are trained using the training set, and contribution analysis is performed based on the plurality of machine learning models and the verification set to obtain the contribution of each user feature to the modeling target of each machine learning model. Common features, whose contribution to the construction of each machine learning model is greater than the contribution threshold, are determined among the plurality of user features. A user positioning model is then constructed, the common features are input into the user positioning model to obtain a user positioning result of the target user, and information is recommended to the target user according to the user positioning result. By identifying contributions, the common features that contribute strongly to the modeling target of every machine learning model are screened out and the user is positioned around these common features, which prevents user features with low contribution from affecting the positioning result, enables accurate positioning of the user, and improves the precision of information recommendation.
It should be noted that, for other corresponding descriptions of each functional unit related to the information recommending apparatus provided in the embodiment of the present application, reference may be made to corresponding descriptions in fig. 1 and fig. 2A to fig. 2C, and no further description is given here.
It should be noted that, user information (including but not limited to user equipment information, user personal information, etc.) and data (including but not limited to data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The above embodiments represent only a few implementations of the present application. They are described in specific detail but are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art could make various modifications and improvements without departing from the spirit of the present application, all of which fall within the scope of protection of the present application. Accordingly, the scope of protection of the present application shall be subject to the appended claims.
In an exemplary embodiment, referring to fig. 4, there is also provided a computer device, which includes a bus, a processor, a memory, and a communication interface, and may further include an input-output interface and a display device, where the functional units communicate with one another through the bus. The memory stores a computer program, and the processor is configured to execute the program stored in the memory to perform the information recommendation method in the above embodiments.
There is also provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the information recommendation method.
From the above description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented in hardware, or by means of software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to perform the methods described in the various implementation scenarios of the present application.
Those skilled in the art will appreciate that the drawings are merely schematic illustrations of one preferred implementation scenario, and that the modules or flows in the drawings are not necessarily required to practice the present application.
Those skilled in the art will appreciate that the modules of an apparatus in an implementation scenario may be distributed within the apparatus as described for that implementation scenario, or may be relocated, with corresponding changes, to one or more apparatuses different from that of the implementation scenario. The modules of the implementation scenario may be combined into one module, or may be further split into a plurality of sub-modules.
The foregoing serial numbers of the present application are merely for description and do not represent the advantages or disadvantages of the implementation scenarios.
The foregoing disclosure is merely a few specific implementations of the present application, but the present application is not limited thereto and any variations that can be considered by a person skilled in the art shall fall within the protection scope of the present application.
Claims (10)
1. An information recommendation method, comprising:
determining a target user to whom information is to be recommended, acquiring test data and a plurality of user features associated with the target user, and dividing a data set consisting of the test data and the plurality of user features into a training set and a verification set;
training a plurality of machine learning models using the training set, performing contribution analysis based on the plurality of machine learning models and the verification set to obtain the contribution of each user feature to the modeling target of each machine learning model, and determining common features among the plurality of user features, wherein a common feature is a user feature whose contribution to the construction of each machine learning model is greater than a contribution threshold;
constructing a user positioning model, inputting the common features into the user positioning model to obtain a user positioning result of the target user, and recommending information to the target user according to the user positioning result.
2. The method of claim 1, wherein the acquiring test data and a plurality of user features associated with the target user and dividing a data set consisting of the test data and the plurality of user features into a training set and a verification set comprises:
determining a plurality of initial user features based on historical behavior data of the target user and user basic data, wherein the historical behavior data is data generated by historical behaviors of the target user and/or data generated by interactions with the target user in a historical period;
determining a feature type of each initial user feature in the plurality of initial user features, and processing the plurality of initial user features according to their feature types to obtain the plurality of user features, wherein the feature type is either a numerical feature or a category feature;
acquiring test data associated with the target user, and forming the data set from the test data and the plurality of user features, wherein the test data is obtained after the target user is tested against a specified target;
and determining a preset grouping proportion, and grouping the data set according to the preset grouping proportion to obtain the training set and the verification set.
3. The method of claim 2, wherein after the determining a plurality of initial user features based on historical behavior data of the target user and user basic data, the method further comprises:
identifying, from the plurality of initial user features, an initial user feature whose value is null as a missing user feature, and identifying the feature type of the missing user feature;
and obtaining a type identifier corresponding to the feature type, and filling the value of the missing user feature with the type identifier.
4. The method of claim 2, wherein the determining the feature type of each of the plurality of initial user features, and the processing the plurality of initial user features according to the feature type, respectively, to obtain the plurality of user features, comprises:
identifying the feature types of the plurality of initial user features, and determining at least one first initial user feature and at least one second initial user feature, wherein the feature type of the first initial user feature is a numerical feature, and the feature type of the second initial user feature is a category feature;
performing standardization and bucketing (binning) on the at least one first initial user feature, performing label encoding on the at least one second initial user feature, and taking the processed at least one first initial user feature and the processed at least one second initial user feature as the plurality of user features.
5. The method of claim 1, wherein training a plurality of machine learning models using the training set, performing a contribution analysis based on the plurality of machine learning models and the verification set to obtain a contribution of each user feature to a modeling target of each machine learning model, comprises:
determining a preset machine learning algorithm, and performing a plurality of model training operations on the training set according to the preset machine learning algorithm to obtain the plurality of machine learning models;
acquiring a contribution analysis tool, selecting any machine learning model from the plurality of machine learning models as a target machine learning model, and inputting the target machine learning model and the verification set into the contribution analysis tool to obtain a marginal contribution value of each user feature as the contribution of that user feature to the modeling target of the target machine learning model;
and continuing to select a machine learning model from the remaining machine learning models and input it together with the verification set into the contribution analysis tool until all of the plurality of machine learning models have been traversed, so as to obtain the contribution of each user feature to the modeling target of each machine learning model, wherein the remaining machine learning models are the machine learning models other than the target machine learning model in the plurality of machine learning models.
6. The method of claim 1, wherein the constructing a user positioning model comprises:
acquiring test data associated with the target user and acquiring a preset control data set;
computing the data difference, on the specified target, between the experimental data group and the control data group in the test data, wherein the test data is obtained after the target user is tested against the specified target;
determining a causal model algorithm, and performing model training on the data difference by using the causal model algorithm to obtain the user positioning model.
7. The method of claim 1, wherein the recommending information to the target user according to the user positioning result comprises:
extracting at least one user positioning tag from the user positioning result, and querying the recommended content associated with each user positioning tag in the at least one user positioning tag to obtain at least one recommended content;
counting the number of occurrences of each recommended content in the at least one recommended content, and sorting the at least one recommended content in descending order of the number of occurrences to obtain a content ranking result;
and selecting the one or more top-ranked recommended contents from the content ranking result as information to be recommended, and recommending the information to be recommended to the target user.
8. An information recommendation device, characterized by comprising:
an acquisition module, configured to determine a target user to whom information is to be recommended, acquire test data and a plurality of user features associated with the target user, and divide a data set consisting of the test data and the plurality of user features into a training set and a verification set;
an analysis module, configured to train a plurality of machine learning models using the training set, perform contribution analysis based on the plurality of machine learning models and the verification set to obtain the contribution of each user feature to the modeling target of each machine learning model, and determine common features among the plurality of user features, wherein a common feature is a user feature whose contribution to the construction of each machine learning model is greater than a contribution threshold;
and a recommending module, configured to construct a user positioning model, input the common features into the user positioning model to obtain a user positioning result of the target user, and recommend information to the target user according to the user positioning result.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310787586.XA CN116501979A (en) | 2023-06-30 | 2023-06-30 | Information recommendation method, information recommendation device, computer equipment and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310787586.XA CN116501979A (en) | 2023-06-30 | 2023-06-30 | Information recommendation method, information recommendation device, computer equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116501979A true CN116501979A (en) | 2023-07-28 |
Family
ID=87323541
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310787586.XA Pending CN116501979A (en) | 2023-06-30 | 2023-06-30 | Information recommendation method, information recommendation device, computer equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116501979A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117390292A (en) * | 2023-12-12 | 2024-01-12 | 深圳格隆汇信息科技有限公司 | Application program information recommendation method, system and equipment based on machine learning |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109934407A (en) * | 2019-03-14 | 2019-06-25 | 武汉大学 | A kind of volunteers working intention prediction technique based on Logistic generalized linear regression model |
CN112989217A (en) * | 2021-02-25 | 2021-06-18 | 清华大学 | System for managing human veins |
WO2022222224A1 (en) * | 2021-04-19 | 2022-10-27 | 平安科技(深圳)有限公司 | Deep learning model-based data augmentation method and apparatus, device, and medium |
CN115495663A (en) * | 2022-10-18 | 2022-12-20 | 康键信息技术(深圳)有限公司 | Information recommendation method and device, electronic equipment and storage medium |
CN116205310A (en) * | 2023-02-14 | 2023-06-02 | 中国水利水电科学研究院 | Soil water content influence factor sensitive interval judging method based on interpretable integrated learning model |
CN116258273A (en) * | 2023-03-31 | 2023-06-13 | 重庆长安汽车股份有限公司 | Hydraulic prediction method and system for wet double-clutch transmission, vehicle and storage medium |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117390292A (en) * | 2023-12-12 | 2024-01-12 | 深圳格隆汇信息科技有限公司 | Application program information recommendation method, system and equipment based on machine learning |
CN117390292B (en) * | 2023-12-12 | 2024-02-09 | 深圳格隆汇信息科技有限公司 | Application program information recommendation method, system and equipment based on machine learning |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109582876B (en) | Tourist industry user portrait construction method and device and computer equipment | |
CN109816483B (en) | Information recommendation method and device and readable storage medium | |
CN111815415A (en) | Commodity recommendation method, system and equipment | |
CN110008397B (en) | Recommendation model training method and device | |
CN112632405A (en) | Recommendation method, device, equipment and storage medium | |
CN113688326B (en) | Recommendation method, device, equipment and computer readable storage medium | |
CN114997916A (en) | Prediction method, system, electronic device and storage medium of potential user | |
CN116501979A (en) | Information recommendation method, information recommendation device, computer equipment and computer readable storage medium | |
CN116764236A (en) | Game prop recommending method, game prop recommending device, computer equipment and storage medium | |
CN114693409A (en) | Product matching method, device, computer equipment, storage medium and program product | |
CN113065911A (en) | Recommendation information generation method and device, storage medium and electronic equipment | |
CN113935788A (en) | Model evaluation method, device, equipment and computer readable storage medium | |
Kabra et al. | Potent real-time recommendations using multimodel contextual reinforcement learning | |
CN111475720A (en) | Recommendation method, recommendation device, server and storage medium | |
CN110705889A (en) | Enterprise screening method, device, equipment and storage medium | |
CN116304851A (en) | Data standard determining method, apparatus, device, medium and computer program product | |
CN114529399A (en) | User data processing method, device, computer equipment and storage medium | |
CN110503482B (en) | Article processing method, device, terminal and storage medium | |
US20200342302A1 (en) | Cognitive forecasting | |
CN113807870B (en) | Vehicle information authentication method, device, computer equipment and storage medium | |
US20240061866A1 (en) | Methods and systems for a standardized data asset generator based on ontologies detected in knowledge graphs of keywords for existing data assets | |
CN113420214B (en) | Electronic transaction object recommendation method, device and equipment | |
CN112801744B (en) | Activity recommendation method and device, electronic equipment and storage medium | |
CN117036041A (en) | Service information pushing method, device, computer equipment and storage medium | |
CN111754262A (en) | Pricing determination method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||