
WO2018090545A1 - Collaborative filtering method, apparatus, server and storage medium in combination with time factor (融合时间因素的协同过滤方法、装置、服务器和存储介质) - Google Patents

Collaborative filtering method, apparatus, server and storage medium in combination with time factor (融合时间因素的协同过滤方法、装置、服务器和存储介质)

Info

Publication number
WO2018090545A1
WO2018090545A1 PCT/CN2017/079565 CN2017079565W WO2018090545A1 WO 2018090545 A1 WO2018090545 A1 WO 2018090545A1 CN 2017079565 W CN2017079565 W CN 2017079565W WO 2018090545 A1 WO2018090545 A1 WO 2018090545A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
time period
model
user preference
smoothing
Prior art date
Application number
PCT/CN2017/079565
Other languages
English (en)
French (fr)
Inventor
曹路洋
王建明
肖京
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Priority to US15/578,368 (US10565525B2)
Priority to KR1020187015328 (KR102251302B1)
Priority to SG11201709930TA
Priority to JP2017566628 (JP6484730B2)
Priority to EP17801315.7A (EP3543941A4)
Priority to AU2017268629 (AU2017268629A1)
Publication of WO2018090545A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0269Targeted advertisements based on user profile or attribute
    • G06Q30/0271Personalized advertisement
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00Computing arrangements using knowledge-based models
    • G06N5/02Knowledge representation; Symbolic representation
    • G06N5/022Knowledge engineering; Knowledge acquisition
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201Market modelling; Market analysis; Collecting market data
    • G06Q30/0202Market predictions or forecasting for commercial activities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/02Marketing; Price estimation or determination; Fundraising
    • G06Q30/0241Advertisements
    • G06Q30/0251Targeted advertisements
    • G06Q30/0255Targeted advertisements based on user history
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Definitions

  • the present invention relates to the field of computer technologies, and in particular, to a collaborative filtering method, apparatus, server, and storage medium that incorporate time factors.
  • a collaborative filtering method, apparatus, server, and storage medium incorporating a time factor are provided.
  • a collaborative filtering method that integrates time factors including:
  • training is performed through the collaborative filtering model, and the predicted values of the plurality of user preferences to be predicted in the sparse matrix are calculated.
  • a collaborative filtering device that integrates time factors including:
  • a model building module for establishing an exponential smoothing model
  • an obtaining module configured to obtain a time span formulated for the exponential smoothing model, where the time span includes multiple time periods, and to obtain a plurality of user identifiers and the user preference values of those user identifiers for a specified product in the plurality of time periods;
  • a smoothing module configured to perform an iterative calculation on the user preference level value by using the exponential smoothing model, to obtain a smoothing result corresponding to a time period
  • a matrix generating module configured to generate a sparse matrix by using the user identifier and the smoothing result corresponding to the time period, where the sparse matrix includes a plurality of user preferences to be predicted;
  • the obtaining module is further configured to acquire a collaborative filtering model
  • a first training module configured to input the smoothing results corresponding to the time periods into the collaborative filtering model, and to perform training through the collaborative filtering model to calculate the predicted values of the plurality of user preferences to be predicted in the sparse matrix.
  • a server comprising a memory and a processor, the memory storing computer executable instructions, the computer executable instructions being executed by the processor, such that the processor performs the following steps:
  • training is performed through the collaborative filtering model, and the predicted values of the plurality of user preferences to be predicted in the sparse matrix are calculated.
  • One or more non-volatile readable storage media storing computer-executable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:
  • training is performed through the collaborative filtering model, and the predicted values of the plurality of user preferences to be predicted in the sparse matrix are calculated.
  • FIG. 1 is an application scenario diagram of a collaborative filtering method for merging time factors in an embodiment
  • FIG. 2 is a flow chart of a collaborative filtering method for merging time factors in an embodiment
  • Figure 3 is a schematic illustration of recorded points in a two-dimensional space in one embodiment
  • Figure 4 is a block diagram of a server in one embodiment
  • Figure 5 is a block diagram of a collaborative filtering device incorporating time factors in one embodiment
  • FIG. 6 is a block diagram of a collaborative filtering device incorporating time factors in another embodiment
  • Figure 7 is a block diagram of a collaborative filtering device incorporating time factors in still another embodiment
  • Figure 8 is a block diagram of a collaborative filtering device incorporating time factors in yet another embodiment.
  • the collaborative filtering method in combination with a time factor provided in the embodiments of the present application can be applied to the application scenario shown in FIG. 1.
  • the terminal 102 and the server 104 are connected through a network. There may be multiple terminals 102.
  • An application that can access the server is installed on the terminal 102.
  • the server 104 returns a corresponding page to the terminal 102. Users can click on, favorite, and purchase the products displayed on the page.
  • the server 104 can collect the user identification and the user behavior described above.
  • the server 104 obtains a user preference level value by collecting user behavior for a specified product within a preset time period.
  • the server 104 establishes an exponential smoothing model.
  • the server 104 can formulate a corresponding time span for the exponential smoothing model, and the time span can contain multiple time periods.
  • the server 104 obtains a plurality of user identities and user preference values for the specified products over a plurality of time periods.
  • the server 104 inputs the user preference level values corresponding to the plurality of time periods into the exponential smoothing model, and iteratively calculates the user preference level values of the plurality of time periods to obtain a plurality of smoothing results corresponding to the time periods.
  • the server 104 uses the user identifiers and the smoothing results corresponding to the time periods to generate a sparse matrix corresponding to the user identifiers and the product identifier, and the sparse matrix includes a plurality of user preferences to be predicted.
  • the server 104 acquires a collaborative filtering model, and inputs a smoothing result corresponding to the time period to the collaborative filtering model. Through the collaborative filtering model, the predicted values of the plurality of users to be predicted in the sparse matrix are calculated.
  • a collaborative filtering method in combination with a time factor is provided. It should be understood that although the steps in the flowchart of FIG. 2 are displayed in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include a plurality of sub-steps or stages, which are not necessarily completed at the same moment but may be executed at different moments; their execution order is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
  • the method is applied to the server as an example, and specifically includes:
  • Step 202: An exponential smoothing model is established.
  • Step 204: Obtain the time span formulated for the exponential smoothing model, where the time span includes multiple time periods.
  • User preference refers to the user's preference for a given product.
  • User preferences can be expressed in numerical values.
  • User preference data is pre-stored on the server.
  • the user preference data includes a user identifier, a product identifier, and a corresponding user preference value.
  • the user preference value may be obtained by the server collecting user behaviors for the specified product within a preset time period, and the user behaviors include clicking, purchasing, and favoriting.
  • the user preference degree may correspond to a time period. For different specified products, the time period corresponding to the user preference degree may be the same or different. For example, for a game product the time period corresponding to the user preference degree may be one day, while for an insurance product it may be, for example, one month.
  • the server establishes an exponential smoothing model.
  • the user preference of multiple time periods is fused by an exponential smoothing model.
  • the server can formulate a corresponding time span for the exponential smoothing model, and the time span can contain multiple time periods.
  • the time span can be formulated according to the characteristics of the specified product, and different specified products can have different time spans.
  • the time span formulated for the exponential smoothing model of a wealth management product may be one month, with the time periods within it measured in days.
  • the time span formulated for the exponential smoothing model of an insurance product may be one year, with the time periods within it measured in months.
  • the exponential coefficient can reflect the importance of the time period affecting the user's preference. The larger the exponential coefficient, the more important the impact of the time period on user preference. The closer the time periods are to each other, the greater the impact on user preference.
  • Step 206: Acquire multiple user identifiers and the user preference values of those user identifiers for the specified product in multiple time periods.
  • the time span formulated by the server for the exponential smoothing model includes multiple time periods, and the server acquires multiple user identifiers and the user preference values of those user identifiers for the specified product in the multiple time periods.
  • the user preference values for the specified product in the multiple time periods may be a user's preference values for a single specified product, or a user's preference values for multiple specified products.
  • Step 208 Perform an iterative calculation on the user preference value by using an exponential smoothing model to obtain a smoothing result corresponding to the time period.
  • the server inputs the user preference values corresponding to the multiple time periods into the exponential smoothing model and iteratively calculates the user preference values of the multiple time periods, obtaining a plurality of smoothing results corresponding to the time periods. Specifically, the server obtains the exponential coefficient corresponding to the exponential smoothing model according to the product identifier. The server multiplies the user preference value corresponding to the first time period of the formulated time span by the exponential coefficient and takes the product as the initial value of the exponential smoothing model; this initial value may also be referred to as the smoothing result corresponding to the first time period.
  • the server then inputs the smoothing result corresponding to the first time period, the user preference value corresponding to the second time period, and the exponential coefficient into the exponential smoothing model for iterative calculation, obtaining the smoothing result corresponding to the second time period.
  • by analogy, the server calculates the smoothing results corresponding to the multiple time periods.
  • for example, with an exponential coefficient of 0.3 and a formulated time span of 4 days, the model iteratively calculates the user preference values of the previous 4 days and obtains the corresponding smoothing results shown in Table 1.
  • This combines the user preference value of the specified product with the time factor through an exponential smoothing model.
  • Step 210 Generate a sparse matrix by using a user identifier and a smoothing result corresponding to the time period, where the sparse matrix includes a plurality of user preferences to be predicted.
  • the server uses the user identifiers and the smoothing results corresponding to the time periods to generate a sparse matrix corresponding to the user identifiers and the product identifier.
  • the sparse matrix may include multiple user identifiers and one product identifier, and may also include multiple user identifiers and multiple product identifiers.
  • the sparse matrix includes known user preference values and unknown user preference values. The unknown user preference values are the values of the user preferences to be predicted.
  • in the sparse matrix, a user preference to be predicted can be represented by a preset character, for example a question mark (?).
  • for example, the rows of the sparse matrix represent product identifiers, the columns represent user identifiers, and the values in the sparse matrix represent the users' preference values for the products, as shown in Table 2.
  • when the unknown user preference values of the next time period need to be predicted, the server obtains the product identifier, the user identifiers, and the smoothing results of the user preference values in the current time period to generate the sparse matrix corresponding to the user identifiers and the product identifier.
  • Step 212 Acquire a collaborative filtering model, and input the smoothing result corresponding to the time period to the collaborative filtering model.
  • Step 214: Perform training through the collaborative filtering model, and calculate the predicted values of the plurality of user preferences to be predicted in the sparse matrix.
  • the collaborative filtering model can use the traditional collaborative filtering model.
  • the server acquires a collaborative filtering model, and inputs the smoothing result corresponding to the time period to the collaborative filtering model.
  • training is performed through the collaborative filtering model, and the predicted values of the plurality of user preferences to be predicted in the sparse matrix are calculated.
  • specifically, when predicting the unknown user preference values of the next time period, the server obtains the smoothing results of the plurality of user identifiers in the previous time period and inputs the smoothing results of the previous time period into the collaborative filtering model.
  • training is performed through the collaborative filtering model, and the predicted values, for the next time period, of the user preferences to be predicted in the sparse matrix corresponding to the user identifiers and the product identifier are calculated.
  • in this embodiment, by establishing an exponential smoothing model, the user preference values in multiple time periods are iteratively calculated to obtain smoothing results corresponding to the time periods, so that the user preference values for the specified product are effectively fused with the time factor. When predicting the unknown user preference values of the next time period, the user identifiers and the smoothing results corresponding to the time periods can be used to generate a sparse matrix, the smoothing results corresponding to the time periods are input into the collaborative filtering model, and training is performed through the collaborative filtering model, so that the predicted values of the plurality of user preferences to be predicted in the sparse matrix are calculated. Since the smoothing results input into the collaborative filtering model are fused with the time factor, user preference values for the specified product that are associated with the time factor can be predicted. The user preference for the specified product is thus effectively predicted in combination with the time factor.
  • the method further includes: acquiring the dimensions corresponding to the user preference values; collecting statistics on the user preference values of the multiple dimensions according to the user identifier; regularizing the statistical results to obtain a multi-dimensional vector corresponding to the user identifier; and calculating the similarity of user preferences between user identifiers according to the multi-dimensional vectors.
  • the similarity calculation may be performed on all the known and predicted user preference values, thereby obtaining multiple user identifiers with similar user preferences.
  • the server can use the product identifiers as the dimensions corresponding to the user preference values. Different product identifiers are different dimensions.
  • the user preference values can be regarded as record points scattered in a space. Taking a map, i.e. a two-dimensional space, as an example, as shown in FIG. 3, each record point can be represented by a longitude and a latitude.
  • the X axis in FIG. 3 can represent latitude and the Y axis longitude. It is assumed that the user preference values of user identifier 1 are represented by the black record points in FIG. 3, and the user preference values of user identifier 2 by the gray record points. User identifier 1 has 4 record points and user identifier 2 has 3 record points.
  • in order to compare the user preference values effectively, the server clusters all the record points.
  • for example, the server can use the KMeans algorithm (a clustering algorithm) to cluster the record points into multiple classes.
  • Each class can have a corresponding dimension.
  • Each category includes a record point of a user preference value corresponding to a plurality of user IDs.
  • the server collects statistics on user preference levels of multiple dimensions according to the user identifier, and obtains a statistical result of the user preference value.
  • the server performs regularization processing on the statistical result to obtain a multi-dimensional vector corresponding to the user identifier, and calculates a similarity distance between the user identifiers according to the multi-dimensional vector, and uses the similarity distance as the similarity of the user's preference.
  • the recording points corresponding to the user ID 1 and the user ID 2 in FIG. 3 are taken as an example for description.
  • the server clusters the record points in Figure 3 to obtain three dimensions.
  • the user identifier 1 has 2 record points in the first dimension, 1 record point in the second dimension, and 1 record point in the third dimension.
  • User ID 2 has 2 record points in the first dimension, 1 record point in the second dimension, and 0 record points in the third dimension.
  • the server counts 4 record points in total for the user preference values corresponding to user identifier 1, and 3 record points in total for the user preference values corresponding to user identifier 2.
  • the server regularizes the statistical results to obtain the multi-dimensional vector (2/4, 1/4, 1/4) corresponding to user identifier 1 and the multi-dimensional vector (2/3, 1/3, 0) corresponding to user identifier 2.
  • the similarity distance between user identifier 1 and user identifier 2 is calculated from the multi-dimensional vectors, and this similarity distance is taken as the similarity of their user preferences.
  • the similarity distance can be calculated in various ways, for example using the Euclidean distance.
  • the method further includes: obtaining positive samples and negative samples corresponding to the user preference values according to the product identifier and the user identifiers; splitting the negative samples to obtain a plurality of split negative samples, where the difference between the number of split negative samples and the number of positive samples is within a preset range; obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain a plurality of trained classification models; and fitting the plurality of trained classification models to calculate the classification weight corresponding to each trained classification model.
  • the server may further obtain a positive sample and a negative sample corresponding to the user preference level value according to the product identifier and the user identifier.
  • a positive sample indicates that the user likes a product
  • a negative sample indicates that the user does not like a product.
  • the positive sample is that user 1 likes iPhone 7 (a type of mobile phone), and the negative sample is that user 2 does not like iPhone 7.
  • the user preference value includes a known user preference value and a predicted user preference value.
  • the server may perform classification training using known user preference values, or may perform classification training using known user preference values and predicted user preference values.
  • Positive and negative samples can be collectively referred to as samples.
  • Corresponding sample data is pre-stored on the server, and the sample data includes user characteristic data and product feature data.
  • the user feature data includes the age and gender of the user, and the product feature data includes the product identifier and the product type.
  • the number of users who like the new product is much smaller than the number of users who do not like the new product.
  • the number of positive samples for a product is less than the number of negative samples.
  • the first traditional approach is to obtain, by under-sampling the negative samples, a number of negative samples comparable to the positive samples, and to use the under-sampled negative samples and the positive samples for classification training. Because the under-sampled negative samples are only a small part of all the negative samples, however, not all of the sample data is used and the resulting classification model is not accurate enough.
  • the second traditional approach is to duplicate the positive samples so that the number of positive samples is roughly equal to the number of negative samples.
  • although the second traditional approach does not add any new sample information, because the number of negative samples is much larger than the number of positive samples, duplicating the positive samples causes the amount of data that needs to be computed to surge, which increases the computing burden on the server.
  • the server obtains a positive sample and a negative sample corresponding to the user preference value according to the product identifier and the user identifier.
  • the server splits the negative samples based on the number of positive samples. The difference between the number of negative samples in each split subset and the number of positive samples is within a preset range.
  • the number of negative samples in each split subset is equal or roughly equal to the number of positive samples.
  • the server obtains a classification model, wherein the classification model can adopt a traditional classification model.
  • the server inputs each split negative-sample subset together with the positive samples into the classification model for training, obtaining as many trained classification models as there are split negative-sample subsets.
  • the server obtains a regression model, wherein the regression model can adopt a traditional regression model.
  • the server inputs the output results of the plurality of trained classification models into the regression model, fits the plurality of trained classification models through the regression model, and calculates the classification weight corresponding to each trained classification model. Throughout this process, all the sample data is fully used while the amount of data to be computed does not surge, effectively relieving the computing burden on the server.
  • the method further comprises: acquiring sample data to be classified; and classifying the sample data by using the trained classification model and the classification weight.
  • the server may obtain sample data to be classified, input the sample data to be classified into the trained classification model, and classify the classified sample data by using each trained classification model and classification weight. This allows for fast and efficient classification of sample data to be processed.
  • a server in one embodiment, as shown in FIG. 4, includes a processor, an internal memory, a non-volatile storage medium, and a network interface connected through a system bus.
  • the non-volatile storage medium of the server stores an operating system and computer executable instructions, and the computer executable instructions are used to implement a collaborative filtering method suitable for a server.
  • the processor is used to provide computing and control capabilities to support the operation of the entire server.
  • the network interface is used to communicate with external terminals via a network connection.
  • the server can be implemented as a stand-alone server or as a server cluster consisting of multiple servers. It will be understood by those skilled in the art that the structure shown in FIG. 4 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the server to which the solution of the present application is applied.
  • a specific server may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a collaborative filtering device that integrates time factors is provided, including: a model establishing module 502, an obtaining module 504, a smoothing module 506, and a matrix generating module. 508 and a first training module 510, wherein:
  • the model building module 502 is configured to establish an exponential smoothing model.
  • the obtaining module 504 is configured to obtain the time span formulated for the exponential smoothing model, where the time span includes a plurality of time periods, and to obtain a plurality of user identifiers and the user preference values of those user identifiers for the specified product in the plurality of time periods.
  • the smoothing module 506 is configured to perform an iterative calculation on the user preference level value by using an exponential smoothing model to obtain a smoothing result corresponding to the time period.
  • the matrix generation module 508 is configured to generate a sparse matrix by using a user identifier and a smoothing result corresponding to the time period, where the sparse matrix includes a plurality of user preferences to be predicted.
  • the obtaining module 504 is further configured to acquire a collaborative filtering model.
  • the first training module 510 is configured to input the smoothing results corresponding to the time periods into the collaborative filtering model, and to perform training through the collaborative filtering model to calculate the predicted values of the plurality of user preferences to be predicted in the sparse matrix.
  • the formula of the smoothing model includes: P_(t+1) = a * P_t + (1 - a) * P_(t-1)
  • where a represents the exponential coefficient corresponding to the product identifier
  • P_(t+1) represents the user preference value corresponding to the next time period
  • P_t represents the user preference value corresponding to the current time period
  • P_(t-1) represents the user preference value corresponding to the previous time period.
  • the obtaining module 504 is further configured to obtain a dimension corresponding to the user preference value.
  • the device further includes: a statistics module 512, a regularization module 514, and a similarity calculation module 516, where:
  • the statistics module 512 is configured to perform statistics on user preference levels of multiple dimensions according to the user identifier.
  • the regularization module 514 is configured to perform regularization processing on the statistical result to obtain a multi-dimensional vector corresponding to the user identifier.
  • the similarity calculation module 516 is configured to calculate, according to the multi-dimensional vector, the similarity of user preferences between the user identifiers.
  • the obtaining module 504 is further configured to obtain positive samples and negative samples corresponding to the user preference values according to the product identifier and the user identifiers; as shown in FIG. 6, the device further includes: a splitting module 518, a second training module 520, and a fitting module 522, wherein:
  • the splitting module 518 is configured to split the negative sample to obtain a plurality of split negative samples, and the difference between the number of split negative samples and the number of positive samples is within a preset range.
  • the acquisition module 504 is also used to acquire a classification model.
  • the second training module 520 is configured to train the classification model by using the positive sample and the split negative sample to obtain a plurality of trained classification models.
  • the fitting module 522 is configured to fit a plurality of trained classification models, and calculate a classification weight corresponding to each of the trained classification models.
  • the obtaining module 504 is further configured to acquire sample data to be classified; as shown in FIG. 7, the apparatus further includes: a classification module 524, configured to classify the sample data by using the trained classification model and the classification weight .
  • the various modules in the above collaborative filtering device incorporating time factors may be implemented in whole or in part by software, hardware, or a combination thereof.
  • the above modules may be embedded, in hardware form, in or independent of the processor of the base station, or stored in software form in the memory of the base station, so that the processor can invoke and execute the operations corresponding to the above modules.
  • the processor may be a central processing unit (CPU) or a microprocessor.
  • one or more non-volatile readable storage media storing computer-executable instructions are provided, the computer-executable instructions being executed by one or more processors such that the one or more The processors perform the following steps:
  • training is performed through the collaborative filtering model, and the predicted values of the plurality of user preferences to be predicted in the sparse matrix are calculated.
  • the formula of the smoothing model includes: P_(t+1) = a * P_t + (1 - a) * P_(t-1)
  • where a represents the exponential coefficient corresponding to the product identifier
  • P_(t+1) represents the user preference value corresponding to the next time period
  • P_t represents the user preference value corresponding to the current time period
  • P_(t-1) represents the user preference value corresponding to the previous time period.
  • the one or more processors perform the following steps: obtaining a dimension corresponding to the user preference value; performing statistics on the user preference values of the multiple dimensions according to the user identifier; performing regularization processing on the statistical result to obtain a multi-dimensional vector corresponding to the user identifier And calculating a similarity of the user preferences of the user identifiers according to the multi-dimensional vector.
  • when the computer-executable instructions are executed by the one or more processors, the one or more processors further perform the following steps: obtaining positive samples and negative samples corresponding to the user preference values according to the product identifier and the user identifiers; splitting the negative samples to obtain a plurality of split negative samples, wherein the difference between the number of the split negative samples and the number of the positive samples is within a preset range; obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain a plurality of trained classification models; and fitting the plurality of trained classification models to calculate the classification weight corresponding to each trained classification model.
  • when the computer-executable instructions are executed by the one or more processors, the one or more processors further perform the following steps: obtaining sample data to be classified; and classifying the sample data to be classified using the trained classification models and the classification weights.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Data Mining & Analysis (AREA)
  • General Business, Economics & Management (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • Game Theory and Decision Science (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Complex Calculations (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A collaborative filtering method in combination with a time factor, comprising: establishing an exponential smoothing model (202); obtaining a time span formulated for the exponential smoothing model, the time span comprising multiple time periods (204); obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods (206); iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods (208); generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix comprising multiple user preferences to be predicted (210); obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model (212); and performing training through the collaborative filtering model to calculate predicted values of the multiple user preferences to be predicted in the sparse matrix (214).

Description

Collaborative filtering method, apparatus, server and storage medium in combination with time factor
This application claims priority to Chinese Patent Application No. 201611005200.1, entitled "Collaborative filtering method and apparatus in combination with time factor" and filed with the Chinese Patent Office on November 15, 2016, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computer technology, and in particular to a collaborative filtering method, apparatus, server and storage medium in combination with a time factor.
Background
Collecting users' preferences for products and analyzing and mining that data can effectively improve the accuracy with which product information is pushed. In the traditional approach, a user's degree of preference for a product is usually constructed from user behavior alone, for example clicks, favorites and purchases, so the time factor is not taken into account when predicting unknown user preference values. Suppose a user bought a certain product one year ago and has not bought it again this year. If that user's preference for the product one year ago is used to predict other users' preference for the product this year, the prediction clearly cannot reflect the actual situation. How to effectively predict users' preference values for a specified product in combination with the time factor has therefore become a technical problem to be solved.
Summary
According to various embodiments of the present application, a collaborative filtering method, apparatus, server and storage medium in combination with a time factor are provided.
A collaborative filtering method in combination with a time factor includes:
establishing an exponential smoothing model;
obtaining a time span formulated for the exponential smoothing model, the time span including multiple time periods;
obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix including multiple user preferences to be predicted;
obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
performing training through the collaborative filtering model to calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
A collaborative filtering apparatus in combination with a time factor includes:
a model building module, configured to establish an exponential smoothing model;
an obtaining module, configured to obtain a time span formulated for the exponential smoothing model, the time span including multiple time periods, and to obtain multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
a smoothing module, configured to iteratively calculate the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
a matrix generating module, configured to generate a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix including multiple user preferences to be predicted;
the obtaining module being further configured to obtain a collaborative filtering model; and
a first training module, configured to input the smoothing results corresponding to the time periods into the collaborative filtering model, and to perform training through the collaborative filtering model to calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
A server includes a memory and a processor, the memory storing computer-executable instructions which, when executed by the processor, cause the processor to perform the following steps:
establishing an exponential smoothing model;
obtaining a time span formulated for the exponential smoothing model, the time span including multiple time periods;
obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix including multiple user preferences to be predicted;
obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
performing training through the collaborative filtering model to calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
One or more non-volatile readable storage media storing computer-executable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
establishing an exponential smoothing model;
obtaining a time span formulated for the exponential smoothing model, the time span including multiple time periods;
obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix including multiple user preferences to be predicted;
obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
performing training through the collaborative filtering model to calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
The details of one or more embodiments of the present application are set forth in the accompanying drawings and the description below. Other features, objects and advantages of the present application will become apparent from the specification, the drawings and the claims.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings required in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art can obtain drawings of other embodiments from these drawings without creative effort.
FIG. 1 is an application scenario diagram of a collaborative filtering method in combination with a time factor in an embodiment;
FIG. 2 is a flowchart of a collaborative filtering method in combination with a time factor in an embodiment;
FIG. 3 is a schematic diagram of record points in a two-dimensional space in an embodiment;
FIG. 4 is a block diagram of a server in an embodiment;
FIG. 5 is a block diagram of a collaborative filtering apparatus in combination with a time factor in an embodiment;
FIG. 6 is a block diagram of a collaborative filtering apparatus in combination with a time factor in another embodiment;
FIG. 7 is a block diagram of a collaborative filtering apparatus in combination with a time factor in still another embodiment;
FIG. 8 is a block diagram of a collaborative filtering apparatus in combination with a time factor in yet another embodiment.
Detailed Description
To make the objects, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
The collaborative filtering method in combination with a time factor provided in the embodiments of the present application can be applied to the application scenario shown in FIG. 1. The terminal 102 and the server 104 are connected through a network, and there may be multiple terminals 102. An application that can access the server is installed on the terminal 102; when a user accesses the server 104 through the application, the server 104 returns a corresponding page to the terminal 102, and the user can click on, favorite and purchase the products displayed on the page. When the user operates through the terminal 102, the server 104 can collect the user identifier and the above user behaviors. The server 104 obtains user preference values by collecting user behaviors for a specified product within preset time periods. The server 104 establishes an exponential smoothing model. The server 104 can formulate a corresponding time span for the exponential smoothing model, and the time span can contain multiple time periods. The server 104 obtains multiple user identifiers and the user preference values of the user identifiers for the specified product in the multiple time periods. The server 104 inputs the user preference values corresponding to the multiple time periods into the exponential smoothing model and iteratively calculates the user preference values of the multiple time periods, obtaining multiple smoothing results corresponding to the time periods. The server 104 uses the user identifiers and the smoothing results corresponding to the time periods to generate a sparse matrix corresponding to the user identifiers and the product identifier; the sparse matrix includes multiple user preferences to be predicted. The server 104 obtains a collaborative filtering model and inputs the smoothing results corresponding to the time periods into the collaborative filtering model. Training is performed through the collaborative filtering model, and the predicted values of the multiple user preferences to be predicted in the sparse matrix are calculated.
In one embodiment, as shown in FIG. 2, a collaborative filtering method in combination with a time factor is provided. It should be understood that although the steps in the flowchart of FIG. 2 are shown in sequence as indicated by the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in FIG. 2 may include multiple sub-steps or stages; these sub-steps or stages are not necessarily completed at the same moment but may be executed at different moments, and their execution order is not necessarily sequential, as they may be performed in turn or alternately with other steps or with at least part of the sub-steps or stages of other steps. The method is described below by taking its application to a server as an example, and specifically includes:
Step 202: Establish an exponential smoothing model.
Step 204: Obtain the time span formulated for the exponential smoothing model; the time span includes multiple time periods.
The user preference degree refers to a user's degree of preference for a specified product and can be expressed as a numerical value. User preference data is pre-stored on the server; the user preference data includes user identifiers, product identifiers and the corresponding user preference values. A user preference value may be obtained by the server collecting user behaviors for the specified product within a preset time period, where the user behaviors include clicking, purchasing, favoriting and so on. The user preference degree may correspond to a time period. For different specified products, the time period corresponding to the user preference degree may be the same or different; for example, for a game product the time period corresponding to the user preference degree may be one day, while for an insurance product it may be, for example, one month.
In order to effectively combine a user's preference for a specified product with the time factor, the server establishes an exponential smoothing model, and the user preference degrees of multiple time periods are fused through the exponential smoothing model.
In one embodiment, the formula of the smoothing model includes: P_(t+1) = a * P_t + (1 - a) * P_(t-1), where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
The server can formulate a corresponding time span for the exponential smoothing model, and the time span can contain multiple time periods. The time span can be formulated according to the characteristics of the specified product, and different specified products can have different time spans. For example, the time span formulated for the exponential smoothing model of a wealth management product may be one month, with the time periods within it measured in days; the time span formulated for the exponential smoothing model of an insurance product may be one year, with the time periods within it measured in months.
Different specified products may correspond to different exponential coefficients. The exponential coefficient reflects how important a time period's influence on the user preference degree is: the larger the exponential coefficient, the more important the influence of the time period on the user preference degree. The closer time periods are to one another, the greater their influence on the user preference degree.
Step 206: Obtain multiple user identifiers and the user preference values of the user identifiers for the specified product in multiple time periods.
The time span formulated by the server for the exponential smoothing model includes multiple time periods, and the server obtains multiple user identifiers and the user preference values of those user identifiers for the specified product in the multiple time periods. The user preference values for the specified product in the multiple time periods may be a user's preference values for a single specified product, or a user's preference values for multiple specified products.
Step 208: Iteratively calculate the user preference values using the exponential smoothing model to obtain the smoothing results corresponding to the time periods.
The server inputs the user preference values corresponding to the multiple time periods into the exponential smoothing model and iteratively calculates them, obtaining multiple smoothing results corresponding to the time periods. Specifically, the server obtains the exponential coefficient corresponding to the exponential smoothing model according to the product identifier. The server multiplies the user preference value corresponding to the first time period of the formulated time span by the exponential coefficient and takes the product as the initial value of the exponential smoothing model; this initial value can also be called the smoothing result corresponding to the first time period. The server then inputs the smoothing result of the first time period, the user preference value corresponding to the second time period and the exponential coefficient into the exponential smoothing model for iterative calculation, obtaining the smoothing result corresponding to the second time period. By analogy, the server calculates the smoothing results corresponding to the multiple time periods.
Suppose the specified product is product 1, the time period is one day, the exponential coefficient in the exponential smoothing model is 0.3, the formulated time span is 4 days, and the user preference value of day 5 now needs to be predicted. The exponential smoothing model must first iteratively calculate the user preference values of the previous 4 days, giving the corresponding smoothing results shown in Table 1:
Table 1:
Day | User preference value | Smoothing result
Day 1 | 8 | 2.4
Day 2 | 9 | 4.38
Day 3 | 5 | 4.566
Day 4 | 3 | 4.096
The smoothing result of day 1 is 0.3*8 = 2.4; the smoothing result of day 2 is 0.3*9 + (1-0.3)*2.4 = 4.38; the smoothing result of day 3 is 0.3*5 + (1-0.3)*4.38 = 4.566; and the smoothing result of day 4 is 0.3*3 + (1-0.3)*4.566 = 4.096. The user preference values for the specified product are thus fused with the time factor through the exponential smoothing model.
Step 210: Generate a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods; the sparse matrix includes multiple user preferences to be predicted.
The server uses the user identifiers and the smoothing results corresponding to the time periods to generate a sparse matrix corresponding to the user identifiers and the product identifiers. The sparse matrix may include multiple user identifiers and one product identifier, or multiple user identifiers and multiple product identifiers. The sparse matrix includes known user preference values and unknown user preference values; the unknown user preference values are the values of the user preferences to be predicted.
In the sparse matrix, a user preference to be predicted can be represented by a preset character, for example a question mark (?). As an example, the rows of the sparse matrix represent product identifiers, the columns represent user identifiers, and the values in the sparse matrix represent the users' preference values for the products, as shown in Table 2 below:
Table 2: a sparse matrix of user preference values (rows: product identifiers; columns: user identifiers; entries to be predicted marked with "?").
Because the user preference values in the sparse matrix are the smoothing results corresponding to the time periods, the sparse matrix is also effectively fused with the time factor. When the unknown user preference values of the next time period need to be predicted, the server obtains the product identifiers, the user identifiers and the smoothing results of the user preference values in the current time period to generate the sparse matrix corresponding to the user identifiers and the product identifiers.
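To make the construction of such a matrix concrete, the snippet below pivots (user identifier, product identifier, smoothing result) records into a Table 2-style layout; it is only a sketch, the identifiers and values are invented for illustration, and pandas is merely one convenient tool for the pivot.

```python
import pandas as pd

# Hypothetical records: (user identifier, product identifier, smoothing result
# of the current time period). All identifiers and values are illustrative.
records = [
    ("user_1", "product_1", 4.096),
    ("user_1", "product_2", 1.750),
    ("user_2", "product_1", 2.310),
]
df = pd.DataFrame(records, columns=["user_id", "product_id", "smoothed_preference"])

# Rows are product identifiers and columns are user identifiers, as in Table 2.
# Missing (product, user) combinations become NaN - the "?" entries whose
# values the collaborative filtering model is later trained to predict.
sparse = df.pivot(index="product_id", columns="user_id", values="smoothed_preference")
print(sparse)
```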
Step 212: Obtain a collaborative filtering model, and input the smoothing results corresponding to the time periods into the collaborative filtering model.
Step 214: Perform training through the collaborative filtering model, and calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
The collaborative filtering model may be a traditional collaborative filtering model. The server obtains the collaborative filtering model and inputs the smoothing results corresponding to the time periods into it. Training is performed through the collaborative filtering model, and the predicted values of the multiple user preferences to be predicted in the sparse matrix are calculated.
Specifically, when predicting the unknown user preference values of the next time period, the server obtains the smoothing results of the multiple user identifiers in the previous time period and inputs the smoothing results of the previous time period into the collaborative filtering model. Training is performed through the collaborative filtering model, and the predicted values, for the next time period, of the user preferences to be predicted in the sparse matrix corresponding to the user identifiers and the product identifiers are calculated.
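The patent only calls for a traditional collaborative filtering model without fixing a particular one. The sketch below uses plain matrix factorization trained by stochastic gradient descent, one common realization of collaborative filtering, to fill in the unknown entries of the sparse matrix; the hyperparameters and the toy matrix are assumptions made purely for illustration.

```python
import numpy as np

def predict_missing(matrix, mask, k=2, lr=0.01, reg=0.1, epochs=500, seed=0):
    """Fill the unknown entries of a product-by-user preference matrix.

    matrix: smoothing results (rows: products, columns: users); unknown entries
            may hold any placeholder value.
    mask:   1 where the smoothing result is known, 0 where it is to be predicted.
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = matrix.shape
    P = rng.normal(scale=0.1, size=(n_rows, k))   # latent product factors
    Q = rng.normal(scale=0.1, size=(n_cols, k))   # latent user factors
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = matrix[i, j] - P[i] @ Q[j]
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * P[i] - reg * Q[j])
    return P @ Q.T   # predicted preference values for every (product, user) pair

# Toy Table 2-style matrix: rows are products, columns are users, 0.0 marks "?".
R = np.array([[4.10, 0.00, 2.40],
              [3.90, 1.20, 0.00]])
M = np.array([[1, 0, 1],
              [1, 1, 0]])
print(predict_missing(R, M).round(2))
```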
In this embodiment, by establishing an exponential smoothing model, the user preference values of multiple time periods are iteratively calculated to obtain smoothing results corresponding to the time periods, so that the user preference values for the specified product are effectively fused with the time factor. When predicting the unknown user preference values of the next time period, a sparse matrix can be generated from the user identifiers and the smoothing results corresponding to the time periods, the smoothing results corresponding to the time periods are input into the collaborative filtering model, and training is performed through the collaborative filtering model, so that the predicted values of the multiple user preferences to be predicted in the sparse matrix are calculated. Because the smoothing results input into the collaborative filtering model are fused with the time factor, user preference values for the specified product that are associated with the time factor can be predicted, and the user preference for the specified product is thus effectively predicted in combination with the time factor.
In one embodiment, after the step of calculating the predicted values of the multiple user preferences to be predicted in the sparse matrix, the method further includes: obtaining the dimensions corresponding to the user preference values; collecting statistics on the user preference values of the multiple dimensions according to the user identifiers; regularizing the statistical results to obtain a multi-dimensional vector corresponding to each user identifier; and calculating the similarity of the user preferences between user identifiers according to the multi-dimensional vectors.
In this embodiment, after the server calculates the predicted values for the multiple user preferences to be predicted in the sparse matrix, it may also perform a similarity calculation on all known and predicted user preference values, thereby obtaining multiple user identifiers with similar user preferences.
The server can use the product identifiers as the dimensions corresponding to the user preference values; different product identifiers are different dimensions. The user preference values can be regarded as record points scattered in a space. Taking a map, i.e. a two-dimensional space, as an example, as shown in FIG. 3, each record point can be represented by a longitude and a latitude. The X axis in FIG. 3 can represent latitude and the Y axis longitude. Suppose the user preference values of user identifier 1 are represented by the black record points in FIG. 3 and those of user identifier 2 by the gray record points; user identifier 1 has 4 record points and user identifier 2 has 3. Because the longitude and latitude of each record point differ, similarity cannot be compared directly, and if a mean point composed of the average longitude and average latitude were used for the comparison, the mean point would clearly deviate severely from the user's record points and could not express the real user preference values.
In order to compare the user preference values effectively, the server clusters all the record points; for example, the server can use the KMeans algorithm (a clustering algorithm) to obtain multiple classes. Each class can have a corresponding dimension, and each class includes record points of the user preference values corresponding to multiple user identifiers.
The server collects statistics on the user preference values of the multiple dimensions according to the user identifiers to obtain the statistical result of the user preference values. The server regularizes the statistical result to obtain a multi-dimensional vector corresponding to each user identifier, calculates the similarity distance between user identifiers according to the multi-dimensional vectors, and takes the similarity distance as the similarity of the users' preferences.
The record points corresponding to user identifier 1 and user identifier 2 in FIG. 3 are taken as an example for description. The server clusters the record points in FIG. 3 and obtains three dimensions. User identifier 1 has 2 record points in the first dimension, 1 in the second dimension and 1 in the third dimension; user identifier 2 has 2 record points in the first dimension, 1 in the second dimension and 0 in the third dimension. The server counts 4 record points in total for the user preference values corresponding to user identifier 1 and 3 record points in total for the user preference values corresponding to user identifier 2. The server regularizes the statistical results to obtain the multi-dimensional vector (2/4, 1/4, 1/4) corresponding to user identifier 1 and the multi-dimensional vector (2/3, 1/3, 0) corresponding to user identifier 2. The similarity distance between user identifier 1 and user identifier 2 is calculated from the multi-dimensional vectors, and this similarity distance is taken as the similarity of their user preferences. The similarity distance can be calculated in various ways, for example using the Euclidean distance.
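A minimal sketch of this similarity calculation, assuming scikit-learn's KMeans as the clustering algorithm (the patent names KMeans only as one example) and the Euclidean distance as the similarity distance, could look as follows; the function name, the record-point coordinates and the number of clusters are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def preference_similarity(points_by_user, n_clusters=3, seed=0):
    """Cluster all record points, then compare users through their normalized
    per-cluster counts (the regularized multi-dimensional vectors)."""
    users = list(points_by_user)
    all_points = np.vstack([points_by_user[u] for u in users])
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(all_points)

    vectors, start = {}, 0
    for u in users:
        n = len(points_by_user[u])
        counts = np.bincount(labels[start:start + n], minlength=n_clusters)
        vectors[u] = counts / counts.sum()   # e.g. (2/4, 1/4, 1/4) for user identifier 1
        start += n

    # The Euclidean distance between the vectors serves as the similarity distance.
    distances = {(a, b): float(np.linalg.norm(vectors[a] - vectors[b]))
                 for i, a in enumerate(users) for b in users[i + 1:]}
    return vectors, distances

# Record points as (longitude, latitude) pairs in the spirit of FIG. 3 (values made up).
points = {
    "user_1": np.array([[1.0, 2.0], [1.2, 2.1], [5.0, 5.1], [9.0, 1.0]]),
    "user_2": np.array([[1.1, 1.9], [1.3, 2.2], [5.2, 4.9]]),
}
print(preference_similarity(points))
```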
By calculating the similarity of user preferences between user identifiers, users with similar preferences can be effectively extracted from a massive user base, which in turn facilitates recommending messages to, and predicting the consumption tendencies of, users with similar preferences.
In one embodiment, the method further includes: obtaining positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers; splitting the negative samples to obtain multiple split negative samples, where the difference between the number of split negative samples and the number of positive samples is within a preset range; obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and fitting the multiple trained classification models to calculate the classification weight corresponding to each trained classification model.
In this embodiment, the server may also obtain positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers. A positive sample indicates that a user likes a product and a negative sample indicates that a user does not like a product; for example, a positive sample is that user 1 likes the iPhone 7 (a type of mobile phone), and a negative sample is that user 2 does not like the iPhone 7. The user preference values include known user preference values and predicted user preference values. The server may perform the classification training with the known user preference values alone, or with both the known and the predicted user preference values.
Positive samples and negative samples can be collectively referred to as samples. Corresponding sample data is pre-stored on the server; the sample data includes user feature data and product feature data, where the user feature data includes the user's age, gender and so on, and the product feature data includes the product identifier, product type and so on.
Usually, when a new product is launched, the number of users who like the new product is far smaller than the number of users who do not like it. As a result, the number of positive samples for a product is smaller than the number of negative samples.
There are two main traditional ways of performing the classification training. The first traditional way is to obtain, by under-sampling the negative samples, a number of negative samples comparable to the positive samples, and to use the under-sampled negative samples together with the positive samples for classification training. However, because the under-sampled negative samples are only a small part of all the negative samples, not all of the sample data is used and the resulting classification model is not accurate enough. The second traditional way is to duplicate the positive samples so that the number of positive samples is roughly equal to the number of negative samples. Although the second way does not add any new sample information, because the number of negative samples is far larger than the number of positive samples, duplicating the positive samples causes the amount of data to be computed to surge, increasing the computing burden on the server.
In order to effectively solve the problems of the traditional ways, in which the sample data is not fully used or using all of the sample data increases the computing burden on the server, a new classification training approach is provided in this embodiment.
Specifically, the server obtains the positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers. The server splits the negative samples based on the number of positive samples; the difference between the number of negative samples in each split subset and the number of positive samples is within a preset range, that is, the number of negative samples in each split subset is equal or roughly equal to the number of positive samples. The server obtains a classification model, which may be a traditional classification model. The server inputs each split negative-sample subset together with the positive samples into the classification model for training, obtaining as many trained classification models as there are split negative-sample subsets.
The server obtains a regression model, which may be a traditional regression model. The server inputs the outputs of the multiple trained classification models into the regression model, fits the multiple trained classification models through the regression model, and calculates the classification weight corresponding to each trained classification model. Throughout this process, all the sample data is fully used while the amount of data to be computed does not surge, effectively relieving the computing burden on the server.
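The splitting-and-weighting procedure can be sketched as follows. Logistic regression stands in for the traditional classification model and linear regression for the traditional regression model; these choices, the chunking details and the final thresholded classifier are assumptions made for illustration rather than requirements of the patent. The classify helper at the end anticipates the classification of new sample data described in the next paragraphs.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

def train_split_ensemble(X_pos, X_neg, X_fit, y_fit):
    """Split the negatives into positive-sized subsets, train one classifier per
    subset, then fit a regression over the classifiers' outputs to obtain the
    classification weight of each trained classifier."""
    n = max(len(X_pos), 1)
    models = []
    for start in range(0, len(X_neg), n):
        chunk = X_neg[start:start + n]          # one split negative-sample subset
        X = np.vstack([X_pos, chunk])
        y = np.concatenate([np.ones(len(X_pos)), np.zeros(len(chunk))])
        models.append(LogisticRegression(max_iter=1000).fit(X, y))

    # Each trained classifier scores the fitting samples; the regression model is
    # fitted on those scores, and its coefficients act as the classification weights.
    scores = np.column_stack([m.predict_proba(X_fit)[:, 1] for m in models])
    weights = LinearRegression().fit(scores, y_fit).coef_
    return models, weights

def classify(models, weights, X_new, threshold=0.5):
    """Classify new sample data using the trained classifiers and their weights."""
    scores = np.column_stack([m.predict_proba(X_new)[:, 1] for m in models])
    return (scores @ weights >= threshold).astype(int)
```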
In one embodiment, after the step of calculating the classification weight corresponding to each trained classification model, the method further includes: obtaining sample data to be classified; and classifying the sample data to be classified using the trained classification models and the classification weights.
The server can obtain sample data to be classified, input the sample data to be classified into the trained classification models, and classify it using each trained classification model and the classification weights. The sample data to be classified can thus be classified quickly and effectively.
In one embodiment, as shown in FIG. 4, a server is provided, which includes a processor, an internal memory, a non-volatile storage medium and a network interface connected through a system bus. The non-volatile storage medium of the server stores an operating system and computer-executable instructions, and the computer-executable instructions are used to implement the collaborative filtering method in combination with a time factor that is suitable for the server. The processor is used to provide computing and control capabilities and supports the operation of the whole server. The network interface is used to communicate with external terminals through a network connection. The server can be implemented as a stand-alone server or as a server cluster consisting of multiple servers. A person skilled in the art will understand that the structure shown in FIG. 4 is only a block diagram of part of the structure related to the solution of the present application and does not constitute a limitation on the server to which the solution of the present application is applied; a specific server may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
In one embodiment, as shown in FIG. 5, a collaborative filtering apparatus in combination with a time factor is provided, including a model building module 502, an obtaining module 504, a smoothing module 506, a matrix generating module 508 and a first training module 510, wherein:
the model building module 502 is configured to establish an exponential smoothing model;
the obtaining module 504 is configured to obtain the time span formulated for the exponential smoothing model, the time span including multiple time periods, and to obtain multiple user identifiers and the user preference values of the user identifiers for the specified product in the multiple time periods;
the smoothing module 506 is configured to iteratively calculate the user preference values using the exponential smoothing model to obtain the smoothing results corresponding to the time periods;
the matrix generating module 508 is configured to generate a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix including multiple user preferences to be predicted;
the obtaining module 504 is further configured to obtain a collaborative filtering model; and
the first training module 510 is configured to input the smoothing results corresponding to the time periods into the collaborative filtering model, and to perform training through the collaborative filtering model to calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
In one embodiment, the formula of the smoothing model includes:
P_(t+1) = a * P_t + (1 - a) * P_(t-1),
where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
In one embodiment, the obtaining module 504 is further configured to obtain the dimensions corresponding to the user preference values; as shown in FIG. 5, the apparatus further includes a statistics module 512, a regularization module 514 and a similarity calculation module 516, wherein:
the statistics module 512 is configured to collect statistics on the user preference values of the multiple dimensions according to the user identifiers;
the regularization module 514 is configured to regularize the statistical results to obtain the multi-dimensional vector corresponding to each user identifier; and
the similarity calculation module 516 is configured to calculate the similarity of user preferences between user identifiers according to the multi-dimensional vectors.
In one embodiment, the obtaining module 504 is further configured to obtain the positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers; as shown in FIG. 6, the apparatus further includes a splitting module 518, a second training module 520 and a fitting module 522, wherein:
the splitting module 518 is configured to split the negative samples to obtain multiple split negative samples, where the difference between the number of split negative samples and the number of positive samples is within a preset range;
the obtaining module 504 is further configured to obtain a classification model;
the second training module 520 is configured to train the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and
the fitting module 522 is configured to fit the multiple trained classification models and calculate the classification weight corresponding to each trained classification model.
In one embodiment, the obtaining module 504 is further configured to obtain sample data to be classified; as shown in FIG. 7, the apparatus further includes a classification module 524, configured to classify the sample data to be classified using the trained classification models and the classification weights.
The modules in the above collaborative filtering apparatus in combination with a time factor may be implemented in whole or in part by software, hardware or a combination thereof. The above modules may be embedded, in hardware form, in or independent of the processor of the base station, or stored in software form in the memory of the base station, so that the processor can invoke and execute the operations corresponding to the above modules. The processor may be a central processing unit (CPU), a microprocessor or the like.
In one embodiment, one or more non-volatile readable storage media storing computer-executable instructions are provided; when the computer-executable instructions are executed by one or more processors, the one or more processors perform the following steps:
establishing an exponential smoothing model;
obtaining the time span formulated for the exponential smoothing model, the time span including multiple time periods;
obtaining multiple user identifiers and the user preference values of the user identifiers for the specified product in the multiple time periods;
iteratively calculating the user preference values using the exponential smoothing model to obtain the smoothing results corresponding to the time periods;
generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix including multiple user preferences to be predicted;
obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
performing training through the collaborative filtering model to calculate the predicted values of the multiple user preferences to be predicted in the sparse matrix.
In one embodiment, the formula of the smoothing model includes:
P_(t+1) = a * P_t + (1 - a) * P_(t-1),
where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
In one embodiment, after the step of calculating the predicted values of the multiple user preferences to be predicted in the sparse matrix, the computer-executable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps: obtaining the dimensions corresponding to the user preference values; collecting statistics on the user preference values of the multiple dimensions according to the user identifiers; regularizing the statistical results to obtain the multi-dimensional vector corresponding to each user identifier; and calculating the similarity of user preferences between user identifiers according to the multi-dimensional vectors.
In one embodiment, the computer-executable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps: obtaining positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers; splitting the negative samples to obtain multiple split negative samples, where the difference between the number of split negative samples and the number of positive samples is within a preset range; obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and fitting the multiple trained classification models to calculate the classification weight corresponding to each trained classification model.
In one embodiment, after the step of calculating the classification weight corresponding to each trained classification model, the computer-executable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps: obtaining sample data to be classified; and classifying the sample data to be classified using the trained classification models and the classification weights.
A person of ordinary skill in the art will understand that all or part of the processes of the methods in the above embodiments can be implemented by a computer program instructing the relevant hardware; the program can be stored in a non-volatile computer-readable storage medium, and when the program is executed, it may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM) or the like.
The technical features of the above embodiments can be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features contains no contradiction, it should be regarded as falling within the scope of this specification.
The above embodiments express only several implementations of the present invention, and their descriptions are relatively specific and detailed, but they should not therefore be understood as limiting the scope of the invention patent. It should be noted that a person of ordinary skill in the art can make several variations and improvements without departing from the concept of the present invention, and these all fall within the protection scope of the present invention. Therefore, the protection scope of the patent of the present invention shall be subject to the appended claims.

Claims (20)

  1. A collaborative filtering method in combination with a time factor, comprising:
    establishing an exponential smoothing model;
    obtaining a time span formulated for the exponential smoothing model, the time span comprising multiple time periods;
    obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
    iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
    generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix comprising multiple user preferences to be predicted;
    obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
    performing training through the collaborative filtering model to calculate predicted values of the multiple user preferences to be predicted in the sparse matrix.
  2. The method according to claim 1, wherein the formula of the smoothing model comprises:
    P_(t+1) = a * P_t + (1 - a) * P_(t-1),
    where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
  3. The method according to claim 1, wherein after the step of calculating the predicted values of the multiple user preferences to be predicted in the sparse matrix, the method further comprises:
    obtaining dimensions corresponding to the user preference values;
    collecting statistics on the user preference values of the multiple dimensions according to the user identifiers;
    regularizing the statistical results to obtain a multi-dimensional vector corresponding to each user identifier; and
    calculating a similarity of user preferences between user identifiers according to the multi-dimensional vectors.
  4. The method according to claim 1, further comprising:
    obtaining positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers;
    splitting the negative samples to obtain multiple split negative samples, wherein a difference between the number of the split negative samples and the number of the positive samples is within a preset range;
    obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and
    fitting the multiple trained classification models to calculate a classification weight corresponding to each trained classification model.
  5. The method according to claim 4, wherein after the step of calculating the classification weight corresponding to each trained classification model, the method further comprises:
    obtaining sample data to be classified; and
    classifying the sample data to be classified using the trained classification models and the classification weights.
  6. A collaborative filtering apparatus in combination with a time factor, comprising:
    a model building module, configured to establish an exponential smoothing model;
    an obtaining module, configured to obtain a time span formulated for the exponential smoothing model, the time span comprising multiple time periods, and to obtain multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
    a smoothing module, configured to iteratively calculate the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
    a matrix generating module, configured to generate a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix comprising multiple user preferences to be predicted;
    the obtaining module being further configured to obtain a collaborative filtering model; and
    a first training module, configured to input the smoothing results corresponding to the time periods into the collaborative filtering model, and to perform training through the collaborative filtering model to calculate predicted values of the multiple user preferences to be predicted in the sparse matrix.
  7. The apparatus according to claim 6, wherein the formula of the smoothing model comprises:
    P_(t+1) = a * P_t + (1 - a) * P_(t-1),
    where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
  8. The apparatus according to claim 6, wherein the obtaining module is further configured to obtain dimensions corresponding to the user preference values;
    the apparatus further comprising:
    a statistics module, configured to collect statistics on the user preference values of the multiple dimensions according to the user identifiers;
    a regularization module, configured to regularize the statistical results to obtain a multi-dimensional vector corresponding to each user identifier; and
    a similarity calculation module, configured to calculate a similarity of user preferences between user identifiers according to the multi-dimensional vectors.
  9. The apparatus according to claim 6, wherein the obtaining module is further configured to obtain positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers;
    the apparatus further comprising:
    a splitting module, configured to split the negative samples to obtain multiple split negative samples, wherein a difference between the number of the split negative samples and the number of the positive samples is within a preset range;
    the obtaining module being further configured to obtain a classification model;
    a second training module, configured to train the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and
    a fitting module, configured to fit the multiple trained classification models and calculate a classification weight corresponding to each trained classification model.
  10. The apparatus according to claim 9, wherein the obtaining module is further configured to obtain sample data to be classified;
    the apparatus further comprising:
    a classification module, configured to classify the sample data to be classified using the trained classification models and the classification weights.
  11. A server, comprising a memory and a processor, the memory storing computer-executable instructions which, when executed by the processor, cause the processor to perform the following steps:
    establishing an exponential smoothing model;
    obtaining a time span formulated for the exponential smoothing model, the time span comprising multiple time periods;
    obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
    iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
    generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix comprising multiple user preferences to be predicted;
    obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
    performing training through the collaborative filtering model to calculate predicted values of the multiple user preferences to be predicted in the sparse matrix.
  12. The server according to claim 11, wherein the formula of the smoothing model comprises:
    P_(t+1) = a * P_t + (1 - a) * P_(t-1),
    where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
  13. The server according to claim 11, wherein after the step of calculating the predicted values of the multiple user preferences to be predicted in the sparse matrix, the processor is further caused to perform the following steps:
    obtaining dimensions corresponding to the user preference values;
    collecting statistics on the user preference values of the multiple dimensions according to the user identifiers;
    regularizing the statistical results to obtain a multi-dimensional vector corresponding to each user identifier; and
    calculating a similarity of user preferences between user identifiers according to the multi-dimensional vectors.
  14. The server according to claim 11, wherein the processor is further caused to perform the following steps:
    obtaining positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers;
    splitting the negative samples to obtain multiple split negative samples, wherein a difference between the number of the split negative samples and the number of the positive samples is within a preset range;
    obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and
    fitting the multiple trained classification models to calculate a classification weight corresponding to each trained classification model.
  15. The server according to claim 14, wherein after the step of calculating the classification weight corresponding to each trained classification model, the processor is further caused to perform the following steps:
    obtaining sample data to be classified; and
    classifying the sample data to be classified using the trained classification models and the classification weights.
  16. One or more non-volatile readable storage media storing computer-executable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    establishing an exponential smoothing model;
    obtaining a time span formulated for the exponential smoothing model, the time span comprising multiple time periods;
    obtaining multiple user identifiers and user preference values of the user identifiers for a specified product in the multiple time periods;
    iteratively calculating the user preference values using the exponential smoothing model to obtain smoothing results corresponding to the time periods;
    generating a sparse matrix using the user identifiers and the smoothing results corresponding to the time periods, the sparse matrix comprising multiple user preferences to be predicted;
    obtaining a collaborative filtering model, and inputting the smoothing results corresponding to the time periods into the collaborative filtering model; and
    performing training through the collaborative filtering model to calculate predicted values of the multiple user preferences to be predicted in the sparse matrix.
  17. The non-volatile readable storage media according to claim 16, wherein the formula of the smoothing model comprises:
    P_(t+1) = a * P_t + (1 - a) * P_(t-1),
    where a represents the exponential coefficient corresponding to the product identifier, P_(t+1) represents the user preference value corresponding to the next time period, P_t represents the user preference value corresponding to the current time period, and P_(t-1) represents the user preference value corresponding to the previous time period.
  18. The non-volatile readable storage media according to claim 16, wherein after the step of calculating the predicted values of the multiple user preferences to be predicted in the sparse matrix, the computer-executable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    obtaining dimensions corresponding to the user preference values;
    collecting statistics on the user preference values of the multiple dimensions according to the user identifiers;
    regularizing the statistical results to obtain a multi-dimensional vector corresponding to each user identifier; and
    calculating a similarity of user preferences between user identifiers according to the multi-dimensional vectors.
  19. The non-volatile readable storage media according to claim 16, wherein the computer-executable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    obtaining positive samples and negative samples corresponding to the user preference values according to the product identifiers and user identifiers;
    splitting the negative samples to obtain multiple split negative samples, wherein a difference between the number of the split negative samples and the number of the positive samples is within a preset range;
    obtaining a classification model, and training the classification model with the positive samples and the split negative samples to obtain multiple trained classification models; and
    fitting the multiple trained classification models to calculate a classification weight corresponding to each trained classification model.
  20. The non-volatile readable storage media according to claim 19, wherein after the step of calculating the classification weight corresponding to each trained classification model, the computer-executable instructions, when executed by the one or more processors, further cause the one or more processors to perform the following steps:
    obtaining sample data to be classified; and
    classifying the sample data to be classified using the trained classification models and the classification weights.
PCT/CN2017/079565 2016-11-15 2017-04-06 融合时间因素的协同过滤方法、装置、服务器和存储介质 WO2018090545A1 (zh)

Priority Applications (6)

Application Number Priority Date Filing Date Title
US15/578,368 US10565525B2 (en) 2016-11-15 2017-04-06 Collaborative filtering method, apparatus, server and storage medium in combination with time factor
KR1020187015328A KR102251302B1 (ko) 2016-11-15 2017-04-06 시간 인자와 결합한 협업 필터링 방법, 장치, 서버 및 저장 매체
SG11201709930TA SG11201709930TA (en) 2016-11-15 2017-04-06 Collaborative filtering method, apparatus, server and storage medium in combination with time factor
JP2017566628A JP6484730B2 (ja) 2016-11-15 2017-04-06 時間因子を融合させる協調フィルタリング方法、装置、サーバおよび記憶媒体
EP17801315.7A EP3543941A4 (en) 2016-11-15 2017-04-06 METHOD, DEVICE, SERVER AND STORAGE MEDIUM FOR COLLABORATIVE TIME FACTOR FILTER FILTERING
AU2017268629A AU2017268629A1 (en) 2016-11-15 2017-04-06 Collaborative filtering method, apparatus, server and storage medium in combination with time factor

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201611005200.1A CN106530010B (zh) 2016-11-15 2016-11-15 融合时间因素的协同过滤方法和装置
CN201611005200.1 2016-11-15

Publications (1)

Publication Number Publication Date
WO2018090545A1 true WO2018090545A1 (zh) 2018-05-24

Family

ID=58353220

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/079565 WO2018090545A1 (zh) 2016-11-15 2017-04-06 融合时间因素的协同过滤方法、装置、服务器和存储介质

Country Status (9)

Country Link
US (1) US10565525B2 (zh)
EP (1) EP3543941A4 (zh)
JP (1) JP6484730B2 (zh)
KR (1) KR102251302B1 (zh)
CN (1) CN106530010B (zh)
AU (2) AU2017268629A1 (zh)
SG (1) SG11201709930TA (zh)
TW (1) TWI658420B (zh)
WO (1) WO2018090545A1 (zh)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209977A (zh) * 2020-01-16 2020-05-29 北京百度网讯科技有限公司 分类模型的训练和使用方法、装置、设备和介质
CN111652741A (zh) * 2020-04-30 2020-09-11 中国平安财产保险股份有限公司 用户偏好分析方法、装置及可读存储介质

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110378731B (zh) * 2016-04-29 2021-04-20 腾讯科技(深圳)有限公司 获取用户画像的方法、装置、服务器及存储介质
CN106530010B (zh) 2016-11-15 2017-12-12 平安科技(深圳)有限公司 融合时间因素的协同过滤方法和装置
CN107633254A (zh) * 2017-07-25 2018-01-26 平安科技(深圳)有限公司 建立预测模型的装置、方法及计算机可读存储介质
CN109060332A (zh) * 2018-08-13 2018-12-21 重庆工商大学 一种基于协同过滤融合进行声波信号分析的机械设备诊断法
CN109800359B (zh) * 2018-12-20 2021-08-17 北京百度网讯科技有限公司 信息推荐处理方法、装置、电子设备及可读存储介质
US10984461B2 (en) * 2018-12-26 2021-04-20 Paypal, Inc. System and method for making content-based recommendations using a user profile likelihood model
CN110580311B (zh) * 2019-06-21 2023-08-01 东华大学 动态时间感知协同过滤方法
CN110458664B (zh) * 2019-08-06 2021-02-02 上海新共赢信息科技有限公司 一种用户出行信息预测方法、装置、设备及存储介质
CN110992127B (zh) * 2019-11-14 2023-09-29 北京沃东天骏信息技术有限公司 一种物品推荐方法及装置
CN111178986B (zh) * 2020-02-18 2023-04-07 电子科技大学 用户-商品偏好的预测方法及系统

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103235823A (zh) * 2013-05-06 2013-08-07 上海河广信息科技有限公司 根据相关网页和当前行为确定用户当前兴趣的方法和系统
US20150039620A1 (en) * 2013-07-31 2015-02-05 Google Inc. Creating personalized and continuous playlists for a content sharing platform based on user history
CN105975483A (zh) * 2016-04-25 2016-09-28 北京三快在线科技有限公司 一种基于用户偏好的消息推送方法和平台
CN106530010A (zh) * 2016-11-15 2017-03-22 平安科技(深圳)有限公司 融合时间因素的协同过滤方法和装置

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2438602A (en) * 2000-10-18 2002-04-29 Johnson & Johnson Consumer Intelligent performance-based product recommendation system
US8392228B2 (en) * 2010-03-24 2013-03-05 One Network Enterprises, Inc. Computer program product and method for sales forecasting and adjusting a sales forecast
US7657526B2 (en) * 2006-03-06 2010-02-02 Veveo, Inc. Methods and systems for selecting and presenting content based on activity level spikes associated with the content
JP5778626B2 (ja) 2012-06-18 2015-09-16 日本電信電話株式会社 アイテム利用促進装置、アイテム利用促進装置の動作方法およびコンピュータプログラム
KR101676219B1 (ko) * 2012-12-17 2016-11-14 아마데우스 에스.에이.에스. 상호작용식 검색 폼을 위한 추천 엔진
US9348924B2 (en) * 2013-03-15 2016-05-24 Yahoo! Inc. Almost online large scale collaborative filtering based recommendation system
US20140272914A1 (en) 2013-03-15 2014-09-18 William Marsh Rice University Sparse Factor Analysis for Learning Analytics and Content Analytics
CN103473354A (zh) * 2013-09-25 2013-12-25 焦点科技股份有限公司 基于电子商务平台的保险推荐系统框架及保险推荐方法
CN104021163B (zh) * 2014-05-28 2017-10-24 深圳市盛讯达科技股份有限公司 产品推荐系统及方法
CN104166668B (zh) * 2014-06-09 2018-02-23 南京邮电大学 基于folfm模型的新闻推荐系统及方法
KR101658714B1 (ko) * 2014-12-22 2016-09-21 연세대학교 산학협력단 온라인 활동 이력에 기초한 사용자의 온라인 활동 예측 방법 및 시스템
CN105205128B (zh) * 2015-09-14 2018-08-28 清华大学 基于评分特征的时序推荐方法及推荐装置
CN106960354A (zh) * 2016-01-11 2017-07-18 中国移动通信集团河北有限公司 一种基于客户生命周期的精准化推荐方法及装置
CN107194754A (zh) * 2017-04-11 2017-09-22 美林数据技术股份有限公司 基于混合协同过滤的券商产品推荐方法
CN107169052B (zh) * 2017-04-26 2019-03-05 北京小度信息科技有限公司 推荐方法及装置

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103235823A (zh) * 2013-05-06 2013-08-07 上海河广信息科技有限公司 根据相关网页和当前行为确定用户当前兴趣的方法和系统
US20150039620A1 (en) * 2013-07-31 2015-02-05 Google Inc. Creating personalized and continuous playlists for a content sharing platform based on user history
CN105975483A (zh) * 2016-04-25 2016-09-28 北京三快在线科技有限公司 一种基于用户偏好的消息推送方法和平台
CN106530010A (zh) * 2016-11-15 2017-03-22 平安科技(深圳)有限公司 融合时间因素的协同过滤方法和装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3543941A4 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209977A (zh) * 2020-01-16 2020-05-29 北京百度网讯科技有限公司 分类模型的训练和使用方法、装置、设备和介质
CN111209977B (zh) * 2020-01-16 2024-01-05 北京百度网讯科技有限公司 分类模型的训练和使用方法、装置、设备和介质
CN111652741A (zh) * 2020-04-30 2020-09-11 中国平安财产保险股份有限公司 用户偏好分析方法、装置及可读存储介质
CN111652741B (zh) * 2020-04-30 2023-06-09 中国平安财产保险股份有限公司 用户偏好分析方法、装置及可读存储介质

Also Published As

Publication number Publication date
US10565525B2 (en) 2020-02-18
JP6484730B2 (ja) 2019-03-13
CN106530010B (zh) 2017-12-12
TWI658420B (zh) 2019-05-01
JP2019507398A (ja) 2019-03-14
AU2017101862A4 (en) 2019-10-31
SG11201709930TA (en) 2018-06-28
KR20190084866A (ko) 2019-07-17
US20180300648A1 (en) 2018-10-18
CN106530010A (zh) 2017-03-22
EP3543941A4 (en) 2020-07-29
KR102251302B1 (ko) 2021-05-13
EP3543941A1 (en) 2019-09-25
AU2017268629A1 (en) 2018-05-31
TW201820231A (zh) 2018-06-01

Similar Documents

Publication Publication Date Title
WO2018090545A1 (zh) 融合时间因素的协同过滤方法、装置、服务器和存储介质
US10846643B2 (en) Method and system for predicting task completion of a time period based on task completion rates and data trend of prior time periods in view of attributes of tasks using machine learning models
TWI702844B (zh) 用戶特徵的生成方法、裝置、設備及儲存介質
CN108829808B (zh) 一种页面个性化排序方法、装置及电子设备
CN106503022B (zh) 推送推荐信息的方法和装置
US11109083B2 (en) Utilizing a deep generative model with task embedding for personalized targeting of digital content through multiple channels across client devices
WO2019076173A1 (zh) 内容推送方法、装置及计算机设备
US20210056458A1 (en) Predicting a persona class based on overlap-agnostic machine learning models for distributing persona-based digital content
CN110321422A (zh) 在线训练模型的方法、推送方法、装置以及设备
US11429653B2 (en) Generating estimated trait-intersection counts utilizing semantic-trait embeddings and machine learning
JP2017535857A (ja) 変換されたデータを用いた学習
CN111783810B (zh) 用于确定用户的属性信息的方法和装置
US10318540B1 (en) Providing an explanation of a missing fact estimate
WO2019061664A1 (zh) 电子装置、基于用户上网数据的产品推荐方法及存储介质
US20190273789A1 (en) Establishing and utilizing behavioral data thresholds for deep learning and other models to identify users across digital space
US20210241072A1 (en) Systems and methods of business categorization and service recommendation
Inoue et al. Estimating customer impatience in a service system with unobserved balking
Song et al. A novel QoS-aware prediction approach for dynamic web services
US20140324524A1 (en) Evolving a capped customer linkage model using genetic models
Mahendran et al. A model robust subsampling approach for Generalised Linear Models in big data settings
CN110704648B (zh) 用户行为属性的确定方法、装置、服务器及存储介质
US20230126932A1 (en) Recommended audience size
US20230316325A1 (en) Generation and implementation of a configurable measurement platform using artificial intelligence (ai) and machine learning (ml) based techniques
CN117319475A (zh) 通信资源推荐方法、装置、计算机设备和存储介质
CN118227677A (zh) 信息推荐及信息推荐模型处理方法、装置、设备和介质

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 11201709930T

Country of ref document: SG

Ref document number: 15578368

Country of ref document: US

WWE Wipo information: entry into national phase

Ref document number: 2017566628

Country of ref document: JP

ENP Entry into the national phase

Ref document number: 20187015328

Country of ref document: KR

Kind code of ref document: A

ENP Entry into the national phase

Ref document number: 2017268629

Country of ref document: AU

Date of ref document: 20170406

Kind code of ref document: A

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17801315

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE