
CN113515690A - Training method of content recall model, content recall method, device and equipment - Google Patents

Training method of content recall model, content recall method, device and equipment

Info

Publication number
CN113515690A
CN113515690A (Application CN202110003674.7A)
Authority
CN
China
Prior art keywords
content
recall
model
sample
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110003674.7A
Other languages
Chinese (zh)
Inventor
赵忠
傅妍玫
蒋宏伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110003674.7A priority Critical patent/CN113515690A/en
Publication of CN113515690A publication Critical patent/CN113515690A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00Commerce
    • G06Q30/06Buying, selling or leasing transactions
    • G06Q30/0601Electronic shopping [e-shopping]
    • G06Q30/0631Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Software Systems (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Development Economics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present application discloses a training method of a content recall model, a content recall method, a device, and equipment, and relates to the field of artificial intelligence. The method comprises the following steps: acquiring sample data, wherein the sample data comprises sample user data and sample content data; constructing a first content recall model and a second content recall model, wherein the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing the at least two content recall targets; and training the first content recall model and the second content recall model based on the sample data, wherein the second content recall model is trained in a knowledge distillation manner based on the output result of the first content recall model. Because the second content recall model is essentially a single-target recall model, subsequently using it for multi-target content recall improves content recall efficiency while ensuring content recall accuracy.

Description

Training method of content recall model, content recall method, device and equipment
Technical Field
The embodiment of the application relates to the field of artificial intelligence, in particular to a training method of a content recall model, a content recall method, a content recall device and content recall equipment.
Background
A recommendation system is a system that recommends content of interest to a user, and is widely applied in recommendation scenarios such as article recommendation, audio/video recommendation, and shopping recommendation.
The process of content recommendation generally includes recall and ranking. Recall refers to screening, from a content database, candidate content matching the user's interests; ranking refers to the fine screening and ordering of that candidate content. In the related art, content recall is generally realized by a content recall model obtained through deep-learning training. In one example, when article recall is required, a user feature is input into the content recall model to obtain a user feature vector, the inner product of the user feature vector and each article feature vector (output in advance by the content recall model from the input article features and stored) is calculated, and articles are recalled based on the inner products.
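The inner-product recall described above can be sketched as follows (a minimal illustration with made-up vectors and sizes; the actual model, feature dimensions, and index structure are not specified by this example):

```python
import numpy as np

# Item feature vectors, assumed to have been output by the content recall
# model from item features ahead of time and stored.
item_vectors = np.array([
    [0.1, 0.9, 0.2],   # item 0
    [0.8, 0.1, 0.3],   # item 1
    [0.4, 0.5, 0.6],   # item 2
])

# User feature vector, computed at request time from the user features.
user_vector = np.array([0.7, 0.2, 0.4])

# One inner product per item; recall the highest-scoring items.
scores = item_vectors @ user_vector
top_k = np.argsort(-scores)[:2]
print(top_k)  # → [1 2]
```

In production this linear scan is typically replaced by an approximate nearest-neighbor index over the stored item vectors, but the scoring rule is the same inner product.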
However, in the related art, each content recall model targets a single content recall target, such as a click target, a like target, or a comment target. When content satisfying multiple recall targets needs to be recalled (such as a click-and-like target), multiple content recall models must be trained and maintained, each model must be used for recall separately, and the recalled content must then be merged, which results in low content recall efficiency in multi-target recall scenarios.
Disclosure of Invention
The embodiments of the present application provide a training method of a content recall model, a content recall method, a device, and equipment, which can improve content recall efficiency in multi-content-recall-target scenarios. The technical scheme is as follows:
in one aspect, an embodiment of the present application provides a method for training a content recall model, where the method includes:
acquiring sample data, wherein the sample data comprises sample user data and sample content data, and the sample content data corresponds to at least two content recall targets;
constructing a first content recall model and a second content recall model, wherein the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
training the first content recall model and the second content recall model based on the sample data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on an output result of the first content recall model.
In another aspect, an embodiment of the present application provides a content recall method, where the method includes:
receiving a content recall request, wherein the content recall request comprises user data;
performing feature extraction on the user data through a second content recall model to obtain a user feature vector corresponding to the user data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on an output result of a first content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and determining recalled target content data based on the user feature vector and a fusion feature vector corresponding to the content data, wherein the fusion feature vector is obtained by performing feature extraction on the content data by the second content recall model.
In another aspect, an embodiment of the present application provides an apparatus for training a content recall model, where the apparatus includes:
the system comprises a sample acquisition module, a content recall module and a content recall module, wherein the sample acquisition module is used for acquiring sample data, the sample data comprises sample user data and sample content data, and the sample content data corresponds to at least two content recall targets;
the model construction module is used for constructing a first content recall model and a second content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and the model training module is used for training the first content recall model and the second content recall model based on the sample data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on the output result of the first content recall model.
In another aspect, an embodiment of the present application provides a content recall apparatus, where the apparatus includes:
the device comprises a request receiving module, a content recall module and a content recall module, wherein the request receiving module is used for receiving a content recall request which comprises user data;
the user feature extraction module is used for performing feature extraction on the user data through a second content recall model to obtain a user feature vector corresponding to the user data, the second content recall model is obtained by training in a knowledge distillation mode based on an output result of a first content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and the recall module is used for determining recalled target content data based on the user feature vector and a fusion feature vector corresponding to the content data, wherein the fusion feature vector is obtained by performing feature extraction on the content data by the second content recall model.
In another aspect, an embodiment of the present application provides a computer device, which includes a processor and a memory, where the memory stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the method for training a content recall model according to the above aspect, or to implement the method for content recall according to the above aspect.
In another aspect, the present application provides a computer-readable storage medium, where at least one instruction is stored, and the at least one instruction is loaded and executed by a processor to implement the method for training a content recall model according to the above aspect, or to implement the method for content recall according to the above aspect.
In another aspect, embodiments of the present application provide a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the training method of the content recall model provided in the above aspect, or to implement the content recall method as described in the above aspect.
In the embodiments of the present application, knowledge distillation is adopted to distill multiple content recall targets into a single content recall target. In the content recall model training stage, a first content recall model corresponding to at least two content recall targets and a second content recall model corresponding to a fusion recall target are constructed, and both are trained using sample data, so that the second content recall model learns the knowledge of the first content recall model through knowledge distillation and thereby acquires a content recall capability similar to that of the first content recall model. Because the second content recall model is essentially a single-target recall model, subsequently using it for multi-target content recall improves content recall efficiency while ensuring content recall accuracy.
Drawings
To illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present application; those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram illustrating an implementation of a content recall model training and application process provided by an embodiment of the present application;
FIG. 2 illustrates a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application;
FIG. 3 illustrates a flow chart of a method of training a content recall model provided by an exemplary embodiment of the present application;
FIG. 4 illustrates a flow chart of a method of training a content recall model provided by another exemplary embodiment of the present application;
FIG. 5 is a diagram illustrating an implementation of a word vector embedding process according to an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of an MLP layer structure shown in an exemplary embodiment of the present application;
FIG. 7 is a schematic diagram of an implementation of a model training process shown in an exemplary embodiment of the present application;
FIG. 8 is a schematic diagram of an implementation of a model training process shown in another exemplary embodiment of the present application;
FIG. 9 is a flow chart illustrating a method of content recall in accordance with an exemplary embodiment of the present application;
FIG. 10 is a schematic diagram illustrating an implementation of a content recall process according to an exemplary embodiment of the present application;
FIG. 11 is a comparison of recall effects shown in an exemplary embodiment of the present application;
FIG. 12 is a block diagram of an apparatus for training a content recall model according to an exemplary embodiment of the present application;
fig. 13 is a block diagram illustrating a content recall apparatus according to an exemplary embodiment of the present application;
fig. 14 shows a schematic structural diagram of a computer device provided in an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
The training method of the content recall model and the content recall method provided by the embodiments of the present application can be applied to scenarios requiring content recommendation, such as article recommendation, shopping recommendation, audio/video recommendation, advertisement recommendation, and social account recommendation, where multi-target content recall is needed when content recommendation is performed. Taking the article recommendation scenario as an example, when an article server recommends articles, it recalls articles with the targets that the user clicks to read and that the reading duration exceeds a duration threshold; taking the video recommendation scenario as an example, the video server recommends videos with the targets that the user clicks to watch and likes them; taking the shopping recommendation scenario as an example, the shopping server recalls goods with the targets that the user clicks to view the goods and purchases them.
Taking the advertisement recommendation scenario as an example, in order to improve the click probability of advertisements inserted between application contents (such as an image-text advertisement inserted between news items in a news application, an image-text advertisement inserted between social posts in a social application, or a video advertisement inserted between videos in a video application) and to improve the purchase probability of the goods corresponding to an advertisement after it is clicked, the advertisement recommendation server acquires user data from the application server and, with advertisement clicks and purchases of the corresponding goods as the targets, recalls advertisement data matching the user's characteristics from an advertisement database for advertisement recommendation.
Taking the social account recommendation scenario as an example, in order to improve the efficiency with which a user adds accounts of interest in a social application, the social server acquires the user portrait and, with the targets that the recommended social account is clicked and that a social relationship (such as a friend or follow relationship) is established with it, recalls social accounts matching the user's interests from a social account database and recommends them.
Of course, in addition to the above several possible application scenarios, the training method of the content recall model and the content recall method provided in the embodiment of the present application may be applied to any scenario that requires multi-target content recommendation, and the embodiment of the present application does not limit a specific application scenario.
Before performing content recall, the computer device carries out model training using the training method of the content recall model provided by the embodiments of the present application. As shown in fig. 1, the computer device first obtains sample data 11 (including sample user data and sample content data), constructs a first content recall model 12 based on at least two content recall targets, and constructs a second content recall model 13 (equivalent to a single-target content recall model) based on a fusion recall target obtained by fusing the at least two content recall targets.
Further, the computer device performs model training on the first content recall model 12 using the sample data 11, so that the trained first content recall model 12 has multi-target content recall capability; at the same time, the computer device trains the second content recall model 13 in a knowledge distillation manner, so that the second content recall model 13 learns the multi-target content recall capability of the first content recall model 12.
After model training is completed, the computer device deploys the second content recall model 13 on the server. The server performs feature extraction on the content in the content database 14 using the second content recall model 13 to obtain a single content feature vector 15 for each content item, and stores these vectors. When multi-target content recall is subsequently performed, the server performs feature extraction on the user data 16 using the second content recall model 13 to obtain a single user feature vector 17 for the user, screens recall content 18 satisfying the multiple targets from the content database based on the user feature vector and the content feature vectors, and performs content recommendation based on the screened recall content 18.
FIG. 2 illustrates a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application. The implementation environment includes a terminal 210 and a server 220. The data communication between the terminal 210 and the server 220 is performed through a communication network, optionally, the communication network may be a wired network or a wireless network, and the communication network may be at least one of a local area network, a metropolitan area network, and a wide area network.
The terminal 210 is an electronic device with content recommendation requirements; the electronic device may be a mobile terminal such as a smartphone, a tablet computer, or a laptop computer, or a terminal such as a desktop computer or a projection computer, which is not limited in the embodiments of the present application.
Optionally, the application installed in the terminal 210 has a content recommendation requirement, and the application includes a news reading application (recommending articles of interest to the user), a video application (recommending videos of interest to the user), a music application (recommending music meeting the user's preference), a shopping application (recommending goods meeting the user's preference), a social application (recommending social accounts of interest to the user), and the like.
In fig. 2, a news reading application is installed in the terminal 210, and the news reading application has a function of recommending articles according to the user's interest and the history news reading record.
The server 220 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
Optionally, the server 220 is a background server for the application (having a content recommendation requirement) in the terminal 210, and is configured to provide a content recommendation service for the application. As shown in fig. 2, the server 220 is a background server of a news reading application program and is used for providing article recommendation service.
In this embodiment, the server 220 builds a first article recall model 221 (equivalent to a multi-target recall model) based on at least two article recall targets (such as an article click target and an article reading duration target) in advance according to article data, user data and article historical reading records in an article database, and trains a second article recall model 222 (equivalent to a single-target recall model) by a knowledge distillation method based on a fused recall target obtained by fusing the at least two article recall targets (such as an article click target and a reading duration greater than a duration threshold).
After the training of the second article recall model 222 is completed, the server 220 deploys the second article recall model 222 as a target article recall model, performs feature extraction on the articles in the article database 223 by using the second article recall model 222 to obtain article feature vectors corresponding to the articles, and constructs a vector index for the article feature vectors to form an article feature vector index database 224.
In the subsequent article recommendation process, after receiving a recommendation request sent by the terminal 210, the server 220 first performs feature extraction on the user data of the user corresponding to the terminal 210 by using the second article recall model 222 to obtain a user feature vector 225, and then determines a recalled article 226 based on the user feature vector 225 and the article feature vector index library 224, so as to perform article recommendation on the basis of the recalled article 226.
In other possible embodiments, the model training process may also be executed by a computer, and the computer deploys the trained second article recall model on the server side, which is not limited in this embodiment.
For convenience of description, the following embodiments are described as examples of the training of the content recall model and the execution of the content recall method by a computer device.
FIG. 3 illustrates a flowchart of a method for training a content recall model provided by an exemplary embodiment of the present application. The embodiment is described by taking the method as an example for a computer device, and the method comprises the following steps.
Step 301, sample data is obtained, wherein the sample data includes sample user data and sample content data, and the sample content data corresponds to at least two content recall targets.
The sample user data in the sample data characterizes multi-dimensional features of the user, such as historical content-viewing record features, the user portrait, and context features (including time, network, and geographic-location features at the time of operation); the sample content data characterizes multi-dimensional features of the content, such as content type, content release time, and content length features. Moreover, since a content recall model with multi-target recall capability ultimately needs to be trained, the sample content data corresponds to multiple content recall targets.
In one possible implementation, the computer device acquires a historical operation record of the content by the user, and accordingly generates sample data based on the historical operation record.
Taking an article as an example, when the content recall target includes an article click and the article reading duration is greater than the duration threshold, the historical operation record acquired by the computer device is the click reading record of the user on the article and the article reading duration, and correspondingly, the generated sample data includes user data of the user, the article data, the click behavior data of the user on the article and the article reading duration data of the user.
Taking a video as an example, when the content recall target includes video watching and video approval, the historical operation record acquired by the computer device is a watching record and approval record of the user on the video, and correspondingly, the generated sample data includes user data of the user, video data, watching behavior data of the user on the video, and approval behavior data of the user on the video.
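A single training record for the video scenario above might be organized as follows (a hypothetical sketch; the field names and values are illustrative assumptions, not the patent's actual data schema):

```python
# One training example: user-side features, content-side features, and a
# label per recall target (watch and like), derived from historical
# operation records.
sample = {
    "user": {
        "history": ["v101", "v205"],   # historical viewing record
        "portrait": {"age_bucket": "18-24", "gender": "f"},
        "context": {"hour": 21, "network": "wifi", "region": "shenzhen"},
    },
    "content": {
        "video_id": "v307",
        "category": "sports",
        "duration_s": 95,
        "publish_ts": 1609430400,
    },
    "labels": {"watched": 1, "liked": 0},  # one label per recall target
}
```

Each recall target contributes its own supervision signal, which is what lets the teacher model below be trained with one output branch per target.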
Step 302, a first content recall model and a second content recall model are established, wherein the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing the at least two content recall targets.
In the embodiment of the application, in the model training stage, the computer device needs to train two content recall models, namely a first content recall model for multiple targets and a second content recall model for a single target, where the multiple targets are at least two content recall targets corresponding to the sample content data, and the single target is a fused recall target obtained by fusing the at least two content recall targets.
Taking articles as an example, the recall targets of the first content recall model include article click and article reading duration reaching a duration threshold, while the recall target of the second content recall model is the fused target that the article is clicked and its reading duration reaches the duration threshold. Taking videos as an example, the recall targets of the first content recall model include video watching and video liking, while the recall target of the second content recall model is the fused target that the video is watched and liked.
In some embodiments, the first content recall model differs structurally from the second content recall model. The first content recall model has at least two target recall functions, so processing sample content data through it yields at least two content feature vectors (each corresponding to a different target); the second content recall model has a single-target (fusion-target) recall function, so processing the sample content data through it yields only a single content feature vector. That is, the first content recall model contains at least two network branches that output content feature vectors, while the second content recall model contains only one.
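The structural difference described above — a teacher with one output branch per target versus a student with a single fused branch — can be sketched as follows (a minimal numpy illustration with assumed layer sizes and random weights; the patent does not specify the actual network architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(x, layers):
    # Tiny stand-in for an embedding + MLP tower (hypothetical sizes):
    # alternating linear layers, with ReLU between them.
    for i, w in enumerate(layers):
        x = x @ w
        if i < len(layers) - 1:
            x = np.maximum(x, 0.0)
    return x

content = rng.normal(size=8)  # raw content features

# Teacher (first model): shared trunk, then one output branch per
# recall target (e.g. click and like), so it emits two content vectors.
trunk = [rng.normal(size=(8, 16))]
click_head = [rng.normal(size=(16, 4))]
like_head = [rng.normal(size=(16, 4))]
hidden = mlp(content, trunk)
teacher_vecs = [mlp(hidden, click_head), mlp(hidden, like_head)]

# Student (second model): a single branch for the fusion target,
# so it emits only one content vector.
student_layers = [rng.normal(size=(8, 16)), rng.normal(size=(16, 4))]
student_vec = mlp(content, student_layers)
```

Only the student's single vector per item needs to be stored and indexed online, which is the source of the efficiency gain claimed later in the description.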
Step 303, training a first content recall model and a second content recall model based on the sample data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on an output result of the first content recall model.
After the model construction is completed, the computer device trains the first content recall model and the second content recall model based on the sample data, with knowledge distillation integrated into the training process: the first content recall model serves as the teacher model, and the second content recall model serves as the student model.
In one possible implementation, the computer device trains the first content recall model with the at least two content recall targets of the sample content data as supervision, so that the first content recall model acquires multi-target recall capability. When training the second content recall model, the computer device uses the output result of the first content recall model as a soft target, so that the output of the second content recall model approaches that of the first content recall model as closely as possible, thereby acquiring a multi-target recall capability similar to that of the first content recall model. The following embodiments detail the model training process.
After the training of the first content recall model and the second content recall model is completed, because the second content recall model has learned the model knowledge of the first content recall model (that is, the recall effects of the two models are basically the same), only the second content recall model needs to be deployed on the server for online recall; the first content recall model does not need to be deployed or maintained, which reduces the maintenance cost of the online model.
Through the knowledge distillation technique, multiple targets are distilled into a single fused target, so that the fused target achieves an effect equivalent to the multiple targets. Performing online recall with multiple trained single-target content recall models (each corresponding to a different target) would require generating and storing multiple content feature vectors for the content data, and computing the user feature vector against each content feature vector separately. Because the second content recall model generates only a single content feature vector from the content data, the occupied storage space is significantly reduced (equivalent to the storage resources of single-target recall), and only the user feature vector and the single content feature vector need to be computed, so the computation amount is significantly reduced (equivalent to the computing resources of single-target recall), thereby improving content recall efficiency.
In summary, in the embodiments of the present application, knowledge distillation is used to distill multiple content recall targets into a single content recall target. In the content recall model training stage, a first content recall model corresponding to at least two content recall targets and a second content recall model corresponding to a fused recall target are constructed, and both models are trained with sample data, so that the second content recall model learns the knowledge characteristics of the first content recall model through knowledge distillation and thereby acquires content recall capability similar to that of the first content recall model. Because the second content recall model is essentially a single-target recall model, content recall efficiency can be improved while content recall accuracy is ensured when the second content recall model is subsequently used for multi-target content recall.
In one possible implementation, the computer device fuses the loss function of the first content recall model and the loss function of the second content recall model according to the respective model training targets of the two models, so as to jointly train the two models with the fused loss function. Step 303 may include the following steps.
First, a first loss function of the first content recall model under sample data is determined.
The first loss function is used for representing the degree of deviation between the operation result predicted by the first content recall model based on the sample user data and sample content data and the user's actual operation result on the content. Because the first content recall model corresponds to multiple content recall targets, the first loss function includes the loss of the first content recall model under a single recall target.
In addition, so that the second content recall model can subsequently learn multi-target recall through knowledge distillation, the first loss function further includes the loss of the first content recall model under the multi-recall target.
In one possible implementation, when the at least two content recall targets include content click and content conversion, the first loss function includes the loss of the first content recall model under the content click target (the difference between the click event predicted by the first content recall model and the actual click event of the content) and the loss of the first content recall model under the content click-and-conversion target (the difference between the click-conversion event predicted by the first content recall model and the actual click-conversion event of the content). A click is a precondition of a conversion; that is, content can be converted only after it is clicked. For example, a conversion may refer to the reading duration of an article reaching a duration threshold, a like operation, a share operation, a favorite operation, and the like.
Second, a second loss function of the second content recall model under the sample data is determined based on the output result of the first content recall model.
In one possible implementation, since the second content recall model corresponds to the fused recall target, the computer device uses the output result of the first content recall model under the multi-recall target as supervision for the output result of the second content recall model, and determines the second loss function of the second content recall model under the sample data accordingly. The second loss function represents the difference between the multi-target recall result of the first content recall model and the fused recall result of the second content recall model.
Continuing the example in the above steps, the computer device determines the second loss function based on the output of the first content recall model under the content click-and-conversion target and the output of the second content recall model under the fused recall target (content click and conversion).
Third, the first loss function and the second loss function are fused to obtain a target loss function.
Further, the computer device fuses the first loss function and the second loss function as a total loss function for subsequent training of the first content recall model and the second content recall model.
Optionally, the first loss function and the second loss function each correspond to a weight; when fusing the loss functions, the target loss function is obtained by weighted fusion of the first loss function with its corresponding first weight and the second loss function with its corresponding second weight. The first weight and the second weight are hyperparameters, which can be set empirically in advance and adjusted during the training process.
Fourth, the first content recall model and the second content recall model are trained based on the target loss function.
In some embodiments, the computer device jointly trains the first content recall model and the second content recall model with the goal of minimizing a goal loss function. In the model training, a gradient descent algorithm or a back propagation algorithm may be used to adjust the model parameters, which is not limited in this embodiment.
The following describes the determination of the loss function in the model training process in detail by using an exemplary embodiment, and the following embodiment describes an example in which at least two content recall targets include a content click and a content conversion, and a fused recall target is a content click and a conversion.
FIG. 4 is a flowchart illustrating a method for training a content recall model according to another exemplary embodiment of the present application. The embodiment is described by taking the method as an example for a computer device, and the method comprises the following steps.
Step 401, sample data is obtained, where the sample data includes sample user data and sample content data, and the sample content data corresponds to at least two content recall targets.
The step 301 may be referred to in the implementation manner of this step, and this embodiment is not described herein again.
It should be noted that, since the amount of sample data is huge, training with all of the sample data would take a long time. Therefore, in one possible implementation, the computer device selects positive and negative samples by means of NCE (Noise-Contrastive Estimation), which speeds up model training without affecting the training effect.
In one possible implementation, for the content click target, the computer device takes clicked sample content data as positive samples and samples negative examples according to sample popularity (heat); for the content conversion target, the computer device takes converted sample content data as positive samples and unconverted sample content data as negative samples.
For example, when the content conversion target is a reading duration target of an article, the computer device determines a 70% quantile of the reading duration of the article by the user as a duration threshold, thereby determining a sample article with a reading duration greater than the duration threshold as a positive sample, and determining a sample article with a reading duration less than the duration threshold as a negative sample.
In addition, when selecting positive and negative samples, in order to make the negative samples form an effective contrast with the positive samples, in one possible implementation the computer device randomly selects positive sample data from the sample data and selects negative sample data according to the sample popularity of the sample content data, where the selection proportion of negative sample data is positively correlated with the sample popularity; that is, the computer device selects more negative samples from high-popularity sample content data than from low-popularity sample content data.
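As an illustrative sketch of such popularity-weighted negative sampling (the function name, the `alpha` smoothing exponent, and the click-log layout are assumptions, not specified above):

```python
import random
from collections import Counter

def sample_negatives(click_log, num_negatives, alpha=0.75, seed=0):
    """Draw negative samples with probability proportional to item popularity.

    click_log: a list of clicked item ids, where hot items appear more often.
    alpha: smoothing exponent (an assumed word2vec-style choice); alpha=1
    makes selection exactly proportional to popularity, matching the
    positive correlation described above.
    """
    rng = random.Random(seed)
    freq = Counter(click_log)
    items = list(freq)
    weights = [freq[i] ** alpha for i in items]
    return rng.choices(items, weights=weights, k=num_negatives)
```

Under this scheme a high-popularity item contributes more negatives than a low-popularity one, which keeps popular-but-unclicked content from dominating the positives.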
Step 402, a first content recall model and a second content recall model are established, wherein the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing the at least two content recall targets.
The content recall model is composed of an Embedding layer and a Multilayer Perceptron (MLP) layer (also called a fully connected layer). The Embedding layer converts input data into vector form, and the MLP layer processes the input vectors through hidden layers and outputs the final feature vectors.
In one possible implementation, the first content recall model includes a user branch and a content branch. The user branch comprises an Embedding layer for converting user data into vectors, and an MLP layer for processing the vectors output by the Embedding layer and outputting a user feature vector; the content branch comprises an Embedding layer for converting content data into vectors, and an MLP layer for processing the vectors output by the Embedding layer and outputting at least two content feature vectors (corresponding to different content recall targets). The second content recall model likewise includes a user branch and a content branch. The user branch in the second content recall model is similar to that in the first content recall model, but the content branch in the second content recall model comprises an Embedding layer for converting content data into vectors and an MLP layer for processing the vectors output by the Embedding layer and outputting a single content feature vector (corresponding to the fused recall target).
Optionally, the Embedding layer in the content recall model converts the input data into low-dimensional dense vectors. A lookup table is initialized in the content recall model, where each row of the lookup table is the low-dimensional dense vector corresponding to one data component (such as a tag). When converting input data into a low-dimensional dense vector, the low-dimensional dense vectors corresponding to the components of the input data are looked up in the table and summed with weights to obtain the low-dimensional dense vector corresponding to the input data.
Illustratively, as shown in fig. 5, the lookup table 51 includes the low-dimensional dense vectors v_1 to v_10000 corresponding to 10000 user tags. When the user data 52 indicates that the user has two tags with tag IDs 308 and 4080, the computer device finds v_308 and v_4080 in the lookup table 51 and, according to the weights corresponding to the tags, determines the user vector 53 corresponding to the user data 52 as 0.7 × v_308 + 0.3 × v_4080.
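The weighted table lookup described above can be sketched as follows (a minimal numpy illustration; the helper name and the small table size are hypothetical):

```python
import numpy as np

def embed(tag_ids, tag_weights, lookup_table):
    """Weighted sum of per-tag low-dimensional dense vectors.

    lookup_table: (num_tags, dim) array whose row i is the dense vector
    for tag id i, playing the role of the lookup table in Fig. 5.
    """
    vecs = lookup_table[np.asarray(tag_ids)]              # gather the tag rows
    w = np.asarray(tag_weights, dtype=float)[:, None]     # (n_tags, 1)
    return (w * vecs).sum(axis=0)                         # weighted sum -> (dim,)
```

For two tags with weights 0.7 and 0.3 this reproduces a combination of the form 0.7 × v_a + 0.3 × v_b.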
Each neuron in the MLP layer takes the outputs of all neurons in the previous layer as input; each pair of neurons in adjacent layers has its own connection weight, and a nonlinear activation function (such as ReLU) gives the network nonlinear fitting capability. Schematically, as shown in FIG. 6, when the input vector of layer l is x_l, the input vector of layer l+1 is x_{l+1}, the connection weight matrix between layer l and layer l+1 is W_{l+1}, b_{l+1} is the bias vector of layer l+1, and f is the nonlinear activation function, then x_{l+1} = f(W_{l+1} x_l + b_{l+1}).
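The layer-by-layer propagation just described can be written as a short sketch (numpy, ReLU activation as in the text; names are illustrative):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def mlp_forward(x, layers):
    """Fully connected forward pass: x_{l+1} = f(W_{l+1} x_l + b_{l+1}).

    layers: list of (W, b) pairs, one per layer; f is ReLU here.
    """
    for W, b in layers:
        x = relu(W @ x + b)
    return x
```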
Illustratively, as shown in fig. 7, the computer device constructs a first content recall model including a user branch 71, a first content branch 72 and a second content branch 73, and constructs a second content recall model including the user branch 71 and a fused content branch 74. The user branch 71 processes and concatenates the user data through an Embedding layer to obtain a user low-dimensional dense vector, then processes the user low-dimensional dense vector through an MLP layer and outputs the user feature vector. The first content branch 72, the second content branch 73 and the fused content branch 74 process and concatenate the content data through Embedding layers to obtain content low-dimensional dense vectors, and process the content low-dimensional dense vectors through their respective MLP layers to obtain the single-target feature vectors and the fused-target feature vector.
Step 403, determining at least two single-target probabilities of the first content recall model under sample data, where the single-target probabilities refer to probabilities that the recall content satisfies a single content recall target, and different content recall targets correspond to different single-target probabilities.
In order to balance the content recall effect of the first content recall model under at least two content recall targets, the computer equipment inputs sample data into the first content recall model to obtain the single target probability output by the first content recall model.
In one possible implementation, the content recall target includes a content click and a content conversion, and the computer device inputs the sample data into the first content recall model to obtain a content click probability representing a probability that the sample content is clicked by the user and a content conversion probability representing a probability that the sample content is converted.
Regarding the specific manner of determining the probability of a single target, this step may optionally include the following sub-steps.
First, feature extraction is performed on the sample user data through the first content recall model to obtain a sample user feature vector.
In the embodiment of the application, the computer device adopts the vector inner product to represent the single-target probability of content clicking or conversion, so that the computer device needs to perform feature extraction on sample user data through a user branch in the first content recall model to obtain a sample user feature vector.
Schematically, as shown in fig. 7, the computer device processes and splices the sample user data through the Embedding layer in the user branch 71 to obtain a user low-dimensional dense vector 701, so as to process the user low-dimensional dense vector 701 through the MLP layer (Relu in the figure is an activation function of each layer), and output a sample user feature vector 711.
Second, feature extraction is performed on the sample content data through the first content recall model to obtain at least two sample content feature vectors, where different content recall targets correspond to different sample content feature vectors.
Similar to the process of obtaining the sample user feature vectors, the computer device performs feature extraction on the sample content data through different content branches in the first content recall model to obtain a plurality of sample content feature vectors, wherein the different sample content feature vectors correspond to different content recall targets.
Schematically, as shown in fig. 7, the computer device processes and splices the sample content data through the Embedding layer in the first content branch 72 (corresponding to the content click target) to obtain a content low-dimensional dense vector 702, so that the content low-dimensional dense vector 702 is processed through the MLP layer (Relu in the figure is an activation function of each layer), and a content click feature vector 721 is output; the computer device processes and splices the sample content data through the Embedding layer in the second content branch 73 (corresponding to the content conversion target) to obtain a content low-dimensional dense vector 702, so that the content low-dimensional dense vector 702 is processed through the MLP layer (Relu in the figure is an activation function of each layer), and a content conversion feature vector 731 is output.
Third, the single-target probability corresponding to each of the at least two content recall targets is determined based on the result of the inner product operation between the sample user feature vector and each sample content feature vector.
Further, the computer device performs an inner product operation between the sample user feature vector and each sample content feature vector, and determines the single-target probability corresponding to each of the at least two content recall targets according to the inner product result, where the inner product result is positively correlated with the single-target probability.
Taking the content click and content conversion example, the single target probability (content click probability) corresponding to the content click target can be expressed as:
P_ctr(θ_ctr) = P(y = 1 | x_u, v_c; θ_ctr) = σ(x_u · v_c)

where θ_ctr is the network parameter corresponding to the content click branch, y = 1 indicates that the content is clicked, x_u is the sample user feature vector, v_c is the content click feature vector, and σ is the sigmoid function.
The probability of a single target (content conversion probability) corresponding to a content conversion target can be expressed as:
P_cvr(θ_cvr) = P(z = 1 | x_u, v_d; θ_cvr) = σ(x_u · v_d)

where θ_cvr is the network parameter corresponding to the content conversion branch, z = 1 indicates that the content is converted, x_u is the sample user feature vector, v_d is the content conversion feature vector, and σ is the sigmoid function.
Illustratively, as shown in fig. 7, the computer device performs an inner product operation on the sample user feature vector 711 and the content click feature vector 721 to determine a content click probability 751; the inner product operation is performed on the sample user feature vector 711 and the content conversion feature vector 731 to determine a content conversion probability 761.
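A minimal sketch of the inner-product scoring in the sub-steps above; using a sigmoid over the inner product is an assumption consistent with the positive correlation stated in the text and with the cross-entropy losses used later:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def single_target_prob(user_vec, content_vec):
    """Single-target probability from a user / content feature vector pair.

    A larger inner product gives a larger probability, matching the
    positive correlation described above.
    """
    return sigmoid(np.dot(user_vec, content_vec))
```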
A first fusion probability is determined based on the at least two single-target probabilities, the first fusion probability being a probability that the recalled content meets the at least two content recall targets simultaneously, step 404.
In order to model the relationships between different content recall targets in the process of training the first content recall model, after determining the single-target probabilities, the computer device further needs to determine the fusion probability that multiple content recall targets are met simultaneously.
In one possible implementation, when the content recall targets are content click and content conversion respectively, since the content Click-Through Rate (CTR) is defined as click count / exposure count and the content Conversion Rate (CVR) is defined as conversion count / click count, the computer device determines the product of the content click probability and the content conversion probability as the first fusion probability, i.e., the content Click & Conversion Rate (CTCVR).
In connection with the example in the above embodiment, the first fusion probability (first content click conversion probability) may be expressed as:
P_ctcvr(θ_ctr, θ_cvr) = P(y = 1, z = 1 | x_u) = P(y = 1 | x_u) · P(z = 1 | y = 1, x_u)

namely,

P_ctcvr(θ_ctr, θ_cvr) = P_ctr(θ_ctr) P_cvr(θ_cvr)
Illustratively, as shown in FIG. 7, the computer device multiplies the content click probability 751 and the content conversion probability 761 to obtain a first content click conversion probability 771.
Step 405, a first loss function is determined based on the first fusion probability and the single-target probability.
In one possible implementation, the computer device determines a single-target loss according to a single-target label corresponding to the single-target probability and the sample content data; and determining fusion target loss according to the first fusion probability and the multi-target label corresponding to the sample content data, and further determining the single target loss and the fusion target loss as a first loss function, namely a total loss function of the first content recall model.
In one possible implementation, the computer device determines the content click loss according to the deviation between the content click label corresponding to the sample content data (i.e., a label indicating whether the sample content data is clicked; if clicked, the label is 1) and the content click probability (CTR); and determines the content click conversion loss according to the deviation between the content click conversion label (i.e., a label indicating whether the sample content data is clicked and converted; if clicked and converted, the label is 1, and if not clicked or not converted, the label is 0) and the content click conversion probability (CTCVR).
Here, the content click loss (a binary cross-entropy loss) can be expressed as:
L1(θ_ctr) = −y log P_ctr(θ_ctr) − (1 − y) log(1 − P_ctr(θ_ctr))
The content click conversion loss (a binary cross-entropy loss) can be expressed as:
L2(θ_ctr, θ_cvr) = −z log(P_ctr(θ_ctr) P_cvr(θ_cvr)) − (1 − z) log(1 − P_ctr(θ_ctr) P_cvr(θ_cvr))
Accordingly, the first loss function is L1(θ_ctr) + L2(θ_ctr, θ_cvr).
Illustratively, as shown in fig. 7, the computer device determines a content click loss according to a content click probability 751 and a content click label 752, determines a content click conversion loss according to a content click conversion probability 771 and a content click conversion label 752, and finally obtains a first loss function.
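The first loss function L1 + L2 can be sketched as follows (a scalar numpy version; the clipping for numerical stability is an added implementation detail the text does not mention):

```python
import numpy as np

def bce(p, label, eps=1e-7):
    """Binary cross-entropy between a predicted probability and a 0/1 label."""
    p = np.clip(p, eps, 1.0 - eps)
    return -(label * np.log(p) + (1 - label) * np.log(1 - p))

def first_loss(p_ctr, p_cvr, y_click, z_click_convert):
    """First loss function: L1 (click) + L2 (click conversion), where the
    CTCVR probability is formed as the product p_ctr * p_cvr, as in step 404."""
    l1 = bce(p_ctr, y_click)
    l2 = bce(p_ctr * p_cvr, z_click_convert)
    return l1 + l2
```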
Step 406, determining a second fusion probability of the second content recall model under the sample data, wherein the second fusion probability refers to the probability that the recall content meets the fusion recall target.
Unlike the first content recall model, which obtains the fusion probability only by fusing at least two single-target probabilities, the recall target of the second content recall model is the fused recall target itself, so the fusion probability can be obtained directly by inputting the sample data into the second content recall model.
In one possible embodiment, this step may comprise the following sub-steps.
First, feature extraction is performed on the sample user data through the second content recall model to obtain a sample user feature vector.
Similar to the user feature vectorization process performed by using the first content recall model, the computer device performs word embedding and full connection processing on the sample user data through the user branch in the second content recall model to obtain a sample user feature vector.
Schematically, as shown in fig. 7, the computer device processes and splices the sample user data through the Embedding layer in the user branch 71 to obtain a user low-dimensional dense vector 701, so as to process the user low-dimensional dense vector 701 through the MLP layer (Relu in the figure is an activation function of each layer), and output a sample user feature vector 711.
Second, feature extraction is performed on the sample content data through the second content recall model to obtain a sample fused feature vector.
Similar to the content data feature vectorization process under the single target, the computer device inputs the sample content data into the fused content branch of the second content recall model to obtain a sample fused feature vector output by the fused content branch.
Schematically, as shown in fig. 7, the computer device processes and splices the sample content data through an Embedding layer in the fused content branch 74 (corresponding to the content click conversion target) to obtain a content low-dimensional dense vector 702, so that the content low-dimensional dense vector 702 is processed through an MLP layer (Relu in the figure is an activation function of each layer), and a sample fused feature vector 741 is output.
Third, the second fusion probability is determined based on the result of the inner product operation between the sample user feature vector and the sample fused feature vector.
Similar to the process of determining the single-target probabilities, the computer device performs an inner product operation between the sample user feature vector and the sample fused feature vector, and determines the second fusion probability corresponding to the fused recall target according to the inner product result, where the inner product result is positively correlated with the second fusion probability.
Taking the fusion recall target as an example of content click conversion, a second fusion probability (content click conversion probability) corresponding to the content click fusion target may be expressed as:
P_fusion(θ_fusion) = P(z = 1 | x_u, v_f; θ_fusion) = σ(x_u · v_f)

where θ_fusion is the network parameter corresponding to the fused content branch, z = 1 indicates that the content is clicked and converted, x_u is the sample user feature vector, v_f is the sample fused feature vector, and σ is the sigmoid function.
Schematically, as shown in fig. 7, the computer device performs an inner product operation on the sample user feature vector 711 and the sample fusion feature vector 741 to obtain a second content click conversion probability 781.
Step 407, determining a second loss function based on the difference in the probability distribution of the first fusion probability and the second fusion probability.
Unlike the first content recall model, whose loss function is determined with sample labels as supervision, in this embodiment the first content recall model serves as the teacher model of the second content recall model, so the loss function must be determined with the output result of the first content recall model as supervision, such that during knowledge distillation the output result of the second content recall model tends toward the output result of the first content recall model.
To make the second fusion probability output by the second content recall model tend toward the first fusion probability output by the first content recall model, that is, to make the difference between their probability distributions as small as possible, the computer device determines the second loss function based on the probability distribution difference between the first fusion probability and the second fusion probability.
In one possible implementation, the computer device measures the probability distribution difference using the KL divergence (Kullback-Leibler divergence); accordingly, the second loss function can be expressed as:
L3(θ_fusion) = KL(P_ctcvr ‖ P_fusion) = P_ctcvr log(P_ctcvr / P_fusion) + (1 − P_ctcvr) log((1 − P_ctcvr) / (1 − P_fusion))
of course, the computer device may also measure the probability distribution difference in other manners, which is not limited in this embodiment.
Illustratively, as shown in FIG. 7, the computer device calculates a KL divergence of a probability distribution between a first content click transition probability 771 and a second content click transition probability 781 to determine a second loss function.
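A sketch of this second (distillation) loss; treating the two click-conversion probabilities as Bernoulli distributions is an assumption about the exact KL form, which the text leaves open:

```python
import numpy as np

def bernoulli_kl(p_teacher, q_student, eps=1e-7):
    """KL divergence between two Bernoulli distributions.

    p_teacher: first fusion probability (CTCVR) from the teacher model.
    q_student: second fusion probability from the student model.
    """
    p = np.clip(p_teacher, eps, 1.0 - eps)
    q = np.clip(q_student, eps, 1.0 - eps)
    return p * np.log(p / q) + (1 - p) * np.log((1 - p) / (1 - q))
```

The loss is zero when the student matches the teacher exactly and grows as the two probabilities diverge.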
And step 408, fusing the first loss function and the second loss function to obtain a target loss function.
Further, the computer device fuses the first loss function of the first content recall model and the second loss function of the second content recall model to obtain the target loss function. Continuing the example in the above steps, the target loss function can be expressed as:
L(θ_ctr, θ_cvr, θ_fusion) = w1 · L1(θ_ctr) + w2 · L2(θ_ctr, θ_cvr) + w3 · L3(θ_fusion)

where w1, w2 and w3 are the weights of the content click loss, the content click conversion loss and the fusion loss, respectively.
Step 409, training the first content recall model and the second content recall model based on the objective loss function.
The implementation manner of this step can refer to the above embodiments, and this embodiment is not described herein again.
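Putting the pieces together, the target loss on a single sample might be computed as below (a self-contained numpy sketch; the vector names v_c, v_d and v_f follow the equations above, the sigmoid scoring and the default unit weights are illustrative choices):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def target_loss(params, x_u, y, z, w=(1.0, 1.0, 1.0), eps=1e-7):
    """L = w1*L1 + w2*L2 + w3*L3 for one (user, content) sample.

    params holds the content-side feature vectors: v_c (click branch),
    v_d (conversion branch) and v_f (fused student branch).
    """
    p_ctr = np.clip(sigmoid(x_u @ params["v_c"]), eps, 1 - eps)
    p_cvr = np.clip(sigmoid(x_u @ params["v_d"]), eps, 1 - eps)
    p_ctcvr = np.clip(p_ctr * p_cvr, eps, 1 - eps)
    p_fuse = np.clip(sigmoid(x_u @ params["v_f"]), eps, 1 - eps)
    l1 = -(y * np.log(p_ctr) + (1 - y) * np.log(1 - p_ctr))       # click loss
    l2 = -(z * np.log(p_ctcvr) + (1 - z) * np.log(1 - p_ctcvr))   # click-conversion loss
    l3 = (p_ctcvr * np.log(p_ctcvr / p_fuse)
          + (1 - p_ctcvr) * np.log((1 - p_ctcvr) / (1 - p_fuse)))  # distillation KL
    return w[0] * l1 + w[1] * l2 + w[2] * l3
```

Minimizing this objective with gradient descent jointly updates the teacher branches and the student branch, as in step 409.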
In this embodiment, by using the knowledge distillation technique, the teacher model (i.e., the first content recall model) focuses on modeling and training the multiple targets, while the student model (i.e., the second content recall model) focuses on fusing the multiple targets and on transfer learning, which improves the quality of multi-target content recall. In addition, the teacher model in this scheme can independently model more targets and can effectively model the dependency and transformation relationships among different targets, thereby avoiding the problem of splitting targets apart (which would lead to a high false positive rate of the model) and further improving content recall quality.
It should be noted that the embodiment shown in fig. 7 is described taking as an example a first content recall model composed of several single-target recall models (different single-target recall models corresponding to different content recall targets). In other possible implementations, the first content recall model may also be a multi-target recall model with shared bottom-layer parameters, which reduces the parameter count of the overall model and lowers the risk of overfitting. For example, the first content recall model may be a Share Bottom model, a Multi-gate Mixture-of-Experts (MMoE) model (Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts), and the like, which is not limited in this embodiment.
Illustratively, on the basis of fig. 7, as shown in fig. 8, the user branch 71 includes a user-side neural network 703 for outputting a user feature vector, and the first content branch 72, the second content branch 73, and the fused content branch 74 share a content-side neural network 704.
The flow of online content recall after model training is described below using an illustrative embodiment.
Fig. 9 is a flowchart illustrating a content recall method according to an exemplary embodiment of the present application. The method comprises the following steps.
Step 901, receiving a content recall request, where the content recall request includes user data.
In a possible implementation manner, when a recall request sent by a terminal is received, the computer device obtains the corresponding user data from a user database according to the user identifier contained in the request, where the user data characterizes the user across multiple dimensions.
And 902, performing feature extraction on the user data through the second content recall model to obtain a user feature vector corresponding to the user data.
The second content recall model is obtained by training in a knowledge distillation mode based on the output result of the first content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets. For the specific training process of the first content recall model and the second content recall model, reference may be made to the above-mentioned model training method, which is not described herein again in this embodiment.
In a possible implementation manner, the computer device vectorizes the user data through an Embedding layer of a user branch in the second content recall model to obtain a user vector, and performs feature extraction on the user vector through an MLP layer to obtain a user feature vector.
Schematically, as shown in fig. 10, the computer device performs vectorization processing on user data through an Embedding layer of the user branch 1010 to obtain a user vector 1011 (low-dimensional dense vector), and further performs feature extraction on the user vector 1011 through an MLP layer to obtain a user feature vector 1012.
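As a rough sketch of the Embedding-then-MLP user branch described above (the embedding table contents, layer sizes, and tanh activation are illustrative assumptions, not the patent's configuration): categorical user feature ids are looked up in an embedding table, concatenated into a low-dimensional dense vector, and passed through an MLP to obtain the user feature vector.

```python
import math

# Toy embedding table mapping a categorical user feature id to a dense vector.
# Sizes and values are illustrative assumptions.
EMBEDDING_TABLE = {
    0: [0.2, -0.1, 0.4],   # e.g. age bucket 0
    1: [-0.3, 0.5, 0.1],   # e.g. gender bucket 1
}

def embed(feature_ids):
    """Embedding layer: concatenate the looked-up dense vectors."""
    out = []
    for fid in feature_ids:
        out.extend(EMBEDDING_TABLE[fid])
    return out

def mlp(vec, layers):
    """Minimal MLP: alternating dense + tanh layers."""
    for weights, bias in layers:
        vec = [math.tanh(sum(w * x for w, x in zip(row, vec)) + b)
               for row, b in zip(weights, bias)]
    return vec

user_vector = embed([0, 1])                       # low-dimensional dense vector
layers = [([[0.1] * 6, [-0.1] * 6], [0.0, 0.0])]  # one 6 -> 2 dense layer
user_feature_vector = mlp(user_vector, layers)
```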
Step 903, determining the recalled target content data based on the user feature vector and the fusion feature vector corresponding to the content data, wherein the fusion feature vector is obtained by performing feature extraction on the content data through the second content recall model.
In a possible implementation manner, after training of the second content recall model is complete, the computer device performs feature extraction on the content data through the second content recall model to obtain and store the fusion feature vector corresponding to each piece of content data. When content recall is subsequently performed, the computer device computes the inner product of the user feature vector with each fusion feature vector to obtain the recall score of each piece of content data (the recall score is positively correlated with the inner product result), and recalls target content data based on the recall scores, where the target content data is the top-k content data ranked by recall score.
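The inner-product scoring and top-k selection just described can be sketched as follows (the item ids and precomputed fusion feature vectors are hypothetical):

```python
def recall_top_k(user_vec, content_vecs, k):
    """Score each content item by the inner product of its fused feature
    vector with the user feature vector, then return the top-k item ids."""
    scores = {
        item_id: sum(u * c for u, c in zip(user_vec, vec))
        for item_id, vec in content_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical precomputed fusion feature vectors for three content items.
content_vecs = {
    "article_a": [0.9, 0.1],
    "article_b": [0.2, 0.8],
    "article_c": [-0.5, 0.4],
}
top = recall_top_k([1.0, 0.5], content_vecs, k=2)  # -> ["article_a", "article_b"]
```

At production scale this brute-force loop is replaced by the vector index described next, but the score being computed is the same inner product.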
In a possible implementation manner, in order to improve the online content recall speed, the computer device performs feature extraction on the content data in the content database (including historical content data and newly-added content data) through the second content recall model, and after obtaining the fusion feature vector corresponding to each piece of content data, establishes a vector index for the fusion feature vector and stores it into a vector recall index library. Subsequent content recall is then performed from the vector recall index library, which may be implemented with Faiss, for example.
Illustratively, as shown in fig. 10, the computer device performs vectorization processing on the content data through an Embedding layer of the fused feature branch 1020 to obtain a content vector 1021 (low-dimensional dense vector), and further performs feature extraction on the content vector 1021 through an MLP layer to obtain a fused feature vector 1022. Further, the computer device stores the vector index corresponding to the fused feature vector 1022 in the Faiss vector recall index repository 1030.
Accordingly, when the computer device recalls the target content data, the target vector index is determined from the vector recall index library based on the user characteristic vector, and the target content data corresponding to the target vector index is recalled.
In a possible implementation manner, the computer device first screens candidate vector indexes from the vector recall index library through approximate nearest neighbor search based on the user feature vector, then determines the target vector indexes from the candidates by computing inner products of the user feature vector and the corresponding fusion feature vectors, and recalls the target content data corresponding to the target vector indexes.
Illustratively, as shown in fig. 10, the computer device recalls content from the Faiss vector recall index library 1030 based on the user feature vector, and finally obtains a TopK recall list 1040, where the TopK recall list 1040 includes the top K contents with the highest click conversion probability.
In this embodiment, when the second content recall model obtained by training is used to perform online content recall, only the single fusion feature vector needs to be indexed and stored, so that the occupation of storage resources is reduced, and only the inner product of the single user feature vector and the single fusion feature vector needs to be calculated, which is beneficial to improving the content recall speed.
In an actual application scenario, the above scheme is applied to article recommendation; the average reading time of recommended articles before and after applying the scheme is shown in fig. 11. As the figure shows, after the scheme of the application is applied, the average reading time of recommended articles increases by 10%, indicating that the recommended articles better match user interests.
Fig. 12 is a block diagram illustrating a training apparatus for a content recall model according to an exemplary embodiment of the present application, where the apparatus includes:
a sample obtaining module 1201, configured to obtain sample data, where the sample data includes sample user data and sample content data, and the sample content data corresponds to at least two content recall targets;
a model building module 1202, configured to build a first content recall model and a second content recall model, where the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
a model training module 1203, configured to train the first content recall model and the second content recall model based on the sample data, where the second content recall model is obtained by training in a knowledge distillation manner based on an output result of the first content recall model.
Optionally, the model training module 1203 includes:
a first loss determining unit, configured to determine a first loss function of the first content recall model under the sample data;
a second loss determining unit, configured to determine, based on an output result of the first content recall model, a second loss function of the second content recall model under the sample data;
a loss fusion unit, configured to fuse the first loss function and the second loss function to obtain a target loss function;
a training unit to train the first content recall model and the second content recall model based on the target loss function.
Optionally, the first loss determining unit is configured to:
determining at least two single target probabilities of the first content recall model under the sample data, wherein the single target probabilities refer to probabilities that recall content meets a single content recall target, and different content recall targets correspond to different single target probabilities;
determining a first fusion probability based on at least two of the single-target probabilities, the first fusion probability referring to a probability that recalled content satisfies at least two of the content recall targets at the same time;
determining the first loss function based on the first fusion probability and the single-objective probability.
Optionally, the second loss determining unit is configured to:
determining a second fusion probability of the second content recall model under the sample data, wherein the second fusion probability refers to the probability that recall content meets the fusion recall target;
determining the second loss function based on a difference in probability distribution of the first fusion probability and the second fusion probability.
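The patent does not fix which divergence measures the probability-distribution difference between the teacher's first fusion probability and the student's second fusion probability; a KL divergence between the two Bernoulli distributions is one common choice, sketched here under that assumption:

```python
import math

def binary_kl(p_teacher, p_student, eps=1e-12):
    """KL divergence between two Bernoulli distributions, used here as the
    distillation (second) loss: it is zero when the student's fusion
    probability matches the teacher's, and grows as they diverge.
    (KL is an assumption; the patent only requires a distribution difference.)"""
    p = min(max(p_teacher, eps), 1 - eps)
    q = min(max(p_student, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))

loss_equal = binary_kl(0.3, 0.3)  # identical distributions -> zero loss
loss_far = binary_kl(0.3, 0.9)    # student far from teacher -> positive loss
```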
Optionally, the first loss determining unit is specifically configured to:
performing feature extraction on the sample user data through the first content recall model to obtain a sample user feature vector;
performing feature extraction on the sample content data through the first content recall model to obtain at least two sample content feature vectors, wherein different content recall targets correspond to different sample content feature vectors;
and determining the single target probability corresponding to each of at least two content recall targets based on the inner product operation result of the sample user feature vector and each sample content feature vector.
Optionally, the second loss determining unit is specifically configured to:
performing feature extraction on the sample user data through the second content recall model to obtain a sample user feature vector;
performing feature extraction on the sample content data through the second content recall model to obtain a sample fusion feature vector;
and determining the second fusion probability based on the inner product operation result of the sample user feature vector and the sample fusion feature vector.
Optionally, the loss fusion unit is configured to:
and weighting and fusing to obtain the target loss function based on the first loss function, the first weight corresponding to the first loss function, the second loss function and the second weight corresponding to the second loss function.
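The weighted fusion described above amounts to a weighted sum of the two losses; a minimal sketch follows (the 0.7/0.3 weights are arbitrary example hyperparameters, not values from the patent):

```python
def target_loss(first_loss, second_loss, w1, w2):
    """Weighted fusion of the teacher's multi-target loss (first loss) and
    the distillation loss (second loss) into the target loss function."""
    return w1 * first_loss + w2 * second_loss

# Example: weight the teacher loss more heavily than the distillation loss.
loss = target_loss(0.8, 0.2, w1=0.7, w2=0.3)  # 0.7*0.8 + 0.3*0.2 = 0.62
```

Tuning `w1`/`w2` trades off how strongly the student tracks the teacher versus how strongly the teacher itself is optimized.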
Optionally, the apparatus further comprises:
a positive sample selecting module for randomly selecting positive sample data from the sample data;
and the negative sample selecting module is used for selecting negative sample data according to the sample heat of the sample content data in the sample data, and the selecting proportion of the negative sample data is in positive correlation with the sample heat.
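The popularity-biased negative sampling can be sketched as below. Making the sampling probability exactly proportional to sample heat is an assumption for illustration; the embodiment only requires a positive correlation between selection proportion and heat.

```python
import random

def sample_negatives(item_popularity, n, seed=0):
    """Select negative samples with probability proportional to sample heat
    (popularity), so popular items the user did not interact with are
    sampled as negatives more often than cold items."""
    rng = random.Random(seed)
    items = list(item_popularity)
    weights = [item_popularity[i] for i in items]
    return rng.choices(items, weights=weights, k=n)

# Hypothetical heat scores for three items.
popularity = {"hot_item": 100, "warm_item": 10, "cold_item": 1}
negatives = sample_negatives(popularity, n=1000)
```

Biasing negatives toward popular items counteracts the popularity bias in the positives, so the model cannot simply learn "popular = positive".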
Optionally, at least two of the content recall targets include a content click and a content conversion, and the fused recall target is the content click and the content conversion.
Optionally, the first content recall model is a multi-target recall model shared by bottom-layer parameters, or the first content recall model is composed of a plurality of single-target recall models, and different single-target recall models correspond to different content recall targets.
In summary, the embodiment of the present application uses knowledge distillation to distill multiple content recall targets into a single content recall target. In the training stage, a first content recall model corresponding to at least two content recall targets and a second content recall model corresponding to a fused recall target are constructed and trained with sample data, so that the second content recall model learns the knowledge of the first content recall model through distillation and thereby acquires a content recall capability similar to that of the first content recall model. Because the second content recall model is essentially a single-target recall model, subsequently using it for multi-target content recall improves recall efficiency while preserving recall accuracy.
Fig. 13 is a block diagram illustrating a content recall apparatus according to an exemplary embodiment of the present application, where the apparatus includes:
a request receiving module 1301, configured to receive a content recall request, where the content recall request includes user data;
a user feature extraction module 1302, configured to perform feature extraction on the user data through a second content recall model to obtain a user feature vector corresponding to the user data, where the second content recall model is obtained by training in a knowledge distillation manner based on an output result of a first content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and a recall module 1303, configured to determine recalled target content data based on the user feature vector and a fusion feature vector corresponding to the content data, where the fusion feature vector is obtained by performing feature extraction on the content data by the second content recall model.
Optionally, the apparatus further comprises:
the content feature extraction module is used for performing feature extraction on the content data in a content database through the second content recall model to obtain the fusion feature vector corresponding to the content data;
the index storage module is used for establishing a vector index for the fusion characteristic vector and storing the vector index to a vector recall index library;
the recall module 1303 is configured to:
and determining a target vector index from the vector recall index library based on the user characteristic vector, and recalling the target content data corresponding to the target vector index.
It should be noted that: the device provided in the above embodiment is only illustrated by dividing the functional modules, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the apparatus and method embodiments provided by the above embodiments belong to the same concept, and details of the implementation process are referred to as method embodiments, which are not described herein again.
Referring to fig. 14, a schematic structural diagram of a computer device according to an exemplary embodiment of the present application is shown. The computer device 1400 includes a Central Processing Unit (CPU) 1401, a system memory 1404 including a random access memory 1402 and a read only memory 1403, and a system bus 1405 connecting the system memory 1404 and the central processing unit 1401. The computer device 1400 also includes a basic Input/Output (I/O) system 1406 that facilitates transfer of information between devices within the computer, and a mass storage device 1407 for storing an operating system 1413, application programs 1414, and other program modules 1415.
The basic input/output system 1406 includes a display 1408 for displaying information and an input device 1409, such as a mouse, keyboard, etc., for user input of information. Wherein the display 1408 and input device 1409 are both connected to the central processing unit 1401 via an input-output controller 1410 connected to the system bus 1405. The basic input/output system 1406 may also include an input/output controller 1410 for receiving and processing input from a number of other devices, such as a keyboard, mouse, or electronic stylus. Similarly, input-output controller 1410 also provides output to a display screen, a printer, or other type of output device.
The mass storage device 1407 is connected to the central processing unit 1401 through a mass storage controller (not shown) connected to the system bus 1405. The mass storage device 1407 and its associated computer-readable media provide non-volatile storage for the computer device 1400. That is, the mass storage device 1407 may include a computer readable medium (not shown) such as a hard disk or drive.
Without loss of generality, the computer-readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes Random Access Memory (RAM), Read Only Memory (ROM), flash Memory or other solid state Memory technology, Compact disk Read-Only Memory (CD-ROM), Digital Versatile Disks (DVD), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices. Of course, those skilled in the art will appreciate that the computer storage media is not limited to the foregoing. The system memory 1404 and mass storage device 1407 described above may collectively be referred to as memory.
The memory stores one or more programs configured to be executed by the one or more central processing units 1401, the one or more programs containing instructions for implementing the methods described above, and the central processing unit 1401 executes the one or more programs to implement the methods provided by the various method embodiments described above.
According to various embodiments of the present application, the computer device 1400 may also be connected, through a network such as the Internet, to a remote computer on the network for operation. That is, the computer device 1400 may be connected to the network 1412 through the network interface unit 1411 connected to the system bus 1405, or may be connected to other types of networks or remote computer systems (not shown) using the network interface unit 1411.
The memory also includes one or more programs, stored in the memory, that include instructions for performing the steps performed by the computer device in the methods provided by the embodiments of the present application.
An embodiment of the present application further provides a computer-readable storage medium, where at least one instruction, at least one program, a code set, or a set of instructions is stored in the computer-readable storage medium, and the at least one instruction, the at least one program, the code set, or the set of instructions is loaded and executed by a processor to implement the method for training a content recall model according to any one of the above embodiments, or to implement the method for content recall according to any one of the above embodiments.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the training method of the content recall model described in the above embodiment, or execute the content recall method described in the above embodiment.
Those skilled in the art will appreciate that all or part of the steps in the methods of the above embodiments may be implemented by a program instructing relevant hardware. The program may be stored in a computer-readable storage medium, which may be the computer-readable storage medium contained in the memory of the above embodiments, or a separate computer-readable storage medium not assembled into the terminal. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or a set of instructions that is loaded and executed by a processor to implement the method of any of the above method embodiments.
Optionally, the computer-readable storage medium may include: ROM, RAM, Solid State Drives (SSD), or optical disks, etc. The RAM may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM), among others. The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is intended to be exemplary only, and not to limit the present application, and any modifications, equivalents, improvements, etc. made within the spirit and scope of the present application are intended to be included therein.

Claims (15)

1. A method for training a content recall model, the method comprising:
acquiring sample data, wherein the sample data comprises sample user data and sample content data, and the sample content data corresponds to at least two content recall targets;
constructing a first content recall model and a second content recall model, wherein the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
training the first content recall model and the second content recall model based on the sample data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on an output result of the first content recall model.
2. The method of claim 1, wherein said training the first content recall model and the second content recall model based on the sample data comprises:
determining a first loss function of the first content recall model under the sample data;
determining a second loss function of the second content recall model under the sample data based on an output result of the first content recall model;
fusing the first loss function and the second loss function to obtain a target loss function;
training the first content recall model and the second content recall model based on the objective loss function.
3. The method of claim 2, wherein said determining a first loss function of the first content recall model under the sample data comprises:
determining at least two single target probabilities of the first content recall model under the sample data, wherein the single target probabilities refer to probabilities that recall content meets a single content recall target, and different content recall targets correspond to different single target probabilities;
determining a first fusion probability based on at least two of the single-target probabilities, the first fusion probability referring to a probability that recalled content satisfies at least two of the content recall targets at the same time;
determining the first loss function based on the first fusion probability and the single-objective probability.
4. The method of claim 3, wherein determining a second loss function of the second content recall model under the sample data based on output results of the first content recall model comprises:
determining a second fusion probability of the second content recall model under the sample data, wherein the second fusion probability refers to the probability that recall content meets the fusion recall target;
determining the second loss function based on a difference in probability distribution of the first fusion probability and the second fusion probability.
5. The method of claim 3, wherein the determining at least two single objective probabilities of the first content recall model under the sample data comprises:
performing feature extraction on the sample user data through the first content recall model to obtain a sample user feature vector;
performing feature extraction on the sample content data through the first content recall model to obtain at least two sample content feature vectors, wherein different content recall targets correspond to different sample content feature vectors;
and determining the single target probability corresponding to each of at least two content recall targets based on the inner product operation result of the sample user feature vector and each sample content feature vector.
6. The method of claim 4, wherein said determining a second fusion probability of the second content recall model under the sample data comprises:
performing feature extraction on the sample user data through the second content recall model to obtain a sample user feature vector;
performing feature extraction on the sample content data through the second content recall model to obtain a sample fusion feature vector;
and determining the second fusion probability based on the inner product operation result of the sample user feature vector and the sample fusion feature vector.
7. The method of claim 2, wherein said fusing said first loss function and said second loss function to obtain a target loss function comprises:
and weighting and fusing to obtain the target loss function based on the first loss function, the first weight corresponding to the first loss function, the second loss function and the second weight corresponding to the second loss function.
8. The method according to any one of claims 1 to 7, wherein after said obtaining sample data, said method further comprises:
randomly selecting positive sample data from the sample data;
and selecting negative sample data according to the sample heat of the sample content data in the sample data, wherein the selection proportion of the negative sample data is in positive correlation with the sample heat.
9. The method of any of claims 1 to 7, wherein at least two of the content recall targets include a content click and a content conversion, and wherein the fused recall target is a content click and conversion.
10. A method for recalling content, the method comprising:
receiving a content recall request, wherein the content recall request comprises user data;
performing feature extraction on the user data through a second content recall model to obtain a user feature vector corresponding to the user data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on an output result of a first content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and determining recalled target content data based on the user feature vector and a fusion feature vector corresponding to the content data, wherein the fusion feature vector is obtained by performing feature extraction on the content data by the second content recall model.
11. The method of claim 10, further comprising:
performing feature extraction on the content data in a content database through the second content recall model to obtain the fusion feature vector corresponding to the content data;
establishing a vector index for the fusion feature vector, and storing the vector index to a vector recall index library;
the determining recalled target content data based on the user feature vector and the fusion feature vector corresponding to the content data includes:
and determining a target vector index from the vector recall index library based on the user characteristic vector, and recalling the target content data corresponding to the target vector index.
12. An apparatus for training a content recall model, the apparatus comprising:
the system comprises a sample acquisition module, a content recall module and a content recall module, wherein the sample acquisition module is used for acquiring sample data, the sample data comprises sample user data and sample content data, and the sample content data corresponds to at least two content recall targets;
the model construction module is used for constructing a first content recall model and a second content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and the model training module is used for training the first content recall model and the second content recall model based on the sample data, wherein the second content recall model is obtained by training in a knowledge distillation mode based on the output result of the first content recall model.
13. A content recall apparatus, the apparatus comprising:
the device comprises a request receiving module, a content recall module and a content recall module, wherein the request receiving module is used for receiving a content recall request which comprises user data;
the user feature extraction module is used for performing feature extraction on the user data through a second content recall model to obtain a user feature vector corresponding to the user data, the second content recall model is obtained by training in a knowledge distillation mode based on an output result of a first content recall model, the first content recall model corresponds to at least two content recall targets, the second content recall model corresponds to a fusion recall target, and the fusion recall target is obtained by fusing at least two content recall targets;
and the recall module is used for determining recalled target content data based on the user feature vector and a fusion feature vector corresponding to the content data, wherein the fusion feature vector is obtained by performing feature extraction on the content data by the second content recall model.
14. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction that is loaded and executed by the processor to implement a method of training a content recall model according to any of claims 1 to 9 or to implement a method of content recall according to any of claims 10 to 11.
15. A computer-readable storage medium having stored therein at least one instruction which is loaded and executed by a processor to implement a method for training a content recall model according to any one of claims 1 to 9 or a method for content recall according to any one of claims 10 to 11.
CN202110003674.7A 2021-01-04 2021-01-04 Training method of content recall model, content recall method, device and equipment Pending CN113515690A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110003674.7A CN113515690A (en) 2021-01-04 2021-01-04 Training method of content recall model, content recall method, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110003674.7A CN113515690A (en) 2021-01-04 2021-01-04 Training method of content recall model, content recall method, device and equipment

Publications (1)

Publication Number Publication Date
CN113515690A true CN113515690A (en) 2021-10-19

Family

ID=78060846

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110003674.7A Pending CN113515690A (en) 2021-01-04 2021-01-04 Training method of content recall model, content recall method, device and equipment

Country Status (1)

Country Link
CN (1) CN113515690A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114117216A (en) * 2021-11-23 2022-03-01 Guangzhou Boguan Information Technology Co., Ltd. Recommendation probability prediction method and device, computer storage medium and electronic device
CN114238766A (en) * 2021-12-20 2022-03-25 Vivo Mobile Communication Co., Ltd. Material recommendation method, material recommendation device and electronic equipment
CN114443989A (en) * 2022-01-30 2022-05-06 Beijing Baidu Netcom Science Technology Co., Ltd. Ranking method, training method and device of ranking model, electronic equipment and medium
US12174902B2 2022-01-30 2024-12-24 Beijing Baidu Netcom Science Technology Co., Ltd. Ranking of recall data
CN114625838A (en) * 2022-03-10 2022-06-14 Ping An Technology (Shenzhen) Co., Ltd. Search system optimization method and device, storage medium and computer equipment
CN114625838B (en) * 2022-03-10 2024-07-05 Ping An Technology (Shenzhen) Co., Ltd. Optimization method and device of search system, storage medium and computer equipment
CN114399058A (en) * 2022-03-25 2022-04-26 Tencent Technology (Shenzhen) Co., Ltd. Model updating method, related device, equipment and storage medium
CN115545853A (en) * 2022-12-02 2022-12-30 Yunzhu Information Technology (Chengdu) Co., Ltd. Searching method for searching suppliers
CN115545853B (en) * 2022-12-02 2023-06-23 Yunzhu Information Technology (Chengdu) Co., Ltd. Searching method for searching suppliers
CN115618121A (en) * 2022-12-20 2023-01-17 Yunshu Hulian (Wuhan) Information Technology Co., Ltd. Personalized information recommendation method, device, equipment and storage medium

Similar Documents

Publication Publication Date Title
CN111538912B (en) Content recommendation method, device, equipment and readable storage medium
US20210326674A1 (en) Content recommendation method and apparatus, device, and storage medium
CN113515690A (en) Training method of content recall model, content recall method, device and equipment
CN110321422B (en) Method for training model on line, pushing method, device and equipment
CN111966914B (en) Content recommendation method and device based on artificial intelligence and computer equipment
CN113254792B (en) Method for training recommendation probability prediction model, recommendation probability prediction method and device
CN113516522B (en) Media resource recommendation method, and training method and device of multi-target fusion model
CN113536097B (en) Recommendation method and device based on automatic feature grouping
CN114371946B (en) Information push method and information push server based on cloud computing and big data
WO2023024408A1 (en) Method for determining feature vector of user, and related device and medium
CN115423016A (en) Training method of multi-task prediction model, multi-task prediction method and device
Nazari et al. Scalable and data-independent multi-agent recommender system using social networks analysis
CN112905885B (en) Method, apparatus, device, medium and program product for recommending resources to user
CN113918810A (en) Information pushing method, device, equipment and medium based on machine learning model
KR102612805B1 (en) Method, device and system for providing media curation service based on artificial intelligence model according to company information
CN116910373B (en) House source recommendation method and device, electronic equipment and storage medium
CN113094584A (en) Method and device for determining recommended learning resources
CN114817692A (en) Method, device and equipment for determining recommended object and computer storage medium
CN117670470A (en) Training method and device for recommendation model
CN116562357A (en) Click prediction model training method and device
CN115017362A (en) Data processing method, electronic device and storage medium
CN116150465A (en) Method for determining delivery information and related equipment
CN115329183A (en) Data processing method, device, storage medium and equipment
CN114077701B (en) Method and device for determining resource information, computer equipment and storage medium
CN112000888B (en) Information pushing method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40053646; Country of ref document: HK)
SE01 Entry into force of request for substantive examination