CN114942843A - Model data processing system - Google Patents
- Publication number
- CN114942843A (application number CN202210470843.2A)
- Authority
- CN
- China
- Prior art keywords
- text
- template
- language model
- historical behavior
- template information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5061—Partitioning or combining of resources
- G06F9/5072—Grid computing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/166—Editing, e.g. inserting or deleting
- G06F40/186—Templates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
- G06F9/5055—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering software capabilities, i.e. software resources associated or available to the machine
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Evolutionary Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the present invention provides a model data processing system including a terminal device and a cloud server. The cloud server trains a language model and configures the converged language model into the terminal device. The terminal device optimizes template information according to a historical behavior text describing user behavior and the language model trained to convergence, so as to generate a recommendation model comprising the language model and the optimized template information. The terminal device only needs to locally configure one universal language model together with template information corresponding to different recommendation tasks, rather than a separate recommendation model for each recommendation service; because the template information contains few parameters, the consumption of memory resources on the terminal device is reduced. Meanwhile, the core processing performed by the terminal device is optimizing the first template information, which contains only a small number of parameters, so the consumption of computing resources on the terminal device during optimization is also reduced.
Description
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a model data processing system.
Background
Personalized recommendation is applied in many scenarios. For example, in a shopping scenario, commodities can be recommended to a user according to the user's purchasing behavior; in a chat scenario, commodities can be recommended to the user according to the user's chat content.
In the prior art, recommendation is generally realized by a recommendation model, and recommendation models correspond to scenarios one to one; that is, commodity recommendation is realized by the recommendation model corresponding to the shopping scenario, while recommendation during chat is realized by the recommendation model corresponding to the chat scenario. In practice, this requires configuring multiple recommendation models for different scenarios on the terminal device, and configuring multiple models increases the resource consumption of the terminal device.
Therefore, how to reduce the resource consumption of the terminal device while still supporting different recommendation tasks has become an urgent problem to be solved.
Disclosure of Invention
In view of this, an embodiment of the present invention provides a model data processing system for reducing resource consumption of a terminal device.
In a first aspect, an embodiment of the present invention provides a model data processing system, including: a terminal device and a cloud server;
the cloud server is used for training a language model according to a training text, and for configuring the converged language model and first template information corresponding to a first recommendation task into the terminal device;
the terminal device is used for acquiring a first historical behavior text describing user behavior and the first template information;
and for optimizing the first template information according to the first historical behavior text and the language model, to obtain a first recommendation model comprising the optimized first template information and the language model, wherein the first recommendation model corresponds to the first recommendation task.
In a second aspect, an embodiment of the present invention provides another model data processing system, including: a terminal device and a cloud server;
the cloud server is used for training a language model according to a training text, and for configuring the converged language model and template information corresponding to a recommendation task into the terminal device;
the terminal device is used for acquiring a first historical behavior text describing user behavior, a description text of a candidate recommended object corresponding to the first historical behavior text, and the template information corresponding to the recommendation task;
for optimizing the template information according to the first historical behavior text, the description text, and a classification layer and a feature extraction layer in the language model;
and for setting the classification layer and the optimized template information to share the same parameters, so as to obtain a recommendation model comprising the optimized template information, the classification layer and the feature extraction layer, wherein the recommendation model corresponds to the recommendation task.
The model data processing system provided by the embodiment of the invention comprises a terminal device and a cloud server. The cloud server trains the language model and configures the converged language model into the terminal device. The terminal device obtains a first historical behavior text describing behavior generated by the user, and first template information corresponding to a first recommendation task. The first template information is then optimized according to the first historical behavior text and the language model trained to convergence, to generate a first recommendation model that comprises the language model and the optimized first template information and is suitable for the first recommendation task.
In this process, through the cooperative processing of the terminal device and the cloud server, the terminal device is configured with one universal language model and several pieces of template information corresponding to different recommendation tasks, instead of an independent recommendation model for each recommendation task. The number of models and model parameters stored on the terminal device is thus greatly reduced, which reduces memory consumption. Moreover, compared with training a complete recommendation model, the optimized template information contains far fewer parameters, so the terminal device only needs to collect a small number of training samples and optimize the template information locally, reducing the consumption of computing resources during optimization.
Moreover, because the language model is trained in advance by the cloud server and can be used directly, the terminal device's main work in generating the first recommendation model is optimizing the first template information. Since the first template information contains few parameters, the data processing pressure on the terminal device during optimization is reduced and the generation of the recommendation model is accelerated.
In addition, owing to the universality of historical behavior texts, the terminal device can understand the user's behaviors in different scenarios; that is, the behaviors described by the first historical behavior text can all participate in the optimization of the template information and the generation of the model, which also improves the accuracy of the recommendation model.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a block diagram of a data processing system according to an embodiment of the present invention;
FIG. 2 is a block diagram of a model data processing system according to an embodiment of the present invention;
fig. 3 is a flowchart of a template information optimization method according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a recommendation model suitable for a generative recommendation task according to an embodiment of the present invention;
FIG. 5 is a flowchart of another method for optimizing template information according to an embodiment of the present invention;
fig. 6 is a flowchart of another template information optimization method according to an embodiment of the present invention;
fig. 7 is a flowchart of another template information optimization method according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a recommendation model suitable for a discriminant recommendation task according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. As used in the embodiments of the present invention and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise; "a plurality of" generally means at least two, without excluding the case of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between objects, indicating that three relationships may exist; for example, "A and/or B" may mean that A exists alone, that A and B exist simultaneously, or that B exists alone. In addition, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
Depending on the context, the word "if" as used herein may be interpreted as "when," "upon," "in response to determining," or "in response to identifying." Similarly, the phrases "if it is determined" or "if (a stated condition or event) is identified" may be interpreted as "when it is determined," "in response to determining," "when (a stated condition or event) is identified," or "in response to identifying (a stated condition or event)," depending on the context.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a commodity or system that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such commodity or system. Without further limitation, an element preceded by "comprises a" does not exclude the presence of other like elements in the commodity or system that comprises the element.
Some embodiments of the invention are described in detail below with reference to the accompanying drawings. The features of the embodiments and examples described below may be combined with each other without conflict between the embodiments. In addition, the sequence of steps in each method embodiment described below is only an example and is not strictly limited.
Before describing the embodiments provided by the present invention in detail, a description may be given of a common recommendation model training and configuration process.
As shown in fig. 1, in a data processing system, different applications (APPs) installed on the terminal devices in the system may use different recommendation models to execute different recommendation tasks, so as to provide different recommendation services for users. A recommendation task may be the one in the shopping scenario or the chat scenario mentioned in the background. In practice, a recommendation model may be trained using historical behaviors generated by the user as training samples, and one common form of historical behavior is the identification-information form. For example, the historical behavior "user A purchases commodity B" may be expressed in identification-information form as "001 purchases 002." Since identification information is not universal across scenes (in different recommendation tasks, "001" may correspond to different users and "002" to different commodities), the recommendation model cannot accurately understand the user's behaviors across different scenes.
To improve this situation, in practice it is necessary to separately collect the user's identification-information-form historical behaviors generated under each recommendation task, and then use them to train a recommendation model suitable for that task. Consequently, for an APP to execute multiple recommendation tasks, multiple recommendation models need to be trained and encapsulated in the APP. The more recommendation services there are, the more models are encapsulated in the APP, that is, the more recommendation models must be configured on the terminal device. For APPs containing many recommendation tasks, such an accumulation of models makes software refinement difficult, and because many models are configured on the terminal device, it must store a large number of model parameters, which increases memory consumption.
The training of a recommendation model is computationally intensive, so it is usually executed by the cloud server in the data processing system, and the recommendation model trained by the cloud server is configured on the terminal device when the APP is installed. In addition, to guarantee the training effect of each recommendation model, a large number of historical behaviors in identification-information form must be collected for it, which increases the difficulty of sample collection and further prolongs the training period of the recommendation model.
To improve on the above problems, the data processing system provided by the embodiment of the present invention, shown in fig. 2, may be used. The overall working idea of the data processing system is as follows: the cloud server first trains a universal language model suitable for different recommendation services, and then configures this universal language model, together with the template information corresponding to different recommendation tasks, into the terminal device. The terminal device then optimizes the template information corresponding to the different recommendation tasks by utilizing the feature extraction capability and the text generation capability of the universal language model. Finally, the terminal device holds one universal language model and several pieces of optimized template information, and can obtain recommendation models suitable for different recommendation tasks by combining the language model with the optimized template information. In practice, a recommendation task may be generative or discriminative: a recommendation model executing a generative recommendation task outputs text, while one executing a discriminative recommendation task outputs a classification result. The specific contents of the different recommendation tasks are described below.
From the above description, the data processing system shown in fig. 2 is actually a model data processing system with terminal-cloud cooperation. When the recommendation task is generative, the collaborative work of the terminal device and the cloud server in the system can be detailed as follows:
the cloud server can collect text data generated by different users in any scene, the text data is used as a training text, a general language model is trained in an unsupervised training mode, and the general language model can be configured on the terminal equipment. Optionally, the language model trained to converge and the template information corresponding to different recommended tasks may be encapsulated in the APP providing the recommendation service at the same time, and then in response to installation of the APP, the common language model and the plurality of template information may be configured on the terminal device together.
Further, the terminal device takes the collected historical behavior text of the user and the template information corresponding to any recommendation task as input to the universal language model, performs feature extraction and feature fusion on this input using the model's feature extraction capability, generates a predicted behavior text from the fusion result using the model's generation capability, and optimizes the template information on the basis of the predicted behavior text to obtain the optimized template information. Finally, the terminal device may combine the optimized template information and the language model into a recommendation model suitable for that recommendation task. The parameters of the universal language model remain unchanged while the template information is optimized. Optionally, the template information may be in text form or feature-vector form.
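The on-device flow just described (frozen universal language model, trainable template fused with history features, predicted behavior text as output) can be sketched with a toy model. This is an illustrative assumption, not the patent's implementation; the vocabulary, dimensions, and function names are invented for the sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

VOCAB = ["buy", "phone", "case", "search", "laptop"]
EMB = rng.normal(size=(len(VOCAB), 8))       # frozen word embeddings
W_FROZEN = rng.normal(size=(8, len(VOCAB)))  # frozen language-model head

# Trainable template: continuous vectors prepended to the input, standing in
# for the template information optimized on the device.
template = rng.normal(size=(2, 8))

def predict_next(history_tokens, template):
    """Fuse template vectors with history-word vectors and score the vocabulary."""
    hist = EMB[[VOCAB.index(t) for t in history_tokens]]
    fused = np.concatenate([template, hist]).mean(axis=0)  # feature fusion
    logits = fused @ W_FROZEN                              # frozen head scores next token
    return VOCAB[int(np.argmax(logits))]

print(predict_next(["buy", "phone"], template))
```

Only `template` would be updated during optimization; `EMB` and `W_FROZEN` play the role of the converged language model and stay fixed.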
It should be noted that, to protect personal privacy, the collected historical behavior text may also be filtered to remove private information, so that the terminal device obtains historical behavior text that contains no user privacy information.
In this embodiment, through the cooperative work of the terminal device and the cloud server, the terminal device is configured with one universal language model and several pieces of template information corresponding to different recommendation tasks. The number of models and model parameters configured on the terminal device is greatly reduced, which reduces memory consumption. Compared with training a complete recommendation model, the optimized template information contains fewer parameters, so the terminal device only needs to collect a small number of training samples and optimize the template information locally, reducing the consumption of computing resources during optimization. In addition, because the input of the universal language model is text, and text is universal, historical behavior text generated under any recommendation task can be understood by the language model and used to optimize the template information, which also reduces the difficulty of collecting training samples.
Taking an online shopping APP as an example, the action objects of different generative recommendation tasks may be buyers or merchants. For clarity, the common generative recommendation tasks are explained below directly in terms of the buyer or the merchant:
the search word recommending task is that the buyer recommends the search words displayed in a search box of an APP home page of online shopping. The commodity recommending task is to recommend commodities shown on a home page of the shopping APP or a search result page for the buyer according to the search terms input by the buyer. The commodity recommending task can also be that other commodities are recommended to the seller in response to the commodity shelving operation of the seller, and the commodities are used for guiding the merchant to shelf new commodities. And the commodity recommendation task based on the conversation is to recommend commodities to the buyer in the chat process according to the chat content of the buyer and the customer service of the merchant.
When executing the commodity recommendation task above, candidate recommended commodities may be obtained first and then classified, the classification result indicating whether, and with what probability, the buyer or merchant would click on each candidate commodity. In one case, whether a candidate commodity is displayed can then be determined according to this probability; in another case, the candidate commodities can be ranked by probability and the ranking result displayed. The classification process can be regarded as a subtask of the commodity recommendation task, and this subtask is discriminative.
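The two display strategies just described, probability thresholding and probability ranking, can be illustrated briefly; the candidate commodities and click probabilities here are hypothetical:

```python
# Hypothetical click probabilities from the classification subtask.
candidates = {"phone case": 0.82, "laptop stand": 0.35, "usb cable": 0.61}

THRESHOLD = 0.5
# Case 1: decide display by probability.
shown = {c: p for c, p in candidates.items() if p >= THRESHOLD}
# Case 2: rank candidates by probability for display order.
ranked = sorted(candidates, key=candidates.get, reverse=True)

print(shown)   # {'phone case': 0.82, 'usb cable': 0.61}
print(ranked)  # ['phone case', 'usb cable', 'laptop stand']
```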
Optionally, when a commodity is recommended to the buyer or the merchant, an explanatory text for the commodity may also be generated at the same time, explaining why the buyer or the merchant may be interested in the commodity. Generating this explanatory text is an explanatory-text generation task, which can also be regarded as a subtask of the commodity recommendation task; this subtask is generative.
For each generative recommendation task described above, the corresponding template information may be understood as follows:
when the recommendation task is a dialog-based goods recommendation task, the corresponding template information may be "____ is a good that the buyer may purchase". When the recommendation task is a goods recommendation task for a merchant, the corresponding template information may be "____ is a goods that the merchant may be on the shelf". When the recommendation task is a search word recommendation task, the corresponding template information may be "buyer may search ____". When the recommendation task is an explanatory text generation task, its corresponding template information may be "the buyer is likely to purchase the commodity a because ____".
In practice, the recommendation model configured on the terminal device for executing a generative recommendation task generates a behavior text for the user's next behavior, whose generation time is later than that of the first historical behavior text. This generated behavior text may include content that appears in neither the historical behavior text nor the template information.
Based on the above description, after the converged language model is configured into the terminal device, the terminal device's work focuses on the optimization of the template information; the specific optimization process can be understood with the following flowcharts. In practice, since different generative recommendation tasks have different template information while the optimization process for each piece of template information is the same, the optimization of the corresponding template information can be described taking any generative recommendation task as an example. For clarity, this arbitrary generative recommendation task is called the first recommendation task, its corresponding template information the first template information, the model for executing it the first recommendation model, and the historical behavior text used when optimizing the first template information the first historical behavior text.
Fig. 3 is a flowchart of a template information optimization method according to an embodiment of the present invention. The template information optimization method provided by the embodiment of the invention can be executed by the terminal equipment used by the user. In this embodiment, the first template information may be expressed in a text form. As shown in fig. 3, the method may include the steps of:
s101, a first historical behavior text used for describing user behaviors and first template information corresponding to a first recommended task are obtained.
When the APP is installed on the terminal device, the terminal device can collect the historical behaviors the user generated in the APP over a historical time period, thereby obtaining historical behavior texts that describe those behaviors in text form.
For the behaviors described in the first historical behavior text, taking the online shopping APP as an example, they may be the different behaviors the user generated while using, over a period of time, the various services (including recommendation services) provided by the online shopping APP. For example, the first historical behavior text may describe the search, click, browsing, and payment behaviors generated when the user uses the hotel reservation service, transportation and travel service, medical service, loan and financial service, and so on provided by the online shopping APP. Each piece of historical behavior text may consist of a behavior and a behavior result. For example, in the historical behavior text "user A purchases commodity B," "purchases" is the behavior and "commodity B" is the behavior result.
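The "behavior + behavior result" structure of a historical behavior text can be made concrete with a toy parser. The fixed verb set is an assumption introduced only for illustration:

```python
# Assumed closed set of behavior verbs (illustrative, not from the patent).
VERBS = {"purchases", "clicks", "searches", "browses"}

def parse_behavior(text):
    """Split a behavior text like 'user A purchases item B' into its parts."""
    words = text.split()
    for i, w in enumerate(words):
        if w in VERBS:
            return {"subject": " ".join(words[:i]),
                    "behavior": w,
                    "result": " ".join(words[i + 1:])}
    return None  # no known behavior verb found

print(parse_behavior("user A purchases item B"))
# {'subject': 'user A', 'behavior': 'purchases', 'result': 'item B'}
```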
Meanwhile, in response to the installation of the APP, the terminal device can also acquire the first template information corresponding to the first recommendation task; the first template information may likewise be encapsulated in the APP. The semantics of the first template information reflect the task content of the first recommendation task. The first recommendation task may be regarded as a recommendation service provided by the online shopping APP for the user, or a sub-service within a recommendation service, such as the commodity recommendation task, the search-word recommendation task, the explanatory-text generation task, or the dialog-based commodity recommendation task described above.
S102, optimizing first template information according to the first historical behavior text and the converged language model.
Further, the terminal device may input the first historical behavior text and the first template information, both in text form, into the locally configured language model, and obtain a prediction result from the language model. Optionally, the structure of the language model may be as shown in fig. 4. Based on this model structure, the generation of the prediction result can be described as follows: the feature extraction layer in the language model extracts features from the first template information and the first historical behavior text respectively, the extracted feature vectors are fused, and the generation layer in the language model generates the prediction result from the fusion result. The prediction result output by the language model is the predicted behavior text of the user's next behavior. That is, the language model can analyze the user's first historical behavior text to obtain personalized information such as the user's behavior habits and commodity preferences, and then use the first template information as guide information to guide the generation of the prediction result.
Then, the terminal device may optimize the first template information according to the difference between the obtained second historical behavior text and the prediction result. Optionally, the optimization may adjust the textual description of the first template information according to the magnitude of this difference while leaving its semantics unchanged. Because the generation time of the second historical behavior text is later than that of the first historical behavior text, the second historical behavior text is the real result of the user's next behavior, and the difference between it and the prediction result indicates the direction in which to optimize the first template information. Optimization of the first template information stops once this difference meets the requirement. Throughout the optimization of the template information, the parameters of the language model remain unchanged.
It should be noted that, in practice, considering that a text cannot be directly input into a language model, word vector conversion may be performed on first template information in a text form and a first historical behavior text, and then a word vector obtained after the conversion is input into the language model. The "inputting text into the language model" described in the embodiments of the present invention is actually inputting word vectors of the text into the language model.
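As an illustrative sketch only, the word-vector conversion described above might look as follows; the vocabulary-free hashing trick and the 4-dimensional embeddings are assumptions for the example, not part of this disclosure:

```python
# Minimal sketch of the word-vector conversion described above: text is
# tokenized and each token is mapped to a fixed-size dense vector before
# being fed to the language model. A hash-based lookup stands in here for
# a learned embedding table (an illustrative assumption).

import hashlib

EMBED_DIM = 4  # assumed toy embedding size

def embed_token(token: str) -> list:
    """Deterministically map a token to a small dense vector."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:EMBED_DIM]]

def text_to_word_vectors(text: str) -> list:
    """Convert a behavior text into the word-vector sequence that is
    actually input to the language model."""
    return [embed_token(tok) for tok in text.lower().split()]

vectors = text_to_word_vectors("user clicked product A")
print(len(vectors), len(vectors[0]))  # → 4 4 (one vector per token)
```

The conversion is deterministic, so the same behavior text always yields the same model input.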
In this embodiment, compared with a completely trained recommendation model, the optimized template information contains far fewer parameters, so the terminal device only needs to collect a small amount of historical behavior text and perform the optimization of the template information locally; the amount of computation in the optimization process is small, which reduces the consumption of computing resources in the terminal device. In addition, because the input of the universal language model is text, and text is universal, historical behavior text generated under any recommended task can be understood by the language model; this reduces the difficulty of collecting training samples while allowing the optimization of the template information to be completed quickly.
Optionally, the first template information may also be in the form of a feature vector, which may be referred to as a first template vector. The word vector corresponding to the first template information in the text form may be optimized as an initial template vector in the first template vector.
At this time, similar to the process in the embodiment shown in fig. 3, the optimization process of the terminal device on the first template vector may be described as follows: the first template vector and the first historical behavior text are input into the language model configured locally on the terminal device, a feature extraction layer in the language model extracts feature vectors from the two parts of input content and fuses them, and a generation layer in the language model generates a prediction result from the fusion result. Then, the terminal device may calculate a loss value according to the difference between the prediction result and the second historical behavior text, and adjust the elements in the first template vector by back propagation, gradient descent, and the like based on the loss value. The parameters of the language model remain unchanged during this element adjustment. Moreover, because the first template vector is a more abstract feature vector, it is easier for the language model to understand than the first template information in text form, so the first template vector is easier to optimize and the optimization effect is better.
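The optimization described above — keep the language model's parameters frozen and adjust only the elements of the first template vector by gradient descent on the loss between the prediction result and the second historical behavior text — can be sketched with a toy frozen linear map standing in for the language model. The model, learning rate, and squared-error loss are illustrative assumptions, not the patent's actual architecture:

```python
# Toy prompt-tuning loop: only the template vector is updated; the
# "language model" parameters (FROZEN_WEIGHTS) are never touched.

FROZEN_WEIGHTS = [0.5, -0.2, 0.8]   # frozen language-model parameters

def predict(template, history):
    # "Fuse" template and history, then apply the frozen model.
    fused = [t + h for t, h in zip(template, history)]
    return sum(w * f for w, f in zip(FROZEN_WEIGHTS, fused))

def optimize_template(template, history, target, lr=0.1, steps=200):
    for _ in range(steps):
        err = predict(template, history) - target   # loss = err ** 2
        # d(loss)/d(template[i]) = 2 * err * FROZEN_WEIGHTS[i]
        template = [t - lr * 2 * err * w
                    for t, w in zip(template, FROZEN_WEIGHTS)]
    return template

history = [1.0, 0.0, 0.5]   # stands in for the first historical behavior text
target = 2.0                # stands in for the second historical behavior text
tuned = optimize_template([0.0, 0.0, 0.0], history, target)
print(round(predict(tuned, history), 3))  # → 2.0
```

After tuning, the prediction matches the target while the frozen weights are unchanged, mirroring the patent's constraint that the language model itself is not retrained.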
Whether the first template information is in text form or vector form, the feature vectors extracted by the language model and the first template vector each include a plurality of feature vectors. When the first template information is the first template vector, in order to further reduce the amount of computation in the feature fusion process and reduce the consumption of the terminal device's computing resources during optimization, optionally, the language model may fuse the feature vectors in the following manner:
A first feature extraction layer in the language model fuses each feature vector of the first historical behavior text with each feature vector in the first template vector to obtain a first fused feature vector, and fuses the feature vectors in the first template vector with one another to obtain a second fused feature vector. The first and second fused feature vectors each still include a plurality of feature vectors. Then, a second feature extraction layer in the language model fuses the feature vectors in the first fused feature vector with one another, and again fuses the feature vectors in the second fused feature vector with one another. The first feature extraction layer and the second feature extraction layer each comprise at least one layer, and the second feature extraction layer follows the first feature extraction layer.
The above is actually a step-by-step fusion method: the first feature extraction layer fuses the feature vector of each first historical behavior text with the first template vector, and fuses the feature vectors in the first template vector at the same time; and the second feature extraction layer performs mutual fusion on the feature vectors of the different historical behavior texts, and performs mutual fusion on the feature vectors in the first template vector again. The above process can also be understood by means of the model structure shown in fig. 4.
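The step-by-step fusion can be illustrated with a minimal sketch in which element-wise averaging stands in for the (unspecified) fusion operation; the fusion operator and the vector sizes are assumptions for the example only:

```python
# Two-stage fusion sketch: stage one fuses each history feature vector
# with the template vectors and fuses the template vectors among
# themselves; stage two fuses within each of the two resulting groups.

def avg(vectors):
    """Element-wise mean — a toy stand-in for the fusion operation."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def first_layer(history_vecs, template_vecs):
    template_mean = avg(template_vecs)
    # First fused feature vectors: each history vector with the template.
    first_fused = [avg([h, template_mean]) for h in history_vecs]
    # Second fused feature vectors: template vectors fused together.
    second_fused = [avg([t, template_mean]) for t in template_vecs]
    return first_fused, second_fused

def second_layer(first_fused, second_fused):
    # Fuse within each group again (here: collapse each group to its mean).
    return avg(first_fused), avg(second_fused)

history = [[1.0, 2.0], [3.0, 4.0]]    # feature vectors of behavior texts
template = [[0.0, 0.0], [2.0, 2.0]]   # first template vector
f1, f2 = first_layer(history, template)
out_hist, out_tmpl = second_layer(f1, f2)
print(out_hist, out_tmpl)  # → [1.5, 2.0] [1.0, 1.0]
```

Because the history-template cross-fusion happens only in the first stage, the second stage works within each group, which is what reduces the amount of computation relative to fusing everything with everything at every layer.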
Optionally, when the first template information is represented in text form, the language model may also fuse the feature vectors according to the above two-step fusion manner: the first feature extraction layer fuses the respective feature vectors of each first historical behavior text and of the first template information, and fuses the feature vectors of the first template information with one another; the second feature extraction layer fuses the feature vectors of the different historical behavior texts with one another, and again fuses the feature vectors of the first template information with one another, thereby increasing the calculation speed of feature fusion.
In the optimization process provided by the embodiment shown in fig. 3, the difference between the prediction result output by the language model and the second historical behavior text indicates the accuracy of the first template information, that is, whether the template information needs to be optimized further. In order to increase the optimization speed of the first template information, fig. 5 is a flowchart of another template information optimization method provided by an embodiment of the present invention. In this embodiment, the first template information may be a first template vector. As shown in fig. 5, the method may include the following steps:
S201, a first historical behavior text used for describing user behaviors and first template information corresponding to a first recommended task are obtained.
The execution process of step S201 may refer to the related description in the above embodiments, which is not described herein again. In this embodiment, the first template information may specifically represent a feature vector form, that is, a first template vector.
S202, a first template text corresponding to the first recommended task is obtained.
The first template text is also encapsulated in the APP, and when the APP is installed, the terminal device can obtain the first template text. The first template text corresponds to the first recommended task and has the same semantics as the first template vector. For example, the semantics of the first template vector may be "The user may purchase article A because ____", and the first template text may be "The user purchases a product named A, because ____".
S203, the first historical behavior text, the first template text and the first template vector are input into a language model, and the language model fuses the feature vectors of the first historical behavior text and the first template text with the first template vector.
Then, the first historical behavior text, the first template text and the first template vector are input into the language model; more specifically, the word vectors of the first historical behavior text and the first template text, together with the first template vector, are input into the language model. A feature extraction layer in the language model extracts features from the first historical behavior text and the first template text respectively, and then fuses the extracted feature vectors with the first template vector.
The feature vectors of the first historical behavior text and the first template text extracted by the language model shown in fig. 4, as well as the first template vector, are each composed of a plurality of feature vectors. In order to improve the feature fusion speed, a step-by-step fusion mode can also be adopted: a first feature extraction layer of the language model fuses the feature vectors of the first template text and of each first historical behavior text with each feature vector in the first template vector to obtain a third fused feature vector, and fuses the feature vectors in the first template vector with one another to obtain a fourth fused feature vector; a second feature extraction layer of the language model fuses the feature vectors in the third fused feature vector with one another, and again fuses the feature vectors in the fourth fused feature vector with one another.
And S204, acquiring a prediction result generated by the language model according to the fusion result.
S205, adjusting elements in the first template vector according to the difference between the prediction result and the second historical behavior text, wherein the generation time of the second historical behavior text is later than that of the first historical behavior text.
Finally, the language model can generate a prediction result according to the fusion result obtained by the step-by-step fusion. Alternatively, the prediction result may be generated by a generation layer in the language model. As in the embodiment shown in fig. 3, the terminal device may calculate a loss value according to a difference between the prediction result and the second historical behavior text, and adjust the elements in the first template vector by back propagation, gradient descent, and the like based on the loss value. The word vector corresponding to the first template information in the text form may be optimized as an initial template vector in the first template vector. It should be noted that, in the optimization, i.e., the element adjustment process, parameters of the first feature extraction layer, the second feature extraction layer, and the generation layer in the language model are fixed.
In addition, the process of the present embodiment can also be understood in conjunction with the language model shown in fig. 4. Compared with the embodiment shown in fig. 3, the first template text is newly introduced as the input of the language model in this embodiment, and is represented by a dashed box in order to distinguish from the input of the language model in the embodiment shown in fig. 3.
In this embodiment, after the first historical behavior text and the first template vector are obtained, a first template text may also be obtained. The first template vector and the first template text have the same semantic meaning, namely, the task content of the first recommended task can be reflected. Therefore, the first template text plays a guiding role in generating the prediction result for the language model, so that the difference between the generated prediction result and the second historical behavior text is smaller, and the optimization speed of the first template vector is increased.
In addition, when the first template information is embodied in a text form, the first template text can also be introduced, and at this time, the language model can still perform feature vector fusion in the step-by-step fusion manner, and perform optimization of the first template information according to the generated prediction result.
The process of feature fusion may be: and a first feature extraction layer in the language model fuses each feature vector of the first historical behavior text and each feature vector of the first template text with a word vector of first template information in a text form respectively to obtain a third fused feature vector, and simultaneously fuses the word vectors of the first template information in the text form mutually to obtain a fourth fused feature vector. This third fused feature vector and the fourth fused feature vector each still consist of a plurality of feature vectors. And then, a second feature extraction layer in the language model is used for mutually fusing the feature vectors in the third fused feature vector and mutually fusing the feature vectors in the fourth fused feature vector again.
In summary, for a terminal device configured with a language model trained to convergence, multiple recommendation models can be obtained simply by configuring optimized template information corresponding to different generative tasks, which greatly reduces the number of model parameters and thus the consumption of memory resources in the terminal device. For the template information optimization process, in one case, the first historical behavior text and the first template information may be used as input of the language model, and the first template information may be optimized according to the prediction result output by the language model, so as to generate a first recommendation model comprising the language model and the optimized first template information. The first template information is expressed in text form or in feature vector form. Because the template information contains few parameters, the optimization process consumes little of the terminal device's computing resources. In another case, the first historical behavior text, the first template text and the first template information may be used as input of the language model, the first template information is optimized, and the first recommendation model is finally obtained. The first template text guides the generation direction of the prediction result, which both reduces the load on the terminal device and increases the optimization speed of the first template information.
It should be noted that, optionally, for the optimization process of the template information in the foregoing embodiments, the template information may be continuously optimized along with new historical behavior texts continuously generated by the user, that is, real-time update of the recommendation model is realized. Considering that the template optimization needs a certain time, the terminal device may also reduce the optimization frequency of the template, i.e. periodically perform the optimization of the template information.
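The periodic-optimization strategy above can be sketched as follows; the batch threshold and buffering policy are illustrative assumptions, not part of this disclosure:

```python
# Sketch of reduced-frequency template optimization: newly generated
# historical behavior texts are buffered, and the (comparatively
# expensive) template optimization runs only once enough texts have
# accumulated, instead of after every single new behavior.

OPTIMIZE_EVERY = 5   # assumed batch threshold

class TemplateOptimizer:
    def __init__(self):
        self.buffer = []
        self.optimizations_run = 0

    def on_new_behavior_text(self, text):
        self.buffer.append(text)
        if len(self.buffer) >= OPTIMIZE_EVERY:
            self._optimize(self.buffer)
            self.buffer.clear()

    def _optimize(self, texts):
        # Placeholder for the template-optimization step of the
        # embodiments above (fig. 3 / fig. 5).
        self.optimizations_run += 1

opt = TemplateOptimizer()
for i in range(12):
    opt.on_new_behavior_text(f"behavior {i}")
print(opt.optimizations_run)  # → 2 (12 texts → two full batches of 5)
```

A timer-based period would work the same way; only the trigger condition in `on_new_behavior_text` changes.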
The first recommendation task in the above embodiments is a generative recommendation task, that is, the recommendation result is generated by the language model itself and is not selected from the alternative recommendation results. In practice, there may be a discriminant recommendation task, i.e., a second recommendation task. Compared with the generative recommendation task, the discriminant recommendation task can be understood as a classification task, namely the recommendation model is used for classifying the alternative recommendation objects, and the classification result is used for indicating whether the alternative recommendation objects are displayed to the user or not. For the model data processing system with end cloud cooperation as shown in fig. 2, the terminal device and the cloud server in the system also obtain a recommendation model suitable for discriminant tasks through cooperative work.
For such discriminant task, the cooperative working process of the terminal device and the cloud server in the system may be described as follows:
The cloud server first trains a universal language model and configures it on the terminal device. The terminal device then takes the collected historical behavior text of the user, the description text of the alternative recommendation object corresponding to the historical behavior text, and the template information of the different discriminant recommendation tasks as input to the universal language model; a feature extraction layer in the universal language model performs feature extraction and feature fusion on this input, a classification result is generated based on the feature fusion result, and the template information is optimized on the basis of the classification result to obtain optimized template information. Then, the classification layer and the optimized second template information are set to have the same parameters, that is, the classification layer shares its parameters with the optimized second template information, the optimized second template information being in feature vector form. The parameters of the universal language model remain unchanged while the template information is optimized. Finally, the terminal device can combine the optimized template information, the classification layer and the feature extraction layer in the language model into a recommendation model. Optionally, the template information may be in text form or feature vector form. In response to the installation of the APP, the classification layer can be configured on the terminal device together with the language model and the template information.
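The parameter sharing described above can be illustrated with a minimal sketch in which the template vector is used directly as the classification layer's weight vector; the dot-product scoring and sigmoid are assumptions for the example, not the patent's specified classifier:

```python
# Sketch of classification-layer / template-vector parameter sharing:
# there is a single parameter set, so adjusting the template vector
# simultaneously adjusts the classifier.

import math

class SharedTemplateClassifier:
    def __init__(self, template_vector):
        # The template vector IS the classification layer's weight
        # vector — one shared object, not a copy.
        self.weights = template_vector

    def classify(self, fused_features):
        """Return the probability that the user clicks the candidate
        recommendation object (> 0.5 means 'show it')."""
        score = sum(w * f for w, f in zip(self.weights, fused_features))
        return 1.0 / (1.0 + math.exp(-score))

template = [0.5, -1.0, 2.0]          # optimized second template vector
clf = SharedTemplateClassifier(template)
features = [1.0, 0.2, 0.4]           # fused features of a candidate object
print(clf.classify(features) > 0.5)  # → True

# Because the parameters are shared rather than duplicated, mutating the
# template changes the classifier's decision as well.
template[2] = -2.0
print(clf.classify(features) > 0.5)  # → False
```

The same mechanism means template optimization and classifier updates are a single operation on one parameter set, which is what keeps the per-task footprint small.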
Taking online shopping APP as an example, the alternative recommendation object can be selected according to the historical behavior text and the sales condition of each commodity in the online shopping APP. The description text of the alternative recommended object is used for describing the name, brand, specification and other detailed information of the selected alternative recommended object. And the classification result output by the classification layer indicates whether the user clicks the alternative recommendation object. The recommendation task can be regarded as a recommendation service provided by the online shopping APP for the user or a sub-service in a certain recommendation service. For example, when the recommendation task is a goods recommendation task for the buyer, the template information may be "the goods that the buyer is likely to click on is ____".
In this embodiment, through the cooperative processing of the terminal device and the cloud server, the terminal device is only required to be configured with a universal language model and a plurality of template information corresponding to different discriminant recommendation tasks, so that the consumption of memory in the terminal device can be reduced. And the parameters contained in the optimized template information are less, so that the consumption of computing resources in the terminal equipment is reduced in the optimization process. In addition, because the input of the universal language model is a text, and the text has universality, the historical behavior text in the text form generated under any recommended task can be understood by the language model, the template information can be optimized by using the historical behavior text, and the difficulty in collecting training samples is also reduced.
In practice, the recommendation model configured in the terminal device for executing the discriminant recommendation task can output a classification result for the candidate recommendation object corresponding to the user's next behavior, where the classification result indicates whether the user will click on or purchase the candidate recommendation object.
Based on the above description, after the converged language model has been configured on the terminal device, the terminal device's work focuses on the optimization of the template information; the specific optimization process can be understood in combination with the following flowcharts. In practice, since different discriminant recommendation tasks also have different template information and the optimization process is the same for each, the optimization of the corresponding template information can be described by taking any discriminant recommendation task as an example. For clarity, this discriminant recommendation task is referred to as the second recommendation task, its template information as the second template information, and the model for executing it as the second recommendation model; the historical behavior text used when optimizing the second template information is still referred to as the first historical behavior text.
Fig. 6 is a flowchart of another template information optimization method according to an embodiment of the present invention. The method may also be performed by a terminal device. In this embodiment, the second template information may be expressed in a text form. As shown in fig. 6, the method may include the steps of:
S301, a first historical behavior text for describing user behaviors, a description text of an alternative recommendation object corresponding to the first historical behavior text, and second template information corresponding to a second recommendation task are obtained.
For specific meanings of the first historical behavior text and the description text, reference may be made to the description in the above related embodiments, and details are not repeated here. Meanwhile, second template information corresponding to the second recommended task can be obtained, and the template information and the language model are configured in the terminal equipment. And the semantics of the second template information is the specific task content of the second recommended task. Optionally, after the APP runs in the terminal device, the second template information encapsulated in the APP may be configured in the terminal device.
S302, optimizing second template information according to the first historical behavior text, the description text and the feature extraction layer in the language model.
Further, the terminal device may input the first historical behavior text in text form, the description text, and the second template information into the locally configured language model, after which the classification layer outputs a classification result. This classification result can then be used to optimize the second template information.
For the generation of the classification result, the language model may perform feature extraction and feature fusion on the first historical behavior text, the description text, and the second template information respectively, and output the feature fusion result, from which the classification layer outputs a classification result. That is, through feature extraction, the feature extraction layer in the language model can analyze the user's first historical behavior text and the description text to obtain personalized data such as the user's behavior habits and commodity preferences, and the classification layer outputs a classification result according to the second template information in the feature fusion result.
For the optimization of the second template information: because the classification result output by the language model predicts the user's next behavior, the terminal device can also obtain a second historical behavior text generated later than the first historical behavior text; the behavior and behavior result described by the second historical behavior text are the real result of the user's next behavior, so the terminal device can optimize the second template information according to the difference between the classification result and the second historical behavior text. Optionally, the optimization may adjust the description of the second template information according to the magnitude of this difference while keeping its semantics the same.
In this embodiment, compared with a completely trained recommendation model, the optimized template information contains far fewer parameters, so the terminal device only needs to collect a small amount of historical behavior text and perform the optimization of the template information locally; the amount of computation in the optimization process is small, which reduces the consumption of computing resources in the terminal device. In addition, because the input of the universal language model is text, and text is universal, historical behavior text generated under any recommended task can be understood by the language model; this reduces the difficulty of collecting training samples while allowing the optimization of the template information to be completed quickly.
Optionally, the second template information may also be in the form of a feature vector, referred to as a second template vector. At this time, similar to the process in the embodiment shown in fig. 6, the terminal device optimizes the second template vector as follows: the description text, the second template vector, and the first historical behavior text are input into the language model, a feature extraction layer in the language model extracts the feature vectors and fuses them, and the classification layer classifies the alternative recommendation object corresponding to the first historical behavior text according to the fusion result. Because the classification result output by the classification layer predicts the user's next behavior, the terminal device can also obtain a second historical behavior text whose generation time is later than that of the first historical behavior text; the behavior and behavior result described by the second historical behavior text are the real result of the user's next behavior. Therefore, the terminal device can calculate a loss value according to the difference between the classification result and the second historical behavior text, and adjust the elements in the second template vector by gradient descent and back propagation using this loss value. The word vector corresponding to the second template information in text form may serve as the initial second template vector, with element adjustment performed on that basis. During element adjustment, the parameters of the first feature extraction layer and the second feature extraction layer in the language model remain fixed.
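The element adjustment described above for the discriminant task can be sketched as follows, with a fixed feature vector standing in for the frozen feature extraction layers and a sigmoid classification layer; all numeric details (features, learning rate, step count) are illustrative assumptions:

```python
# Toy element adjustment for the second template vector: the frozen
# extraction layers are represented by a fixed fused-feature vector, the
# classification layer is a sigmoid over a dot product, and gradient
# descent on the cross-entropy loss updates only the template elements.

import math

FROZEN_FEATURES = [0.8, -0.3, 0.5]   # output of the frozen extraction layers

def click_probability(template):
    score = sum(t * f for t, f in zip(template, FROZEN_FEATURES))
    return 1.0 / (1.0 + math.exp(-score))

def adjust_template(template, label, lr=0.5, steps=300):
    for _ in range(steps):
        p = click_probability(template)
        # Gradient of the cross-entropy loss w.r.t. the template elements.
        grad = [(p - label) * f for f in FROZEN_FEATURES]
        template = [t - lr * g for t, g in zip(template, grad)]
    return template

# The observed second historical behavior text says the user did click
# the candidate object, so the label is 1.0.
tuned = adjust_template([0.0, 0.0, 0.0], label=1.0)
print(click_probability(tuned) > 0.9)  # moves toward the observed click
```

As in the generative case, only the template elements change; the stand-in for the frozen layers is never updated.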
Compared with the word vector corresponding to the second template information in the text form, the second template vector is a more abstract feature vector, and the second template vector is easier to understand by a language model, so that the second template vector is easier to optimize, and the optimization effect is better.
Whether the second template information is in text form or vector form, the feature vectors extracted by the language model and the second template vector each include a plurality of feature vectors. In order to further reduce the amount of computation in the feature fusion process and reduce the consumption of the terminal device's computing resources during optimization, optionally, when the second template information is the second template vector, the language model may fuse the feature vectors in the following manner:
The first feature extraction layer in the language model fuses each feature vector included in the first historical behavior text and the description text with each feature vector in the second template vector to obtain a fifth fused feature vector, and fuses the feature vectors in the second template vector with one another to obtain a sixth fused feature vector. The fifth and sixth fused feature vectors each still comprise a plurality of feature vectors. Then, the second feature extraction layer in the language model fuses the feature vectors in the fifth fused feature vector with one another, and again fuses the feature vectors in the sixth fused feature vector with one another.
It should be noted that, when the second template information is represented in text form, the language model may also adopt the step-by-step fusion manner: the first feature extraction layer of the language model fuses the feature vectors of each of the first historical behavior text and the description text with the feature vectors of the second template information, and fuses the feature vectors of the second template information with one another; the second feature extraction layer of the language model fuses the feature vectors of the historical behavior text and the description text with one another, and again fuses the feature vectors of the second template information with one another, which can improve the calculation speed of feature fusion.
In the optimization process provided in the embodiment shown in fig. 6, the difference between the classification result output by the language model and the second historical behavior text indicates the accuracy of the second template information, that is, whether the template information needs to be optimized further. In order to improve the optimization speed of the second template information, fig. 7 is a flowchart of another template information optimization method provided by an embodiment of the present invention. In this embodiment, the second template information may be a second template vector. As shown in fig. 7, the method may include the following steps:
S401, a first historical behavior text for describing the behavior of the user, a description text of an alternative recommendation object corresponding to the first historical behavior text, and a second template vector corresponding to a second recommendation task are obtained.
The execution process of step S401 may refer to the related description in the foregoing embodiments, and is not described herein again.
S402, acquiring a second template text corresponding to the second recommended task.
Similar to the second template vector, the second template text is also encapsulated in the APP, and after the APP is installed on the terminal device, the terminal device can obtain the second template text. The second template text corresponds to the second recommended task and has the same semantics as the second template vector. For example, the semantics of the second template vector may be "____ is a good that the buyer may click on", and the second template text may be "The user may click a product named ____".
S403, the first historical behavior text, the description text, the second template vector and the second template text are input into a language model, and the language model fuses the feature vectors of the first historical behavior text, the description text and the second template text with the second template vector.
Further, the first historical behavior text, the second template text and the description text can be input into the language model; more specifically, their word vectors are input into the language model. The language model extracts the features of the first historical behavior text, the second template text and the description text respectively, and then fuses the extracted feature vectors with the second template vector that was also input into the language model.
The feature vectors of the first historical behavior text, the second template text and the description text, as well as the second template vector, are each composed of a plurality of feature vectors. In order to increase the computation speed of feature fusion, a step-by-step fusion mode may be adopted: a first feature extraction layer in the language model fuses the respective feature vectors of the first historical behavior text, the second template text and the description text with the second template vector to obtain a seventh fused feature vector, and fuses the feature vectors within the second template vector with each other to obtain an eighth fused feature vector; then a second feature extraction layer in the language model fuses the feature vectors within the seventh fused feature vector with each other, and again fuses the feature vectors within the eighth fused feature vector with each other.
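The step-by-step fusion described above can be sketched as follows. This is a minimal illustration only: the patent does not specify the fusion operator, the dimensions, or the internals of the feature extraction layers, so simple mean-based mixing with assumed shapes stands in for the two layers.

```python
import numpy as np

# Assumed shapes; the patent does not specify dimensions or the fusion operator.
rng = np.random.default_rng(0)
d = 8                                   # feature dimension (assumption)
text_feats = rng.normal(size=(5, d))    # feature vectors extracted from the first
                                        # historical behavior text, the second
                                        # template text and the description text
template_vec = rng.normal(size=(3, d))  # second template vector (3 sub-vectors)

def first_layer(text_feats, template_vec):
    # Stage 1: fuse each text feature vector with the second template vector
    # ("seventh fused feature vector"), and fuse the template sub-vectors with
    # each other ("eighth fused feature vector"). Mean-mixing is a stand-in.
    seventh = text_feats + template_vec.mean(axis=0)
    eighth = template_vec + template_vec.mean(axis=0)
    return seventh, eighth

def second_layer(seventh, eighth):
    # Stage 2: fuse the feature vectors inside each fused result with each other.
    return seventh + seventh.mean(axis=0), eighth + eighth.mean(axis=0)

seventh, eighth = second_layer(*first_layer(text_feats, template_vec))
print(seventh.shape, eighth.shape)  # shapes are preserved: (5, 8) (3, 8)
```

Splitting the fusion into two stages in this way keeps each stage's work small, which is the computation-speed argument made above.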
S404, obtaining a classification result output by the classification layer according to the fusion result.
S405, adjusting elements of the second template vector according to the difference between the classification result and a second historical behavior text, wherein the generation time of the second historical behavior text is later than that of the first historical behavior text.
The execution process of steps S404 to S405 may refer to the relevant description in the foregoing embodiments, and is not described herein again.
In the element adjusting process, the parameters of the first feature extraction layer and the second feature extraction layer in the language model remain fixed. The process of this embodiment can also be understood in conjunction with the second recommendation model shown in fig. 8. Compared with the embodiment shown in fig. 6, this embodiment newly introduces a second template text as an input of the model; the second template text is represented by a dashed box to distinguish it from the inputs of the model in the embodiment shown in fig. 6.
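The element adjustment with frozen model parameters can be illustrated by a toy gradient-descent loop. Everything below — the linear "model", the binary cross-entropy loss, the learning rate and the shapes — is an assumption for illustration; the only aspect mirrored from the description above is that the model weights `W` stay fixed while only the template vector is updated.

```python
import numpy as np

rng = np.random.default_rng(1)
d = 6
W = rng.normal(size=d)          # frozen feature-extraction/classification parameters
text_feat = rng.normal(size=d)  # fused feature of the first historical behavior text
template = np.zeros(d)          # second template vector: the only trainable part
label = 1.0                     # target derived from the second historical behavior text

def predict(template):
    z = W @ (text_feat + template)    # fusion followed by the frozen classifier
    return 1.0 / (1.0 + np.exp(-z))   # probability that the object is clicked

lr = 0.5
for _ in range(200):
    p = predict(template)
    grad = (p - label) * W    # gradient of binary cross-entropy w.r.t. template
    template -= lr * grad     # W itself is never updated

print(predict(template))      # moves toward the label as the template is tuned
```

Because only the small template vector receives gradient updates, the per-step cost and memory footprint are far below full model fine-tuning, which is the efficiency point made throughout this section.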
In this embodiment, after the first historical behavior text, the description text and the second template vector are obtained, a second template text may also be obtained. The second template vector and the second template text have the same semantics, that is, both reflect the task content of the second recommendation task. The second template text therefore guides the language model in generating the prediction result, so that the difference between the generated prediction result and the second historical behavior text is smaller, which increases the optimization speed of the second template vector.
On the basis of the embodiment shown in fig. 7, which introduces the second template text, the second template information may optionally be embodied in text form. In this case, the language model may still perform feature vector fusion in the above step-by-step fusion manner, and optimize the second template information according to the generated prediction result.
The process of feature fusion may be as follows: the first feature extraction layer in the language model fuses each feature vector of the first historical behavior text, the description text and the second template text with the word vectors of the second template information to obtain a seventh fused feature vector, and fuses the word vectors of the second template information with each other to obtain an eighth fused feature vector. The seventh fused feature vector and the eighth fused feature vector each still consist of a plurality of feature vectors. Then, the second feature extraction layer in the language model fuses the feature vectors within the seventh fused feature vector with each other, and again fuses the feature vectors within the eighth fused feature vector with each other.
In summary, for a terminal device configured with a language model trained to convergence, a plurality of recommendation models can be obtained by configuring optimized template information corresponding to different discriminant tasks, and the total number of model parameters is greatly reduced, so that the consumption of memory resources in the terminal device is reduced. In the template information optimization process, in one case, the first historical behavior text, the description text and the second template information may be used as the input of the language model, and the second template information is optimized according to the classification result output by the language model, so as to generate a second recommendation model comprising the feature extraction layer in the language model, the optimized second template information and the classification layer, where the classification layer and the optimized second template information have the same parameters. The second template information is expressed in a text form or a feature vector form. Because the template information contains few parameters, the consumption of computing resources on the terminal device is reduced during optimization. In another case, the first historical behavior text, the description text, the second template text and the second template information may be used as the input of the language model to optimize the second template information and finally obtain the second recommendation model. The second template text guides the generation direction of the prediction result, which both reduces resource consumption on the terminal device and improves the optimization speed of the second template information.
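A back-of-envelope calculation illustrates the memory argument. The figures below (language-model size, template length, hidden width, task count) are assumptions, not values from the patent; the point is only that per-task template parameters are orders of magnitude fewer than a full per-task model.

```python
# All numbers are illustrative assumptions, not taken from the patent.
shared_lm_params = 110_000_000      # one converged language model, BERT-base scale
template_len, hidden = 20, 768      # assumed template length and hidden width
per_task_template = template_len * hidden

tasks = 5
shared = shared_lm_params + tasks * per_task_template  # one LM + 5 templates
separate = tasks * shared_lm_params                    # 5 full models instead

print(per_task_template)            # only 15360 template parameters per task
print(round(separate / shared, 2))  # memory ratio, close to the task count
```

Under these assumed numbers, sharing one converged language model across tasks and storing only a small optimized template per task cuts memory by roughly a factor of the task count.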
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.
Claims (10)
1. A model data processing system, comprising: a terminal device and a cloud server;
the cloud server is used for training a language model according to the training text; configuring the converged language model and first template information corresponding to a first recommended task into the terminal equipment;
the terminal device is used for acquiring a first historical behavior text for describing user behaviors and the first template information; and optimizing the first template information according to the first historical behavior text and the language model to obtain a first recommendation model comprising the optimized first template information and the language model, wherein the first recommendation model corresponds to the first recommendation task.
2. The system of claim 1, wherein the first template information comprises a first template vector;
the terminal device is used for inputting the first historical behavior text into the language model so as to fuse the feature vector of the first historical behavior text and the first template vector by the language model;
obtaining a prediction result generated by the language model according to a fusion result;
and optimizing elements in the first template vector according to the difference between the prediction result and a second historical behavior text, wherein the generation time of the second historical behavior text is later than that of the first historical behavior text.
3. The system of claim 2, wherein the terminal device is further configured to: acquiring a first template text corresponding to the first recommended task, wherein the semantics of the first template text and the first template vector are the same;
inputting the first historical behavior text, the first template text and the first template vector into the language model, and fusing the feature vectors of the first historical behavior text and the first template vector by the language model.
4. The system according to claim 3, wherein a first feature extraction layer in the language model is configured to fuse respective feature vectors of the first historical behavior text and the first template text with the first template vector to obtain a fused feature vector;
and the second feature extraction layer in the language model is used for fusing all the feature vectors in the fused feature vectors, and the second feature extraction layer is arranged behind the first feature extraction layer.
5. The system according to any one of claims 1 to 4, wherein the first recommendation model generates a behavior text of the next action of the user, the behavior text of the next action being generated later than the first historical behavior text.
6. The system of claim 1, wherein the terminal device is further configured to: obtaining a description text of an alternative recommended object corresponding to the first historical behavior text and second template information corresponding to a second recommended task;
optimizing the second template information according to the first historical behavior text, the description text, a classification layer and a feature extraction layer in the language model,
and setting the classification layer and the optimized second template information to have the same parameters so as to obtain a second recommendation model comprising the classification layer, the optimized second template information and a feature extraction layer in the language model, wherein the second recommendation model corresponds to the second recommendation task.
7. The system of claim 6, wherein the second template information comprises a second template vector;
the terminal device is configured to input the first historical behavior text, the description text, and the second template vector into a feature extraction layer in the language model, so that the feature extraction layer fuses the feature vectors of the first historical behavior text and the description text, and the second template vector;
obtaining a classification result output by the classification layer according to a fusion result;
and adjusting elements in the second template vector according to the difference between the classification result and the second historical behavior text, wherein the generation time of the second historical behavior text is later than that of the first historical behavior text.
8. The system of claim 7, wherein the terminal device is further configured to: acquiring a second template text corresponding to the second recommended task, wherein the second template text and the second template vector have the same semantic meaning;
inputting the first historical behavior text, the description text, the second template vector and the second template text into a feature extraction layer of the language model, so that the feature extraction layer fuses the feature vectors of the first historical behavior text, the description text and the second template vector.
9. The system according to any one of claims 6 to 8, wherein the second recommendation model outputs the classification result of the candidate recommended object corresponding to the behavior text of the next action of the user, and the generation time of the behavior text of the next action is later than that of the first historical behavior text.
10. A model data processing system, comprising: a terminal device and a cloud server;
the cloud server is used for training a language model according to the training text; configuring the converged language model and template information corresponding to the recommended task into the terminal equipment;
the terminal device is used for acquiring a first historical behavior text for describing user behaviors, a description text of an alternative recommended object corresponding to the first historical behavior text and template information corresponding to the recommended task;
optimizing the template information according to the first historical behavior text, the description text, a classification layer and a feature extraction layer in the language model;
and setting the classification layer and the optimized template information to have the same parameters so as to obtain a recommendation model comprising the optimized template information, the classification layer and the feature extraction layer, wherein the recommendation model corresponds to the recommendation task.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210470843.2A CN114942843A (en) | 2022-04-28 | 2022-04-28 | Model data processing system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114942843A true CN114942843A (en) | 2022-08-26 |
Family
ID=82907367
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210470843.2A Pending CN114942843A (en) | 2022-04-28 | 2022-04-28 | Model data processing system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114942843A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116675048A (en) * | 2023-06-05 | 2023-09-01 | 安吉宏德医疗用品有限公司 | Full-automatic gypsum cloth winding equipment and control system thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||