CN110910209A

CN110910209A - Data processing method and device and computer readable storage medium

Info

Publication number: CN110910209A
Application number: CN201911101655.7A
Authority: CN
Inventors: 陈亮
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2019-11-12
Filing date: 2019-11-12
Publication date: 2020-03-24
Anticipated expiration: 2039-11-12
Also published as: CN110910209B

Abstract

The application discloses a data processing method, a device and a computer readable storage medium, wherein the method comprises the following steps: acquiring a service object set comprising a plurality of service objects; acquiring browsing states of a target user for the plurality of service objects, and determining a target positive sample set and a target negative sample set of the target user according to the browsing states and the service object set; acquiring a user behavior set of the target user, wherein the user behavior set comprises evaluation operation behaviors of the target user for the plurality of service objects; acquiring an auxiliary positive sample set and an auxiliary negative sample set of the target user in the user behavior set according to the evaluation type of the evaluation operation behavior; and generating an object attribute vector corresponding to each business object in the business object set based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set and a word vector model. By the method and the device, the accuracy of the generated object attribute vector of the business object is improved.

Description

Data processing method and device and computer readable storage medium

Technical Field

The present application relates to the field of data processing technologies, and in particular, to a data processing method and apparatus, and a computer-readable storage medium.

Background

With the continuous development of computer networks, online shopping has rapidly become widespread. The types of shopping objects are also varied, such as clothing, food, and virtual goods.

In online shopping, one way of recommending relevant shopping objects (i.e. commodities, such as the above-mentioned clothing, food and virtual goods) to a user is to vectorize the shopping objects. Vectorization yields a vectorized representation of each shopping object that characterizes its object features, so that suitable shopping objects can be recommended to the user on the basis of these representations.

Conventionally, a shopping object is vectorized by training a model on attribute features of the shopping object itself (for example, its article type feature). When shopping objects are recommended to a user on the basis of vectorized representations obtained in this way, the user's degree of interest in the recommended shopping objects cannot be estimated, so the recommendations are inaccurate. It can be seen that the vectorized representations of shopping objects obtained by this method are not accurate.

Summary of the Application

The application provides a data processing method, a data processing device and a computer readable storage medium, which enrich the acquisition mode of the object attribute vector of a business object and improve the accuracy of the acquired object attribute vector of the business object.

One aspect of the present application provides a data processing method, including:

acquiring a service object set, wherein the service object set comprises a plurality of service objects;

acquiring browsing states of a target user for the plurality of service objects, and determining a target positive sample set and a target negative sample set corresponding to the target user according to the browsing states and the service object sets;

acquiring a user behavior set corresponding to the target user, wherein the user behavior set comprises evaluation operation behaviors of the target user for the plurality of service objects;

acquiring an auxiliary positive sample set and an auxiliary negative sample set corresponding to the target user in the user behavior set according to the evaluation type of the evaluation operation behavior;

and generating an object attribute vector corresponding to each business object in the business object set respectively based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set and a word vector model.

Wherein the browsing state includes a browsed state and a non-browsing state; the determining of a target positive sample set and a target negative sample set corresponding to the target user according to the browsing state and the service object set includes:

generating the target positive sample set according to the object identification corresponding to the business object with the browsing state being the browsed state;

and generating the target negative sample set according to the object identification corresponding to the business object with the browsing state being the non-browsing state.

Wherein the generating of the target positive sample set according to the object identifier corresponding to the business object whose browsing state is the browsed state includes:

acquiring the browsing timestamps corresponding to each service object whose browsing state is the browsed state, and determining each service object having a browsing timestamp within a target time period as a positive sample service object, wherein one positive sample service object corresponds to at least one browsing timestamp;

generating a positive sample sequence according to at least one browsing timestamp and an object identifier corresponding to each positive sample business object, and adding the positive sample sequence to the target positive sample set, wherein the positive sample sequence comprises the object identifier corresponding to each positive sample business object.
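By way of a non-limiting illustration, the construction of a positive sample sequence described above might be sketched as follows. The `BrowseEvent` record, the function name `build_positive_sequence`, the integer timestamps, and the choice to order the sequence by browsing timestamp are all assumptions made for this example and are not taken from the application.

```python
from collections import namedtuple

# Hypothetical record type: one browse event of the target user.
BrowseEvent = namedtuple("BrowseEvent", ["object_id", "timestamp"])


def build_positive_sequence(events, period_start, period_end):
    """Return the identifiers of objects browsed within the target time period.

    Assumption: the sequence is ordered by browsing timestamp, so an object
    browsed several times appears once per browsing timestamp.
    """
    in_period = [e for e in events if period_start <= e.timestamp <= period_end]
    in_period.sort(key=lambda e: e.timestamp)
    return [e.object_id for e in in_period]


events = [
    BrowseEvent("item_c", 30),
    BrowseEvent("item_a", 10),
    BrowseEvent("item_b", 20),
    BrowseEvent("item_a", 40),
    BrowseEvent("item_d", 99),  # outside the target time period
]
positive_sequence = build_positive_sequence(events, period_start=0, period_end=50)

# The target positive sample set holds one or more such sequences.
target_positive_sample_set = [positive_sequence]
print(positive_sequence)  # ['item_a', 'item_b', 'item_c', 'item_a']
```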

Wherein, the generating the target negative sample set according to the object identifier corresponding to the service object whose browsing status is the non-browsing status comprises:

determining the number of business objects in the positive sample sequence as a target number, and acquiring a negative sample extraction multiple for the target number;

extracting, according to the target number and the negative sample extraction multiple, business objects as negative sample business objects from the business objects whose browsing state is the non-browsing state, wherein the number of negative sample business objects is equal to the product of the target number and the negative sample extraction multiple;

and adding the object identifier corresponding to the negative sample business object to the target negative sample set.
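As a minimal sketch of the negative sample extraction described above, the following example draws a number of unbrowsed objects equal to the target number multiplied by the negative sample extraction multiple. Uniform sampling without replacement, the fixed seed, and the function name `sample_negatives` are assumptions of this example; the application does not state how the objects are drawn.

```python
import random


def sample_negatives(all_object_ids, browsed_ids, target_number, multiple, seed=0):
    """Draw target_number * multiple identifiers from the unbrowsed objects.

    Assumption: objects are drawn uniformly and without replacement.
    """
    unbrowsed = sorted(set(all_object_ids) - set(browsed_ids))
    wanted = target_number * multiple
    if wanted > len(unbrowsed):
        raise ValueError("not enough unbrowsed objects to sample from")
    rng = random.Random(seed)
    return rng.sample(unbrowsed, wanted)


all_object_ids = [f"item_{i}" for i in range(10)]
positive_sequence = ["item_0", "item_1"]

target_negative_sample_set = sample_negatives(
    all_object_ids,
    browsed_ids=positive_sequence,
    target_number=len(positive_sequence),
    multiple=3,
)
print(len(target_negative_sample_set))  # 6
```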

Wherein the evaluation types comprise a positive evaluation type and a negative evaluation type; the user behavior set comprises a plurality of object operation samples, wherein one object operation sample comprises an object identifier of a business object and a behavior identifier of an evaluation operation behavior of the target user aiming at the business object;

the obtaining, according to the evaluation type of the evaluation operation behavior, an auxiliary positive sample set and an auxiliary negative sample set corresponding to the target user in the user behavior set includes:

determining an object operation sample containing the evaluation operation behavior with the positive evaluation type in the user behavior set as a first object operation sample, and adding the first object operation sample to the auxiliary positive sample set;

and determining an object operation sample containing the evaluation operation behavior with the negative evaluation type in the user behavior set as a second object operation sample, and adding the second object operation sample to the auxiliary negative sample set.
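The division of the user behavior set by evaluation type might, for illustration only, look like the sketch below. The `OperationSample` record, the behavior identifiers ("like", "favorite", "dislike", "report") and their mapping to evaluation types are hypothetical and are chosen solely to make the example runnable.

```python
from collections import namedtuple

# Hypothetical record: one object operation sample
# (object identifier plus behavior identifier).
OperationSample = namedtuple("OperationSample", ["object_id", "behavior_id"])

# Hypothetical mapping from behavior identifier to evaluation type.
EVALUATION_TYPE = {
    "like": "positive",
    "favorite": "positive",
    "dislike": "negative",
    "report": "negative",
}


def split_by_evaluation_type(user_behavior_set):
    """Return (auxiliary positive sample set, auxiliary negative sample set)."""
    auxiliary_positive, auxiliary_negative = [], []
    for sample in user_behavior_set:
        evaluation_type = EVALUATION_TYPE.get(sample.behavior_id)
        if evaluation_type == "positive":
            auxiliary_positive.append(sample)
        elif evaluation_type == "negative":
            auxiliary_negative.append(sample)
        # Samples with an unknown behavior identifier are ignored in this sketch.
    return auxiliary_positive, auxiliary_negative


user_behavior_set = [
    OperationSample("item_a", "like"),
    OperationSample("item_b", "dislike"),
    OperationSample("item_c", "favorite"),
]
aux_pos, aux_neg = split_by_evaluation_type(user_behavior_set)
print([s.object_id for s in aux_pos])  # ['item_a', 'item_c']
print([s.object_id for s in aux_neg])  # ['item_b']
```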

Wherein each business object in the target positive sample set has an object identification;

generating an object attribute vector corresponding to each business object in the business object set respectively based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set and a word vector model, including:

obtaining an object identifier s_j in the target positive sample set, where j is a positive integer less than or equal to N, and N is the number of object identifiers in the target positive sample set;

obtaining, in the target positive sample set and based on a traversal window with a target step size, the neighbor object identifier corresponding to the object identifier s_j;

updating, in the word vector model, the initial vector corresponding to each business object in the business object set based on the object identifier s_j, the neighbor object identifier, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set;

and respectively determining the initial vector updated in the word vector model as an object attribute vector corresponding to each service object in the service object set.
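The selection of neighbor object identifiers with a traversal window can be illustrated as follows. This sketch assumes that the target step size is the number of positions taken on each side of s_j, and it uses a 0-based index where the text uses a 1-based j; both points are interpretations made for the example.

```python
def neighbors_in_window(sequence, j, step):
    """Return the identifiers within `step` positions of sequence[j].

    `j` is a 0-based index here; the identifier s_j itself is excluded.
    """
    lo = max(0, j - step)
    hi = min(len(sequence), j + step + 1)
    return sequence[lo:j] + sequence[j + 1:hi]


positive_sequence = ["s1", "s2", "s3", "s4", "s5"]

print(neighbors_in_window(positive_sequence, j=2, step=1))  # ['s2', 's4']
print(neighbors_in_window(positive_sequence, j=0, step=2))  # ['s2', 's3']
```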

Each business object in the target negative sample set has an object identifier;

the updating, in the word vector model, of the initial vector corresponding to each business object in the business object set based on the object identifier s_j, the neighbor object identifier, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set includes:

generating initial vectors corresponding to each business object in the business object set respectively based on Gaussian distribution, and associating each initial vector with the object identifier of the corresponding business object respectively;

acquiring a first to-be-trained object identifier in the target negative sample set, acquiring a first to-be-trained object operation sample in the auxiliary positive sample set, and acquiring a second to-be-trained object operation sample in the auxiliary negative sample set;

acquiring a first behavior weight value corresponding to the behavior identifier in the first to-be-trained object operation sample, and acquiring a second behavior weight value corresponding to the behavior identifier in the second to-be-trained object operation sample;

determining the initial vectors respectively associated with the object identifier s_j, the neighbor object identifier, the first to-be-trained object identifier, the object identifier in the first to-be-trained object operation sample and the object identifier in the second to-be-trained object operation sample as initial vectors to be trained;

updating the initial vector corresponding to each business object in the business object set in the word vector model based on the initial vector to be trained, the first behavior weight value and the second behavior weight value.
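The passage above does not state the loss function or the update rule. Purely as an assumption, the sketch below uses a word2vec-style pairwise logistic update: the vector of s_j is pulled toward its neighbor and toward positively evaluated objects, and pushed away from negative samples and negatively evaluated objects, with the behavior weight value scaling the gradient step. The function names, the learning rate, the vector dimension and the weight of 2.0 are all hypothetical.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_vectors(object_ids, dim, seed=0, scale=0.1):
    """Create one Gaussian-initialised vector per object identifier."""
    rng = random.Random(seed)
    return {oid: [rng.gauss(0.0, scale) for _ in range(dim)] for oid in object_ids}


def sgd_pair(vectors, center, other, label, weight=1.0, lr=0.05):
    """Apply one logistic-loss gradient step to the pair (center, other).

    label=1 pulls the two vectors together, label=0 pushes them apart.
    Assumption: the behavior weight value simply scales the step size.
    """
    u, v = vectors[center], vectors[other]
    g = lr * weight * (label - sigmoid(dot(u, v)))
    # Both updates are computed from the vectors as they were before the step.
    vectors[center] = [a + g * b for a, b in zip(u, v)]
    vectors[other] = [b + g * a for a, b in zip(u, v)]


vectors = init_vectors(["s_j", "neighbor", "negative", "liked", "disliked"], dim=8)

for _ in range(100):
    sgd_pair(vectors, "s_j", "neighbor", label=1)              # neighbor object identifier
    sgd_pair(vectors, "s_j", "negative", label=0)              # target negative sample set
    sgd_pair(vectors, "s_j", "liked", label=1, weight=2.0)     # auxiliary positive sample
    sgd_pair(vectors, "s_j", "disliked", label=0, weight=2.0)  # auxiliary negative sample

# After training, each updated vector serves as the object attribute vector.
object_attribute_vectors = vectors
```

Scaling the step by the behavior weight value is only one way of letting a stronger evaluation behavior influence the vectors more; other weighting schemes are equally compatible with the text.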

Wherein, the method further includes:

in the business object set, acquiring a browsed business object and an evaluation business object corresponding to the target user, wherein the browsed user associated with the browsed business object comprises the target user, the evaluation user associated with the evaluation business object comprises the target user, and the evaluation user refers to a user who performs evaluation operation behavior on the business object;

determining a behavior vector mean value corresponding to the target user according to the browsed business object and the evaluation business object;

and determining a target business object aiming at the target user according to the behavior vector mean value corresponding to the target user and the object attribute vector corresponding to each business object in the business object set, and recommending the target business object to the target user.

Wherein the determining of a behavior vector mean value corresponding to the target user according to the browsed business object and the evaluation business object includes:

acquiring an object attribute vector and a browsed weight value corresponding to the browsed business object, and acquiring an object attribute vector and an evaluation operation weight array corresponding to the evaluation business object;

determining an object attribute vector corresponding to the browsed business object as a first object attribute vector, and determining an object attribute vector corresponding to the evaluation business object as a second object attribute vector;

multiplying each first object attribute vector by the browsed weight value to obtain a first vector corresponding to each first object attribute vector;

multiplying each second object attribute vector by the corresponding weight value in the evaluation operation weight array to obtain a second vector corresponding to each second object attribute vector;

summing the first vector and the second vector to obtain a target vector, and summing the vector quantity of the first vector and the vector quantity of the second vector to obtain a target vector quantity;

and determining the ratio of the target vector to the number of the target vectors as the behavior vector mean value corresponding to the target user.
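The computation of the behavior vector mean value set out in the steps above can be sketched as follows. The function name and the example numbers are hypothetical, and the sketch assumes that the i-th entry of the evaluation operation weight array belongs to the i-th evaluation business object.

```python
def behavior_vector_mean(browsed_vectors, browsed_weight,
                         evaluated_vectors, evaluation_weights):
    """Weighted sum of all vectors divided by the number of vectors."""
    # First vectors: browsed object attribute vectors times the browsed weight value.
    first = [[browsed_weight * x for x in vec] for vec in browsed_vectors]
    # Second vectors: evaluated object attribute vectors times their own weight.
    second = [[w * x for x in vec]
              for vec, w in zip(evaluated_vectors, evaluation_weights)]

    all_vectors = first + second
    if not all_vectors:
        raise ValueError("the user has no browsed or evaluated objects")

    target_vector_count = len(all_vectors)
    dim = len(all_vectors[0])
    target_vector = [sum(vec[i] for vec in all_vectors) for i in range(dim)]
    return [t / target_vector_count for t in target_vector]


mean = behavior_vector_mean(
    browsed_vectors=[[2.0, 4.0]],
    browsed_weight=0.5,
    evaluated_vectors=[[1.0, 0.0]],
    evaluation_weights=[3.0],
)
print(mean)  # [2.0, 1.0]
```

In this example the first vector is [1.0, 2.0], the second vector is [3.0, 0.0], their sum is [4.0, 2.0], and dividing by the two vectors gives [2.0, 1.0].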

Wherein, the determining a target business object for the target user according to the behavior vector mean corresponding to the target user and the object attribute vector corresponding to each business object in the business object set, and recommending the target business object to the target user includes:

respectively obtaining a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean value corresponding to the target user;

and determining the business object whose object attribute vector has the minimum vector distance to the behavior vector mean value corresponding to the target user as the target business object corresponding to the target user, and recommending the target business object to the target user.
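As an illustration of selecting the target business object by minimum vector distance, the sketch below uses the Euclidean distance. The choice of distance measure is an assumption, since the text only speaks of a "vector distance", and the function names and example vectors are hypothetical.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend_nearest(object_vectors, user_mean):
    """Return the identifier whose attribute vector is closest to the user mean."""
    return min(object_vectors,
               key=lambda oid: euclidean(object_vectors[oid], user_mean))


object_vectors = {
    "item_a": [0.0, 0.0],
    "item_b": [2.0, 1.0],
    "item_c": [5.0, 5.0],
}
target_object = recommend_nearest(object_vectors, user_mean=[2.0, 1.2])
print(target_object)  # item_b
```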

Wherein, the method further includes:

determining a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean value corresponding to the target user as a cross feature;

and training a recommendation model based on the cross features, wherein the recommendation model is used for recommending the business object for the target user.

Wherein, the method further includes:

acquiring a behavior vector mean value corresponding to the target user, and acquiring a behavior vector mean value corresponding to a user to be matched;

if the vector distance between the behavior vector mean value corresponding to the target user and the behavior vector mean value corresponding to the user to be matched is smaller than a first vector distance threshold value, determining that the target user and the user to be matched have user similarity;

and if the target user and the user to be matched have the user similarity, recommending the service object to the target user according to the historical browsing record of the user to be matched for the service object.
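The user similarity check and the resulting recommendation might be sketched as below. The Euclidean distance, the threshold value, and the decision to recommend only objects the target user has not yet browsed are assumptions of this example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend_from_similar_user(target_mean, other_mean, other_history,
                                target_history, threshold):
    """Recommend from the other user's history if the two users are similar.

    Assumption: objects already browsed by the target user are filtered out.
    """
    if euclidean(target_mean, other_mean) >= threshold:
        return []  # no user similarity, so nothing is recommended
    seen = set(target_history)
    return [oid for oid in other_history if oid not in seen]


recommended = recommend_from_similar_user(
    target_mean=[1.0, 1.0],
    other_mean=[1.1, 0.9],
    other_history=["item_a", "item_b", "item_c"],
    target_history=["item_a"],
    threshold=0.5,  # hypothetical first vector distance threshold
)
print(recommended)  # ['item_b', 'item_c']
```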

The business objects in the business object set comprise a third business object and a fourth business object; further comprising:

acquiring an object attribute vector corresponding to the third service object, and acquiring an object attribute vector corresponding to the fourth service object;

if the vector distance between the object attribute vector corresponding to the third business object and the object attribute vector corresponding to the fourth business object is smaller than a second vector distance threshold value, determining that the third business object and the fourth business object have object similarity;

and if the third business object and the fourth business object have the object similarity and the historical browsing records of the target user for the business objects comprise the third business object, recommending the fourth business object to the target user.
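The object similarity rule can be illustrated in the same spirit: an unbrowsed object is recommended when its object attribute vector lies within the second vector distance threshold of some object in the target user's historical browsing records. The Euclidean distance, the threshold value and the identifiers are again assumptions made for the example.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def recommend_similar_objects(object_vectors, history, threshold):
    """Return unbrowsed objects that are similar to at least one browsed object."""
    browsed = [oid for oid in history if oid in object_vectors]
    recommendations = []
    for oid, vec in object_vectors.items():
        if oid in browsed:
            continue
        if any(euclidean(vec, object_vectors[b]) < threshold for b in browsed):
            recommendations.append(oid)
    return recommendations


object_vectors = {
    "third_object": [0.0, 0.0],
    "fourth_object": [0.1, 0.0],
    "other_object": [3.0, 3.0],
}
similar = recommend_similar_objects(
    object_vectors,
    history=["third_object"],
    threshold=0.5,  # hypothetical second vector distance threshold
)
print(similar)  # ['fourth_object']
```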

One aspect of the present application provides a data processing apparatus, including:

a first obtaining module, configured to obtain a business object set, where the business object set includes a plurality of business objects;

a second obtaining module, configured to obtain browsing statuses of the target user for the multiple service objects, and determine a target positive sample set and a target negative sample set corresponding to the target user according to the browsing statuses and the service object sets;

a third obtaining module, configured to obtain a user behavior set corresponding to the target user, where the user behavior set includes evaluation operation behaviors of the target user for the plurality of service objects;

a fourth obtaining module, configured to obtain, in the user behavior set, an auxiliary positive sample set and an auxiliary negative sample set corresponding to the target user according to an evaluation type of the evaluation operation behavior;

and the generating module is used for generating an object attribute vector corresponding to each business object in the business object set respectively based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set and the word vector model.

Wherein the browsing state includes a browsed state and a non-browsing state; the second obtaining module includes:

a first generating unit, configured to generate the target positive sample set according to an object identifier corresponding to the business object whose browsing status is the browsed status;

and a second generating unit, configured to generate the target negative sample set according to the object identifier corresponding to the service object whose browsing status is the non-browsing status.

Wherein the first generation unit includes:

a time obtaining subunit, configured to obtain browsing timestamps corresponding to each service object in the browsed state, and determine a service object of the browsing timestamp in a target time period as a positive sample service object, where one positive sample service object corresponds to at least one browsing timestamp;

a first adding subunit, configured to generate a positive sample sequence according to at least one browsing timestamp and an object identifier that correspond to each positive sample business object, and add the positive sample sequence to the target positive sample set, where the positive sample sequence includes an object identifier that corresponds to each positive sample business object.

Wherein the second generating unit includes:

a multiple obtaining subunit, configured to determine the number of objects of the business object in the positive sample sequence as a target number, and obtain a negative sample extraction multiple for the target number;

an extracting subunit, configured to extract, according to the target number and the negative sample extraction multiple, service objects as negative sample service objects from the service objects whose browsing state is the non-browsing state, where the number of the negative sample service objects is equal to a product of the target number and the negative sample extraction multiple;

and a second adding subunit, configured to add the object identifier corresponding to the negative sample service object to the target negative sample set.

Wherein the fourth obtaining module includes:

a first adding unit, configured to determine, as a first object operation sample, an object operation sample that includes an evaluation operation behavior with the positive evaluation type in the user behavior set, and add the first object operation sample to the auxiliary positive sample set;

and the second adding unit is used for determining an object operation sample containing the evaluation operation behavior with the negative evaluation type in the user behavior set as a second object operation sample, and adding the second object operation sample to the auxiliary negative sample set.

Wherein the generation module includes:

a first identifier obtaining unit, configured to obtain an object identifier s_j in the target positive sample set, where j is a positive integer less than or equal to N, and N is the number of object identifiers in the target positive sample set;

a second identifier obtaining unit, configured to obtain, in the target positive sample set and based on a traversal window with a target step size, the neighbor object identifier corresponding to the object identifier s_j;

an updating unit, configured to update, in the word vector model, the initial vector corresponding to each business object in the business object set based on the object identifier s_j, the neighbor object identifier, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set;

and the vector determining unit is used for respectively determining the initial vector updated in the word vector model as an object attribute vector corresponding to each service object in the service object set.

Wherein the updating unit includes:

a vector generation subunit, configured to generate initial vectors corresponding to each service object in the service object set based on a Gaussian distribution, and associate each initial vector with the object identifier of the corresponding service object;

a sample acquiring subunit, configured to acquire a first to-be-trained object identifier in the target negative sample set, acquire a first to-be-trained object operation sample in the auxiliary positive sample set, and acquire a second to-be-trained object operation sample in the auxiliary negative sample set;

the weight obtaining subunit is configured to obtain a first behavior weight value corresponding to the behavior identifier in the first to-be-trained object operation sample, and obtain a second behavior weight value corresponding to the behavior identifier in the second to-be-trained object operation sample;

a vector determination subunit, configured to determine the initial vectors respectively associated with the object identifier s_j, the neighbor object identifier, the first to-be-trained object identifier, the object identifier in the first to-be-trained object operation sample and the object identifier in the second to-be-trained object operation sample as initial vectors to be trained;

and the updating subunit is configured to update, in the word vector model, the initial vector corresponding to each service object in the service object set based on the initial vector to be trained, the first behavior weight value, and the second behavior weight value.

Wherein, the data processing device further comprises:

an object obtaining module, configured to obtain a browsed service object and an evaluation service object corresponding to a target user in the service object set, where a browsed user associated with the browsed service object includes the target user, an evaluation user associated with the evaluation service object includes the target user, and the evaluation user is a user who performs an evaluation operation on the service object;

the first determining module is used for determining a behavior vector mean value corresponding to the target user according to the browsed business object and the evaluation business object;

and the second determining module is used for determining a target business object aiming at the target user according to the behavior vector mean value corresponding to the target user and the object attribute vector corresponding to each business object in the business object set, and recommending the target business object to the target user.

Wherein the first determining module comprises:

the acquisition unit is used for acquiring an object attribute vector and a browsed weight value corresponding to the browsed service object, and acquiring an object attribute vector and an evaluation operation weight array corresponding to the evaluation service object;

a first determining unit, configured to determine an object attribute vector corresponding to the browsed service object as a first object attribute vector, and determine an object attribute vector corresponding to the evaluation service object as a second object attribute vector;

a first multiplication unit, configured to multiply each first object attribute vector with the browsed weight value, respectively, to obtain first vectors corresponding to each first object attribute vector;

the second multiplication unit is used for respectively multiplying each second object attribute vector by the corresponding weight value in the evaluation operation weight array to obtain a second vector corresponding to each second object attribute vector;

the summing unit is used for summing the first vector and the second vector to obtain a target vector, and summing the vector quantity of the first vector and the vector quantity of the second vector to obtain a target vector quantity;

and the second determining unit is used for determining the ratio of the target vector to the number of the target vectors as the behavior vector mean value corresponding to the target user.

Wherein the second determining module comprises:

a distance obtaining unit, configured to obtain a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean corresponding to the target user;

and a third determining unit, configured to determine the service object whose object attribute vector has the minimum vector distance to the behavior vector mean value corresponding to the target user as the target service object corresponding to the target user, and recommend the target service object to the target user.

Wherein, the data processing device further comprises:

a third determining module, configured to determine a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean corresponding to the target user as a cross feature;

and the training module is used for training a recommendation model based on the cross characteristics, and the recommendation model is used for recommending a business object for the target user.

Wherein, the data processing device further comprises:

the mean value obtaining module is used for obtaining the mean value of the behavior vectors corresponding to the target user and obtaining the mean value of the behavior vectors corresponding to the users to be matched;

a first similarity module, configured to determine that user similarity exists between the target user and the user to be matched if a vector distance between the behavior vector mean value corresponding to the target user and the behavior vector mean value corresponding to the user to be matched is smaller than a first vector distance threshold;

and the first recommending module is used for recommending the service object to the target user according to the historical browsing record of the to-be-matched user for the service object if the target user and the to-be-matched user have the user similarity.

The business objects in the business object set comprise a third business object and a fourth business object; the data processing apparatus further includes:

a vector obtaining module, configured to obtain an object attribute vector corresponding to the third service object, and obtain an object attribute vector corresponding to the fourth service object;

a second similarity module, configured to determine that there is object similarity between the third service object and the fourth service object if a vector distance between an object attribute vector corresponding to the third service object and an object attribute vector corresponding to the fourth service object is smaller than a second vector distance threshold;

and a second recommending module, configured to recommend the fourth business object to the target user if the third business object and the fourth business object have the object similarity and the historical browsing record of the target user for the business objects includes the third business object.

An aspect of the application provides a computer device comprising a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the method of the above aspect of the application.

An aspect of the application provides a computer-readable storage medium having stored thereon a computer program comprising program instructions which, when executed by a processor, cause the processor to perform the method of the above-mentioned aspect.

In the present application, a service object set comprising a plurality of business objects is first acquired; browsing states of the plurality of business objects are acquired, and a target positive sample set and a target negative sample set are acquired in the business object set according to the browsing states; a user behavior set corresponding to the service object set is acquired, the user behavior set comprising evaluation operation behaviors of a user group for the plurality of service objects; an auxiliary positive sample set and an auxiliary negative sample set are acquired in the user behavior set according to the evaluation type of the evaluation operation behavior; and an object attribute vector corresponding to each business object in the business object set is generated based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set and a word vector model. The method provided by the application can therefore generate the object attribute vectors of the business objects from the evaluation operation behaviors of the user group for the business objects, which enriches the ways in which object attribute vectors can be generated. In addition, in generating the object attribute vectors, both the browsing states of the business objects and the different types of evaluation operation behaviors performed by the user group on the business objects are taken into account, which improves the accuracy of the generated object attribute vectors.

Drawings

In order to illustrate the technical solutions in the present application or the prior art more clearly, the drawings needed for the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings based on these drawings without creative effort.

FIG. 1a is a schematic diagram of a system architecture provided herein;

FIG. 1b is a schematic diagram of a data recommendation scenario provided herein;

FIG. 2 is a schematic flow chart diagram of a data processing method provided herein;

FIG. 3 is a schematic flow chart diagram of another data processing method provided herein;

FIG. 4 is a schematic diagram of a scenario for acquiring a sample set provided herein;

FIG. 5 is a schematic diagram of another scenario provided herein for obtaining a sample set;

FIG. 6 is a schematic diagram of a sample selection scenario provided herein;

FIG. 7 is a schematic diagram of a scenario for obtaining a mean value of a behavior vector according to the present application;

FIG. 8 is a schematic diagram of a scenario for acquiring cross features provided in the present application;

FIG. 9 is a schematic view of a scene of a recommended service object provided in the present application;

FIG. 10 is a schematic diagram of another scenario for recommending a business object provided by the present application;

FIG. 11 is a schematic diagram of a data processing apparatus provided in the present application;

FIG. 12 is a schematic structural diagram of a computer device provided in the present application.

Detailed Description

The technical solutions in the present application will be described clearly and completely with reference to the accompanying drawings in the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

Please refer to fig. 1a, which is a schematic diagram of a system architecture provided in the present application. As shown in fig. 1a, the system architecture includes a server 100 and a plurality of terminal devices (specifically, a terminal device 200a, a terminal device 200b, and a terminal device 200c). The terminal device 200a, the terminal device 200b, and the terminal device 200c can each communicate with the server 100 through a network. A terminal device may be a mobile phone, a tablet computer, a notebook computer, a palmtop computer, a Mobile Internet Device (MID), or a wearable device (e.g., a smart watch or a smart band). The communication between the terminal device 200a and the server 100 is described here as an example.

Please refer to fig. 1b, which is a schematic view of a data recommendation scenario provided in the present application. As shown in fig. 1b, the business object set 101a may include a plurality of business objects, and specifically, the plurality of business objects may refer to two or more business objects. Here, it is illustrated that the service object set 101a includes 10 service objects (specifically, including a service object y1, a service object y2, a service object y3, a service object y4, a service object y5, a service object y6, a service object y7, a service object y8, a service object y9, and a service object y10), and the number of service objects in the service object set 101a may be determined according to an actual application scenario, which is not limited herein. Wherein, y1, y2, y3, y4, y5, y6, y7, y8, y9 and y10 are respectively object identifiers of corresponding business objects, and the object identifiers are used for uniquely representing the corresponding business objects. In different application scenarios, a business object may refer to different items. For example, in a clothing purchase scenario, business object set 101a may be a set of all clothing in sale, and then one business object may indicate any piece of clothing in sale. For another example, in a fund purchase scenario, the set of business objects 101a may be a set of all funds, and then a business object may refer to any one fund.

The target user m may click and browse any business object in the business object set 101a through the terminal device 200a that the user holds. The terminal device 200a may report the click behaviors of the target user m to the server 100, so that the server 100 knows which business objects in the business object set 101a the target user m has clicked and when, and which business objects the target user m has not clicked. As shown in fig. 1b, the server 100 may obtain a set 102a and a set 103a corresponding to the target user m. The set 102a includes the object identifiers y1, y2, y3 and y4, indicating that the target user m has clicked the business objects y1, y2, y3 and y4; that is, the set 102a consists of the object identifiers of all business objects clicked by the target user m. The object identifiers in the set 102a are arranged in order of the time at which the target user m clicked the corresponding business objects. Specifically, the target user m clicked the business object y1 earlier than the business object y2, the business object y2 earlier than the business object y3, and the business object y3 earlier than the business object y4. The set 103a includes the object identifiers y5, y6, y7, y8, y9 and y10, and here consists of the object identifiers of all business objects that the target user m has not clicked.
In an actual application scenario, the business objects clicked by the target user m are usually far fewer than the business objects not clicked. Therefore, the object identifiers of all business objects clicked by the target user m may be added to the set 102a, while the object identifiers of only some of the business objects not clicked by the target user m (which may be selected at random) are added to the set 103a. For example, the number of object identifiers in the set 103a may be set to 5 times the number of object identifiers in the set 102a (other suitable multiples may also be used). The business objects corresponding to the object identifiers in the set 102a can be used as positive sample business objects, so the set 102a may be referred to as a target positive sample set. Similarly, the business objects corresponding to the object identifiers in the set 103a can be used as negative sample business objects, so the set 103a may be referred to as a target negative sample set.
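The construction of the sets 102a and 103a described above can be sketched as follows. This is a minimal illustration rather than the implementation of the application: the function name `build_target_sets`, the `(object_id, click_time)` tuple layout, and the fixed random seed are assumptions made for the example; only the 5x ratio comes from the text.

```python
import random

def build_target_sets(all_ids, clicks, ratio=5, seed=0):
    """Build a target positive set and a target negative set for one user.

    clicks is a list of (object_id, click_time) tuples (assumed layout).
    """
    # Positive set: every clicked identifier, ordered by click time.
    positives = [oid for oid, _ in sorted(clicks, key=lambda c: c[1])]
    clicked = set(positives)
    unclicked = [oid for oid in all_ids if oid not in clicked]
    # Negative set: a random subset of the unclicked identifiers, at most
    # `ratio` times as many as the positives.
    k = min(len(unclicked), ratio * len(positives))
    negatives = random.Random(seed).sample(unclicked, k)
    return positives, negatives

all_ids = [f"y{i}" for i in range(1, 11)]
clicks = [("y3", 30), ("y1", 10), ("y4", 40), ("y2", 20)]
positives, negatives = build_target_sets(all_ids, clicks)
```

With only ten business objects, as in fig. 1b, the cap of 5 times the positives is not reached, so every unclicked identifier ends up in the negative set; the sampling only matters when the unclicked objects greatly outnumber the clicked ones.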

The target user m may also perform evaluation operation behaviors on the business objects in the business object set 101a through the terminal device 200a. The evaluation types of evaluation operation behaviors may include a positive evaluation type and a negative evaluation type. An evaluation operation behavior of the positive evaluation type is a behavior indicating that the target user m has a favorable impression of the business object (for example, collecting or forwarding the business object), and an evaluation operation behavior of the negative evaluation type is a behavior indicating that the target user m does not have a favorable impression of the business object (for example, giving negative feedback on or a bad review of the business object). The terminal device 200a may report the evaluation operation behaviors of the target user m to the server 100, so that the server 100 knows which evaluation operation behaviors the target user m performed on each business object in the business object set 101a and when they were performed. The server 100 may then obtain the set 104a and the set 105a. The set 104a includes an object operation sample 113a, an object operation sample 114a, and an object operation sample 115a. The object operation sample 113a is obtained after the target user m performs an evaluation operation behavior z1 (z1 being the behavior identifier of that evaluation operation behavior) on the business object y3; the object operation sample 113a therefore includes the object identifier y3 and the behavior identifier z1, that is, the association relationship between the object identifier y3 and the behavior identifier z1. 
The object operation sample 114a is obtained after the target user m performs the evaluation operation behavior z2 (z2 being the behavior identifier of that evaluation operation behavior) on the business object y5; the object operation sample 114a therefore includes the object identifier y5 and the behavior identifier z2, that is, the association relationship between the object identifier y5 and the behavior identifier z2. The object operation sample 115a is obtained after the target user m performs the evaluation operation behavior z3 (z3 being the behavior identifier of that evaluation operation behavior) on the business object y6; the object operation sample 115a therefore includes the object identifier y6 and the behavior identifier z3, that is, the association relationship between the object identifier y6 and the behavior identifier z3. The behavior identifiers z1, z2 and z3 in the object operation samples of the set 104a all correspond to evaluation operation behaviors of the positive evaluation type; that is, the object operation samples in the set 104a are collected after the target user m performs evaluation operation behaviors of the positive evaluation type on business objects. For example, each of the evaluation operation behaviors z1, z2 and z3 may be any one of a like behavior, a forwarding behavior, or a collection behavior. 
The set 105a includes an object operation sample 116a and an object operation sample 117a. The object operation sample 116a is obtained after the target user m performs an evaluation operation behavior z4 (z4 being the behavior identifier of that evaluation operation behavior) on the business object y4; the object operation sample 116a therefore includes the object identifier y4 and the behavior identifier z4, that is, the association relationship between the object identifier y4 and the behavior identifier z4. The object operation sample 117a is obtained after the target user m performs the evaluation operation behavior z5 (z5 being the behavior identifier of that evaluation operation behavior) on the business object y9; the object operation sample 117a therefore includes the object identifier y9 and the behavior identifier z5, that is, the association relationship between the object identifier y9 and the behavior identifier z5. The behavior identifiers z4 and z5 in the object operation samples of the set 105a both correspond to evaluation operation behaviors of the negative evaluation type; that is, the object operation samples in the set 105a are collected after the target user m performs evaluation operation behaviors of the negative evaluation type on business objects. For example, each of the evaluation operation behaviors z4 and z5 may be either a behavior in which the target user m clicks "not interested" on the business object or a behavior in which the target user m gives the business object a bad review. 
In addition, each behavior identifier corresponds to a training weight; that is, the behavior identifiers z1, z2, z3, z4 and z5 each correspond to a training weight, which can be set according to the actual application scenario. The training weight ranges from 0 to 1, and a larger training weight indicates that the corresponding evaluation operation behavior has a greater influence on model training. For example, when the evaluation operation behavior is a like behavior for a business object, the training weight corresponding to its behavior identifier may be set to 0.3, and when the evaluation operation behavior is a collection behavior for a business object, the training weight corresponding to its behavior identifier may be set to 0.7.
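One possible in-memory representation of an object operation sample and its training weight is sketched below. The class name `ObjectOperationSample`, the behavior names, and the table `TRAINING_WEIGHTS` are hypothetical; only the weights 0.3 (like) and 0.7 (collection) and the range 0 to 1 come from the text, and the remaining values are assumed for illustration.

```python
from dataclasses import dataclass

# Hypothetical weight table keyed by behavior identifier. Only the values
# for "like" (0.3) and "favorite" (0.7) come from the text; the others are
# assumed for illustration.
TRAINING_WEIGHTS = {
    "like": 0.3,
    "favorite": 0.7,
    "forward": 0.5,
    "not_interested": 0.6,
    "bad_review": 0.8,
}

@dataclass(frozen=True)
class ObjectOperationSample:
    """Association between an object identifier and a behavior identifier."""
    object_id: str
    behavior_id: str

def training_weight(sample, weights=TRAINING_WEIGHTS):
    """Look up the training weight of a sample and check it lies in [0, 1]."""
    w = weights[sample.behavior_id]
    if not 0.0 <= w <= 1.0:
        raise ValueError(f"training weight out of range: {w}")
    return w

# Samples shaped like 113a (positive type) and 116a (negative type).
sample_113a = ObjectOperationSample("y3", "like")
sample_116a = ObjectOperationSample("y4", "not_interested")
```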

The server 100 may train the word vector model 106a with the set 102a, the set 103a, the set 104a, and the set 105a obtained above. First, the server 100 may take an object identifier from the set 102a as a center object identifier (which may be any object identifier in the set 102a). Because the object identifiers in the set 102a are ordered, the server 100 can obtain the surrounding object identifiers of the center object identifier (i.e., the object identifiers adjacent to the center object identifier in the set 102a). The server 100 may then form identifier pairs for the center object identifier from the center object identifier and its surrounding object identifiers. For example, if the object identifier y2 is taken as the center object identifier in the set 102a, its surrounding object identifiers may be the object identifiers y1 and y3, and the resulting identifier pairs for the center object identifier y2 are (y2, y1) and (y2, y3). Next, the server 100 may randomly take an object identifier from the set 103a as an object identifier to be trained (one object identifier may be taken), and form another identifier pair for the center object identifier from it. For example, if the object identifier y5 is taken from the set 103a as the object identifier to be trained, the identifier pair (y2, y5) of the center object identifier y2 is obtained. The server 100 may also randomly take an object operation sample from the set 104a as a training object (one object operation sample may be taken), and form another identifier pair for the center object identifier from that training object. For example, if the object operation sample 113a is taken from the set 104a as the training object, the identifier pair (y2, y3) of the center object identifier y2 is obtained. 
In addition, the server 100 may obtain the training weight corresponding to the behavior identifier z1 in the object operation sample 113a. Similarly, the server 100 may randomly take an object operation sample from the set 105a as a training object (one object operation sample may be taken), and form another identifier pair for the center object identifier from that training object. For example, if the object operation sample 116a is taken from the set 105a as the training object, the identifier pair (y2, y4) of the center object identifier y2 is obtained, together with the training weight corresponding to the behavior identifier z4 in the object operation sample 116a.
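The pair construction described above can be sketched as follows. This is an illustrative sketch under stated assumptions: the function name `identifier_pairs`, the window size of 1, the tuple layout `(center, other, label, weight)`, the use of label 1 for positive pairs and 0 for negative pairs, the default weight 1.0 for click-derived pairs, and the concrete weight values are not specified in the text.

```python
import random

def identifier_pairs(center_idx, positives, negatives, aux_pos, aux_neg,
                     weights, window=1, rng=None):
    """Return (center, other, label, weight) tuples for one center identifier."""
    rng = rng or random.Random(0)
    center = positives[center_idx]
    pairs = []
    # Center identifier pairs: neighbours within `window` in the click order.
    lo = max(0, center_idx - window)
    hi = min(len(positives), center_idx + window + 1)
    for j in range(lo, hi):
        if j != center_idx:
            pairs.append((center, positives[j], 1, 1.0))
    # One randomly drawn identifier from the target negative sample set.
    pairs.append((center, rng.choice(negatives), 0, 1.0))
    # One auxiliary positive and one auxiliary negative sample, each carrying
    # the training weight of its behavior identifier.
    obj, beh = rng.choice(aux_pos)
    pairs.append((center, obj, 1, weights[beh]))
    obj, beh = rng.choice(aux_neg)
    pairs.append((center, obj, 0, weights[beh]))
    return pairs

positives = ["y1", "y2", "y3", "y4"]                # set 102a
negatives = ["y5", "y6", "y7", "y8", "y9", "y10"]   # set 103a
aux_pos = [("y3", "z1"), ("y5", "z2"), ("y6", "z3")]  # set 104a
aux_neg = [("y4", "z4"), ("y9", "z5")]                # set 105a
weights = {"z1": 0.3, "z2": 0.7, "z3": 0.5, "z4": 0.6, "z5": 0.8}  # assumed
pairs = identifier_pairs(1, positives, negatives, aux_pos, aux_neg, weights)
```

For the center object identifier y2 the first two tuples are always the center identifier pairs (y2, y1) and (y2, y3); the remaining three depend on the random draws.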

Each identifier pair obtained from the set 102a may be referred to as a center identifier pair. When every center identifier pair of a center object identifier has been trained, training for that center object identifier is complete; each center identifier pair is used to train the word vector model 106a once. For example, if the identifier pair (y2, y1) of the center object identifier y2 is selected for the first training, the server 100 may perform the first training of the word vector model 106a with the identifier pairs (y2, y1), (y2, y5), (y2, y3) and (y2, y4) and the training weights corresponding to the behavior identifiers z1 and z4. The server 100 may traverse the object identifiers in the set 102a and train the word vector model 106a with each of them in turn as the center object identifier. The training process is the same for every center object identifier, and training for the next center object identifier continues from the model state reached after the previous one. When all center object identifiers of the target user m have been trained, training for the target user m is complete. When there are a plurality of target users, each may be trained in the same manner as the target user m, and when all target users have been trained, the training of the word vector model 106a is complete. 
It should be noted that the training described above updates, within the word vector model 106a, the vector corresponding to each business object in the business object set (initially an initial vector, and after updating an updated vector). Specifically, the server 100 generates an initial vector for each business object in the business object set 101a and inputs these initial vectors into the word vector model 106a. Each initial vector is associated with the object identifier of its business object, so the vector can be looked up by object identifier. The word vector model 106a can therefore be trained with the identifier pairs obtained above, and the vector corresponding to each business object in the business object set is continuously updated during training.
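The text does not state the loss function or update rule used by the word vector model. As one plausible reading, the sketch below applies a logistic (skip-gram with negative sampling style) gradient step to a single table of vectors keyed by object identifier, scaling each step by the training weight of the pair. The function names, the single shared vector table, the learning rate, and the vector dimension are all assumptions made for illustration.

```python
import math
import random

def init_vectors(object_ids, dim=8, seed=0):
    """Generate a random initial vector for every object identifier."""
    rng = random.Random(seed)
    return {oid: [rng.uniform(-0.5, 0.5) for _ in range(dim)]
            for oid in object_ids}

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def train_pair(vectors, center, other, label, weight=1.0, lr=0.1):
    """One weighted gradient step on the log-likelihood of a single pair.

    label is 1 for a positive pair (vectors pulled together) and 0 for a
    negative pair (vectors pushed apart); weight scales the step size.
    """
    u, v = vectors[center], vectors[other]
    score = 1.0 / (1.0 + math.exp(-dot(u, v)))
    g = lr * weight * (label - score)
    # Both updates are computed from the old values of u and v.
    vectors[center] = [a + g * b for a, b in zip(u, v)]
    vectors[other] = [b + g * a for a, b in zip(u, v)]

vectors = init_vectors(["y1", "y2", "y3", "y4", "y5"])
train_pair(vectors, "y2", "y1", label=1)              # center identifier pair
train_pair(vectors, "y2", "y5", label=0)              # target negative sample
train_pair(vectors, "y2", "y3", label=1, weight=0.3)  # auxiliary positive
train_pair(vectors, "y2", "y4", label=0, weight=0.6)  # auxiliary negative
```

Under this reading, a larger training weight produces a larger update for the same pair, which matches the statement that a larger weight gives the behavior more influence on model training.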

The server 100 may output, through the trained word vector model 106a, an object attribute vector for each business object in the business object set 101a, where the object attribute vector is the updated vector of that business object in the word vector model 106a. The set 107a includes the object attribute vector of each business object in the business object set 101a. Specifically, vector c1 is the object attribute vector corresponding to the business object y1, vector c2 is the object attribute vector corresponding to the business object y2, vector c3 is the object attribute vector corresponding to the business object y3, vector c4 is the object attribute vector corresponding to the business object y4, vector c5 is the object attribute vector corresponding to the business object y5, vector c6 is the object attribute vector corresponding to the business object y6, vector c7 is the object attribute vector corresponding to the business object y7, vector c8 is the object attribute vector corresponding to the business object y8, vector c9 is the object attribute vector corresponding to the business object y9, and vector c10 is the object attribute vector corresponding to the business object y10.

The server 100 may calculate a behavior vector mean c11 for the target user m from the object attribute vectors of the business objects in the business object set 101a. The specific process is as follows: the object attribute vector of each business object clicked by the target user m and the object attribute vector of each business object on which the target user m performed an evaluation operation behavior are summed (the sum may be weighted, for example with one weight for clicks and a separate weight for each evaluation operation behavior). The server 100 then divides the summed vector by the total number of vectors summed and uses the result as the behavior vector mean c11 of the target user m. The vector distance between each object attribute vector in the set 107a and the behavior vector mean c11 may then be calculated to obtain a set 109a, which includes the vector distance between each object attribute vector and the behavior vector mean c11. Specifically, distance j1 is the vector distance corresponding to object attribute vector c1, distance j2 is the vector distance corresponding to object attribute vector c2, distance j3 is the vector distance corresponding to object attribute vector c3, distance j4 is the vector distance corresponding to object attribute vector c4, distance j5 is the vector distance corresponding to object attribute vector c5, distance j6 is the vector distance corresponding to object attribute vector c6, distance j7 is the vector distance corresponding to object attribute vector c7, distance j8 is the vector distance corresponding to object attribute vector c8, distance j9 is the vector distance corresponding to object attribute vector c9, and distance j10 is the vector distance corresponding to object attribute vector c10. 
The smaller the vector distance of an object attribute vector, the more likely the target user m is to be interested in the corresponding business object. The server 100 may compare the vector distances in the set 109a, take the business object with the smallest vector distance as the target business object (here the distance j9 is the smallest, so the target business object is the business object y9), and send the target business object to the terminal device 200a of the target user m. The terminal device 200a may recommend the target business object to the target user m on a "recommend for you" recommendation page. Taking "fund 1" as the target business object, the terminal device 200a may display an icon 111a of "fund 1" and a name 110a of "fund 1" on the "recommend for you" page. When the target user m clicks the icon 111a or the name 110a, the terminal device 200a may jump to the purchase page of "fund 1", which includes the icon of "fund 1", the fund type, the distributor, the organization form, and a purchase button 112a, thereby recommending "fund 1" to the target user m for purchase.
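The computation of the behavior vector mean and the selection of the target business object can be sketched as follows. The function names, the `(object_id, weight)` interaction layout, and the choice of Euclidean distance are assumptions; the text speaks only of a "vector distance". Following the text, the weighted sum is divided by the number of vectors summed.

```python
import math

def behavior_vector_mean(vectors, interactions):
    """Weighted sum of object attribute vectors divided by their count.

    interactions is a list of (object_id, weight) entries, one per click or
    evaluation operation behavior (assumed layout).
    """
    dim = len(next(iter(vectors.values())))
    total = [0.0] * dim
    for object_id, weight in interactions:
        for k, x in enumerate(vectors[object_id]):
            total[k] += weight * x
    return [x / len(interactions) for x in total]

def recommend(vectors, mean):
    """Return the object identifier whose vector is closest to the mean."""
    distances = {oid: math.dist(vec, mean) for oid, vec in vectors.items()}
    return min(distances, key=distances.get), distances

# Toy two-dimensional object attribute vectors (illustrative values only).
vectors = {"y1": [0.0, 0.0], "y2": [2.0, 0.0], "y3": [1.0, 0.1], "y4": [5.0, 5.0]}
mean = behavior_vector_mean(vectors, [("y1", 1.0), ("y2", 1.0)])
target, distances = recommend(vectors, mean)
```

In this toy example the mean of y1 and y2 is [1.0, 0.0], and y3 is the closest object, so it would be the target business object.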

By the method and the device, when the object attribute vector corresponding to the business object is obtained, not only the exposure of the business object (determined by the clicking action of the user) but also diversified behaviors (namely diversified evaluation operation behaviors, such as praise, forwarding, collecting, negative feedback and the like) of the user aiming at the business object and the influence degree of each behavior on model learning (determined by the training weight corresponding to the behavior identification of the evaluation operation behavior) are considered, so that the obtaining mode of the object attribute vector of the business object is enriched, and the obtained object attribute vector of the business object is more accurate.

Please refer to fig. 2, which is a schematic flow chart of a data processing method provided in the present application, and as shown in fig. 2, the method may include:

Step S101, acquiring a business object set; the business object set comprises a plurality of business objects;

Specifically, the server may obtain a business object set that includes a plurality of business objects. In different business scenarios the business objects may be different items, and an item may be a real item (such as clothing) or a virtual item (such as virtual currency). For example, in an apparel purchase scenario, the plurality of business objects may refer to a plurality of pieces of apparel; in a book reading scenario, the plurality of business objects may refer to a plurality of books; in a fund purchase scenario, the plurality of business objects may refer to a plurality of funds; in a virtual skin purchase scenario (e.g., skins of virtual characters in a game), the plurality of business objects may refer to a plurality of virtual skins. That is, a business object may be any item on which a user can perform operations (including browsing, clicking, liking, forwarding, collecting, clicking "not interested", giving feedback, and/or purchasing).

Step S102, acquiring browsing states of a target user aiming at the plurality of business objects, and determining a target positive sample set and a target negative sample set corresponding to the target user according to the browsing states and the business object sets;

specifically, the server may obtain a browsing status of the target user for each business object in the business object set, where the browsing status includes a browsed status and an unviewed status. The browsing state of the service object clicked by the target user through the terminal device is a browsed state (the clicked service object indicates browsed), and the browsing state of the service object not clicked by the target user through the terminal device is an unviewed state. The terminal device can respond to the click operation of the target user on the business object in the terminal interface to generate click information, and the click information also comprises the click time of the target user for the business object. The terminal may send the click information to the server, so that the server may know, through the click information, which service objects in the service object set have browsing states that are browsed (and browsing time of each service object whose browsing state is browsed, where the browsing time is the click time of the target user for the service object), and which service objects have browsing states that are not browsed. The server may use the business object in the browsed state corresponding to the target user as a positive sample, and use the business object in the unviewed state corresponding to the target user as a negative sample. And then the server can obtain a target positive sample set corresponding to the target user through the positive sample, and obtain a target negative sample set corresponding to the target user through the negative sample.

Step S103, acquiring a user behavior set corresponding to the target user; the user behavior set comprises evaluation operation behaviors of the target user for the plurality of business objects;

Specifically, the terminal device of the target user may generate evaluation information in response to the evaluation operation behaviors of the target user on business objects. The terminal device can send the evaluation information to the server, and from it the server can learn which evaluation operation behaviors the target user performed on each business object in the business object set, and thereby obtain the user behavior set of the target user for the business object set. The user behavior set includes the evaluation operation behaviors of the target user for the business objects in the business object set. An evaluation operation behavior is a behavior that characterizes, to some extent, the interest (positive or negative) of the target user in a business object. For example, an evaluation operation behavior may be forwarding, collecting, liking, or purchasing a business object, or clicking "not interested" on it; forwarding, collecting, liking and purchasing represent positive interest of the target user in the business object, and clicking "not interested" represents negative interest.

Step S104, according to the evaluation type of the evaluation operation behavior, acquiring an auxiliary positive sample set and an auxiliary negative sample set corresponding to the target user from the user behavior set;

Specifically, evaluation operation behaviors fall into two evaluation types. One is the positive evaluation type, covering behaviors that represent positive interest of the target user in a business object, such as liking, collecting, and forwarding. The other is the negative evaluation type, covering behaviors that represent negative interest of the target user in a business object, such as clicking "not interested". From the user behavior set, the server can take the evaluation operation behaviors of the positive evaluation type performed on business objects to form the auxiliary positive sample set, and the evaluation operation behaviors of the negative evaluation type performed on business objects to form the auxiliary negative sample set.
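Step S104 can be sketched as a simple partition of the user behavior set by evaluation type. The behavior names and the sets `POSITIVE_BEHAVIORS` and `NEGATIVE_BEHAVIORS` below are hypothetical labels chosen for the example; the text names the behaviors only in prose.

```python
# Hypothetical behavior identifiers grouped by evaluation type.
POSITIVE_BEHAVIORS = {"like", "favorite", "forward", "purchase"}
NEGATIVE_BEHAVIORS = {"not_interested", "bad_review"}

def split_user_behaviors(behaviors):
    """Partition (object_id, behavior_id) records by evaluation type."""
    aux_positive, aux_negative = [], []
    for object_id, behavior_id in behaviors:
        if behavior_id in POSITIVE_BEHAVIORS:
            aux_positive.append((object_id, behavior_id))
        elif behavior_id in NEGATIVE_BEHAVIORS:
            aux_negative.append((object_id, behavior_id))
        else:
            # Unknown behaviors are rejected rather than silently dropped.
            raise ValueError(f"unknown behavior identifier: {behavior_id}")
    return aux_positive, aux_negative

user_behavior_set = [
    ("y3", "like"), ("y4", "not_interested"), ("y5", "forward"),
    ("y6", "favorite"), ("y9", "bad_review"),
]
aux_positive, aux_negative = split_user_behaviors(user_behavior_set)
```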

Step S105, generating an object attribute vector corresponding to each business object in the business object set respectively based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set and a word vector model;

specifically, the server may train the word vector model through the obtained target positive sample set, target negative sample set, auxiliary positive sample set, and auxiliary negative sample set corresponding to the target user, and further output the object attribute vector corresponding to each service object in the service object set through the trained word vector model. The specific process of training the word vector model may refer to steps S205 to S207. When a plurality of target users exist, a target positive sample set, a target negative sample set, an auxiliary positive sample set and an auxiliary negative sample set corresponding to each target user can be respectively obtained, and then the word vector model can be sequentially trained through the target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set corresponding to each target user. The processes of obtaining the target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set corresponding to each target user are also mutually independent, that is, the target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set corresponding to a certain target user are only related to the target user and are not related to other target users. The training process of the word vector model for each target user is the same and independent, and the subsequent target user continues training on the basis of the training of the word vector model by the previous target user until all target users are trained, which indicates that the training of the word vector model is finished. The training sequence of the word vector model for each target user has no influence on the training result of the model, namely the training sequence among a plurality of target users is not limited.

Through the above process, the vectorization of each service object in the service object set is completed, and the vectorized representation (i.e. the object attribute vector) of each service object is obtained. The object attribute vector corresponding to each business object has the vectorization feature of each business object, and the vectorization feature is obtained through diversified user operations (including click behaviors and various evaluation operation behaviors) of the target user for the business object.

Referring to fig. 3, a schematic flow chart of another data processing method provided in the present application is shown, and as shown in fig. 3, the method may include:

Step S201, acquiring a business object set; the business object set comprises a plurality of business objects;

specifically, a specific implementation manner of step S201 may refer to the description of step S101 in the embodiment corresponding to fig. 2, and is not described herein again.

Step S202, acquiring browsing states of a target user for the plurality of business objects, and generating the target positive sample set according to the object identifiers corresponding to the business objects whose browsing state is the browsed state; generating the target negative sample set according to the object identifiers corresponding to the business objects whose browsing state is the non-browsed state;

Specifically, each service object in the service object set has an object identifier that uniquely represents it. The object identifier of a service object may be the name of the service object or a symbol string set for the service object. For how the server obtains the browsing states of the target user for the plurality of business objects in the business object set, refer to step S102 above.

The server may set an object identifier for each business object in the business object set, where each object identifier uniquely represents the corresponding business object, and one business object corresponds to one object identifier. The server may generate the target positive sample set according to the object identifiers corresponding to the business objects whose browsing state is the browsed state as follows: acquiring the browsing timestamps corresponding to each business object whose browsing state is the browsed state, and determining each business object that has a browsing timestamp within a target time period as a positive sample business object; generating a positive sample sequence according to the at least one browsing timestamp and the object identifier corresponding to each positive sample business object, and adding the positive sample sequence to the target positive sample set:

The server may obtain the browsing timestamps corresponding to each business object whose browsing state is the browsed state. One business object may correspond to one or more browsing timestamps, and a browsing timestamp of a business object is the time point at which the target user browsed (i.e., clicked) that business object. The server may set a time period for sample collection (i.e., the target time period) and regard each business object that has a browsing timestamp within the target time period as a positive sample business object. The server may then generate a positive sample sequence from the browsing timestamps of the positive sample business objects within the target time period and the corresponding object identifiers. For example, the target time period may be set to one month; which month is decided by the actual application scenario. When a business object corresponds to 3 browsing timestamps, of which 2 fall within the target time period and 1 does not, the 1 browsing timestamp outside the target time period is discarded and only the 2 browsing timestamps within it are used for generating the positive sample sequence.

For example, please refer to fig. 4, which is a schematic view of a scene for acquiring a sample set according to the present application. As shown in fig. 4, the business object set 102e includes business objects A, B, C, D, E, F, G, H, I, J, K and L, where A through L are the object identifiers of the corresponding business objects. Within the target time period, the target user sequentially clicks (i.e., with sequentially increasing browsing timestamps) business object A, business object B, business object C, business object A, business object C, business object D and business object E. Business objects A, B, C, D and E are therefore all positive sample business objects; business object A and business object C each correspond to 2 browsing timestamps within the target time period, and business objects B, D and E each correspond to 1. The positive sample sequence may be generated from the object identifiers of the positive sample business objects in order of increasing browsing timestamp (i.e., from the earliest click to the latest). The resulting positive sample sequence is the sequence 100e, namely A → B → C → A → C → D → E. The generated positive sample sequence may be added to the target positive sample set to obtain the target positive sample set 101e; that is, the object identifiers in the target positive sample set are ordered, and the order is the order in which the target user clicked the corresponding business objects.
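The construction of the positive sample sequence can be sketched as follows. This is a minimal illustration, not the application's implementation: the function name `build_positive_sequence`, the `(object_id, timestamp)` event format, the concrete dates, and the half-open target time period are all assumptions made for the example.

```python
from datetime import datetime

def build_positive_sequence(browse_events, period_start, period_end):
    """Return object identifiers ordered by browsing timestamp.

    browse_events: iterable of (object_id, timestamp) pairs for one user.
    The half-open interval [period_start, period_end) is an assumption.
    """
    # Keep only the browsing timestamps that fall inside the target period.
    in_period = [(ts, obj) for obj, ts in browse_events
                 if period_start <= ts < period_end]
    in_period.sort(key=lambda pair: pair[0])  # stable sort by timestamp only
    return [obj for _, obj in in_period]

# Clicks from the fig. 4 example, plus one click on A outside the period.
clicks = [("A", datetime(2019, 9, 30, 23, 0))]
clicks += [(obj, datetime(2019, 10, day, 12, 0))
           for day, obj in enumerate("ABCACDE", start=1)]

sequence = build_positive_sequence(
    clicks, datetime(2019, 10, 1), datetime(2019, 11, 1))
print(sequence)  # ['A', 'B', 'C', 'A', 'C', 'D', 'E']
```

The out-of-period click on A is discarded, while the two in-period clicks on A and on C each appear in the sequence at their own position.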

The server may generate the target negative sample set according to the object identifiers corresponding to the business objects whose browsing state is the non-browsed state as follows: determining the number of business objects in the positive sample sequence as a target number, and acquiring a negative sample extraction multiple for the target number; extracting, according to the target number and the negative sample extraction multiple, business objects from the business objects whose browsing state is the non-browsed state as negative sample business objects; adding the object identifiers corresponding to the negative sample business objects to the target negative sample set:

Generally, within the target time period, the number of business objects whose browsing state is the non-browsed state is far greater than the number of business objects whose browsing state is the browsed state. Therefore, when collecting negative sample business objects, the server may randomly extract them from all business objects that are in the non-browsed state within the target time period, such that the number of negative sample business objects is a given multiple (i.e., the negative sample extraction multiple, for example 5) of the number of positive sample business objects (i.e., the target number). For example, when the number of positive sample business objects is 20 and the negative sample extraction multiple is 5, 100 business objects may be randomly extracted from the business objects in the non-browsed state within the target time period as negative sample business objects. It can be understood that, when the number of business objects in the non-browsed state within the target time period is small, all of them may be taken as negative sample business objects. Optionally, the number of negative sample business objects randomly extracted from the business objects in the non-browsed state within the target time period may instead be a multiple (for example, 5 times) of the number of object identifiers in the target positive sample set. As shown in fig. 4, within the target time period, the business objects in the business object set 102e whose browsing state is the browsed state include business objects A, B, C, D and E, and those whose browsing state is the non-browsed state include business objects F, G, H, I, J, K and L. Business objects F through L may all be taken as negative sample business objects, and the target negative sample set 103e obtained from their object identifiers includes object identifiers F, G, H, I, J, K and L.
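The negative sample extraction described above can be sketched as follows. The function name `sample_negatives`, its parameters, and the use of a seeded `random.Random` are assumptions for illustration; the target number is taken here as the count of distinct positive sample business objects, and the optional variant in the text would use the length of the positive sample sequence instead.

```python
import random

def sample_negatives(all_object_ids, positive_sequence, multiple=5, seed=None):
    """Randomly pick non-browsed objects as negative sample business objects."""
    rng = random.Random(seed)
    browsed = set(positive_sequence)
    # Objects never browsed in the target period; sorted for reproducibility.
    unbrowsed = sorted(oid for oid in all_object_ids if oid not in browsed)
    target_number = len(browsed)          # number of positive sample objects
    wanted = target_number * multiple     # negative sample extraction multiple
    if wanted >= len(unbrowsed):
        # Too few non-browsed objects: take all of them as negatives.
        return unbrowsed
    return rng.sample(unbrowsed, wanted)

# fig. 4 example: 5 positive objects x 5 = 25 wanted, but only 7 are available.
negatives = sample_negatives("ABCDEFGHIJKL", list("ABCACDE"))
print(negatives)  # ['F', 'G', 'H', 'I', 'J', 'K', 'L']
```

With a smaller multiple (for example `multiple=1`), the function returns a random subset of 5 of the 7 non-browsed identifiers instead of all of them.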

Step S203, acquiring a user behavior set corresponding to the target user; the user behavior set comprises evaluation operation behaviors of the target user for the plurality of business objects;

Specifically, for how the server acquires the evaluation operation behaviors of the target user for the business objects, refer to step S103 above. The server may obtain the user behavior set corresponding to the target user, which includes the evaluation operation behaviors of the target user for each service object in the service object set. The server may set a behavior identifier for each evaluation operation behavior; each behavior identifier uniquely represents the corresponding evaluation operation behavior, and one evaluation operation behavior corresponds to one behavior identifier. The user behavior set may include a plurality of object operation samples, where one object operation sample includes the object identifier of a service object and the behavior identifier of an evaluation operation behavior performed by the target user on that service object. For example, if the target user praises service object A, praise is an evaluation operation behavior; if the behavior identifier corresponding to praise is x1, the object operation sample formed from service object A and praise x1 includes the object identifier A and the behavior identifier x1, indicating that the target user performed evaluation operation behavior x1 on service object A. As another example, if the target user forwards service object A, forwarding is an evaluation operation behavior; if the behavior identifier corresponding to forwarding is x3, the corresponding object operation sample includes the object identifier A and the behavior identifier x3.
If the target user performs the same evaluation operation behavior on the same business object multiple times, each execution corresponds to one object operation sample. One business object may therefore correspond to multiple identical object operation samples that differ only in the time at which the target user performed the evaluation operation behavior. In general, each evaluation operation behavior performed by the target user on a business object corresponds to one object operation sample; the object identifiers and behavior identifiers of different object operation samples may be the same or different, while the execution times differ. For every object operation sample in the user behavior set acquired by the server, the evaluation operation behavior was performed by the target user within the target time period; that is, the samples in the target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set are all collected within the same target time period.

Step S204, determining an object operation sample containing an evaluation operation behavior with a positive evaluation type in the user behavior set as a first object operation sample, and adding the first object operation sample to the auxiliary positive sample set; determining an object operation sample containing the evaluation operation behavior with the negative evaluation type in the user behavior set as a second object operation sample, and adding the second object operation sample to the auxiliary negative sample set;

Specifically, as described in step S104 above, the evaluation types of evaluation operation behaviors include a positive evaluation type and a negative evaluation type. The server may therefore take each object operation sample in the user behavior set whose behavior identifier denotes an evaluation operation behavior of the positive evaluation type as a first object operation sample; for example, object operation samples whose behavior identifiers denote praise, forwarding or collection may be taken as first object operation samples. All first object operation samples may be added to the auxiliary positive sample set, that is, the auxiliary positive sample set consists of the object operation samples corresponding to evaluation operation behaviors of the positive evaluation type. Similarly, each object operation sample in the user behavior set whose behavior identifier denotes an evaluation operation behavior of the negative evaluation type may be taken as a second object operation sample; for example, object operation samples whose behavior identifiers denote clicking dislike, negative feedback and the like may be taken as second object operation samples. All second object operation samples may be added to the auxiliary negative sample set, that is, the auxiliary negative sample set consists of the object operation samples corresponding to evaluation operation behaviors of the negative evaluation type.

Please refer to fig. 5, which is a schematic view of another scenario for acquiring a sample set provided in the present application. As shown in fig. 5, A, B, D, H and I are object identifiers of business objects, and x1, x2, x3, x4 and x5 are behavior identifiers of evaluation operation behaviors. Behavior identifier x1 represents praise, x2 represents negative feedback, x3 represents forwarding, x4 represents collection, and x5 represents sharing. Behaviors 1 through 7 are behaviors of the target user within the target time period, listed in order of execution time: of the 7 behaviors, the target user performs behavior 1 first and behavior 7 last. Behavior 1 is praising service object A, behavior 2 is praising service object B, behavior 3 is giving negative feedback (e.g., a bad review) on service object D, behavior 4 is praising service object A, behavior 5 is forwarding service object A, behavior 6 is collecting service object H, and behavior 7 is sharing service object I.
Behavior 1 yields object operation sample 100f, which includes the object identifier A and the behavior identifier x1; behavior 2 yields object operation sample 101f, which includes the object identifier B and the behavior identifier x1; behavior 3 yields object operation sample 102f, which includes the object identifier D and the behavior identifier x2; behavior 4 yields object operation sample 103f, which includes the object identifier A and the behavior identifier x1; behavior 5 yields object operation sample 104f, which includes the object identifier A and the behavior identifier x3; behavior 6 yields object operation sample 105f, which includes the object identifier H and the behavior identifier x4; and behavior 7 yields object operation sample 106f, which includes the object identifier I and the behavior identifier x5. The set consisting of object operation samples 100f, 101f, 102f, 103f, 104f, 105f and 106f may be referred to as the user behavior set described above.
Praise, forwarding, collection and sharing are evaluation operation behaviors of the positive evaluation type, and negative feedback is an evaluation operation behavior of the negative evaluation type. Therefore, object operation samples 100f, 101f, 103f, 104f, 105f and 106f in the user behavior set may be taken as first object operation samples, and object operation sample 102f may be taken as a second object operation sample. In this way, an auxiliary positive sample set 107f consisting of the first object operation samples and an auxiliary negative sample set 108f consisting of the second object operation samples are obtained.
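The split of the user behavior set into the auxiliary positive and auxiliary negative sample sets can be sketched with the fig. 5 data. The type `ObjectOperationSample`, the two behavior-type tables and the function `split_behavior_set` are hypothetical names introduced only for this example.

```python
from typing import NamedTuple

class ObjectOperationSample(NamedTuple):
    object_id: str    # object identifier of the business object
    behavior_id: str  # behavior identifier of the evaluation operation

# Evaluation types from the fig. 5 example (x1 praise, x2 negative feedback,
# x3 forwarding, x4 collection, x5 sharing).
POSITIVE_BEHAVIORS = {"x1", "x3", "x4", "x5"}
NEGATIVE_BEHAVIORS = {"x2"}

def split_behavior_set(samples):
    """Return (auxiliary positive sample set, auxiliary negative sample set)."""
    aux_pos = [s for s in samples if s.behavior_id in POSITIVE_BEHAVIORS]
    aux_neg = [s for s in samples if s.behavior_id in NEGATIVE_BEHAVIORS]
    return aux_pos, aux_neg

# Behaviors 1 to 7 of fig. 5, in execution order (samples 100f to 106f).
user_behavior_set = [ObjectOperationSample(o, b) for o, b in [
    ("A", "x1"), ("B", "x1"), ("D", "x2"), ("A", "x1"),
    ("A", "x3"), ("H", "x4"), ("I", "x5")]]

aux_pos, aux_neg = split_behavior_set(user_behavior_set)
print(len(aux_pos), aux_neg)
```

Repeated behaviors, such as the two praise operations on A, are kept as separate samples, matching the statement that each execution corresponds to one object operation sample.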

Step S205, obtaining an object identifier s_j in the target positive sample set, where j is a positive integer less than or equal to N, and N is the number of object identifiers in the target positive sample set;

Specifically, the server may obtain an object identifier s_j in the target positive sample set corresponding to the target user as the center object identifier, where j is a positive integer less than or equal to N (i.e., 1 ≤ j ≤ N), N is the number of object identifiers in the target positive sample set, and s_j may be any object identifier in the target positive sample set. In practice, all object identifiers in the target positive sample set are traversed, and each is selected in turn as the center object identifier. The following description takes one selected object identifier s_j as the center object identifier as an example.

Step S206, obtaining, based on a traversal window with a target step size, the neighbor object identifiers corresponding to the object identifier s_j in the target positive sample set;

Specifically, the server may obtain the neighbor object identifiers corresponding to the object identifier s_j in the target positive sample set through a traversal window with a target step size. The object identifiers in the target positive sample set are ordered, that is, they actually form a sequence. The neighbor object identifiers corresponding to s_j are the object identifiers surrounding s_j when s_j is taken as the center, so s_j may also be referred to as the center object identifier. The target step size determines the number of neighbor object identifiers obtained for s_j. For example, when the target step size is 1, the traversal window takes the 1 object identifier to the left of s_j and the 1 object identifier to the right of s_j in the target positive sample set as the neighbor object identifiers of s_j; when the target step size is 2, the traversal window takes the 2 object identifiers to the left of s_j and the 2 object identifiers to the right of s_j as the neighbor object identifiers of s_j. By moving the traversal window, the neighbor object identifiers of every object identifier in the target positive sample set can be obtained.
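The traversal window can be sketched as a function that emits (center, neighbor) identifier pairs. The name `neighbor_pairs` and the truncation of the window at both ends of the sequence are assumptions for illustration.

```python
def neighbor_pairs(sequence, step):
    """Yield (center, neighbor) pairs using a window of `step` on each side."""
    pairs = []
    for j, center in enumerate(sequence):
        # The window is cut off at the sequence boundaries.
        lo = max(0, j - step)
        hi = min(len(sequence), j + step + 1)
        for k in range(lo, hi):
            if k != j:  # the center itself is not its own neighbor
                pairs.append((center, sequence[k]))
    return pairs

pairs = neighbor_pairs(["A", "B", "C", "D"], step=1)
print(pairs)
# [('A', 'B'), ('B', 'A'), ('B', 'C'), ('C', 'B'), ('C', 'D'), ('D', 'C')]
```

With a target step size of 1, the first and last identifiers each have a single neighbor, while interior identifiers have two.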

Step S207, updating, in the word vector model, the initial vector corresponding to each business object in the business object set based on the object identifier s_j, the neighbor object identifiers, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set;

Specifically, the server may randomly generate, based on a Gaussian distribution, an initial vector corresponding to each service object in the service object set, and associate each initial vector with the object identifier of the corresponding service object, so that the initial vector of a service object can be obtained through its object identifier. In the following, training an object identifier actually means training the initial vector associated with that object identifier.

Multiple neighbor object identifiers may be obtained for the object identifier s_j. One of them is first selected for training. It can be understood that the server trains on every neighbor object identifier; the order in which the neighbor object identifiers are trained has no influence on the training result, the training procedure is the same for each of them, and training on the next neighbor object identifier continues from the result of training on the previous one. After selecting a neighbor object identifier for training the word vector model, the server randomly obtains one or more object identifiers from the target negative sample set as first object identifiers to be trained (the number is determined by the actual application scenario and is not limited here; for example, 5 object identifiers are selected). The server then randomly selects one or more object operation samples from the auxiliary positive sample set as first to-be-trained object operation samples, and one or more object operation samples from the auxiliary negative sample set as second to-be-trained object operation samples (in each case the number is determined by the actual application scenario and is not limited here; for example, 5 are selected). The server may set a different training weight for each evaluation operation behavior; the larger the training weight, the greater the influence of the corresponding evaluation operation behavior on the training result of the word vector model.
The server may associate the behavior identifier of each evaluation operation behavior with the corresponding training weight, so that the training weight can be obtained through the behavior identifier. For example, a training weight of 0.3 is set for praise, and a training weight of 0.5 is set for collection. The server may obtain the first behavior weight value corresponding to the behavior identifier in each first to-be-trained object operation sample (i.e., the training weight associated with that behavior identifier). Similarly, the server may obtain the second behavior weight value corresponding to the behavior identifier in each second to-be-trained object operation sample (i.e., the training weight associated with that behavior identifier). The server may take the initial vectors respectively associated with the object identifier s_j, the selected neighbor object identifier, each first object identifier to be trained, the object identifier in each first to-be-trained object operation sample, and the object identifier in each second to-be-trained object operation sample as initial vectors to be trained.

The initial vectors to be trained that correspond to the selected neighbor object identifier, the first object identifiers to be trained, the object identifier in each first to-be-trained object operation sample, and the object identifier in each second to-be-trained object operation sample can each form a vector pair with the initial vector to be trained corresponding to the object identifier s_j. The word vector model can be trained with all the vector pairs, all the first behavior weight values and all the second behavior weight values, that is, the initial vector corresponding to each business object in the business object set is updated in the word vector model. Formula (1) below is the objective function of the word vector model provided in the present application; all the vector pairs obtained above may be substituted into this objective function to complete the update of the corresponding initial vectors.

Formula (1):

In formula (1), v_c denotes the initial vector corresponding to the center object identifier (i.e., the selected object identifier s_j above), D_p denotes the target positive sample set, D_n denotes the target negative sample set, D_mp denotes the auxiliary positive sample set, and D_mn denotes the auxiliary negative sample set. Formula (1) contains 4 terms in total, one per summation symbol Σ: the 1st term is the summation over the target positive sample set, the 2nd term is the summation over the target negative sample set, the 3rd term is the summation over the auxiliary positive sample set, and the 4th term is the summation over the auxiliary negative sample set. The vector pairs mentioned above are the pairs formed by v_c and v_l in formula (1); each vector pair contains two vectors. The vector pair substituted into the 1st term of formula (1) consists of the initial vector v_c corresponding to the object identifier s_j and the initial vector v_l corresponding to the selected neighbor object identifier; that is, v_l in the 1st term is the initial vector corresponding to the neighbor object identifier selected from the target positive sample set D_p. The vector pair substituted into the 2nd term of formula (1) consists of the initial vector v_c corresponding to the object identifier s_j and the initial vector v_l corresponding to a first object identifier to be trained; that is, v_l in the 2nd term is the initial vector corresponding to a first object identifier to be trained selected from the target negative sample set D_n. When there are multiple first object identifiers to be trained, the initial vector corresponding to each of them forms a vector pair with the initial vector v_c corresponding to s_j, and the vector pair corresponding to each first object identifier to be trained is substituted into the 2nd term of formula (1) and the results are summed.
The vector pair substituted into the 3rd term of formula (1) consists of the initial vector v_c corresponding to the object identifier s_j and the initial vector v_l corresponding to the object identifier in a first to-be-trained object operation sample; that is, v_l in the 3rd term is the initial vector corresponding to the object identifier in a first to-be-trained object operation sample selected from the auxiliary positive sample set D_mp. When there are multiple first to-be-trained object operation samples, the initial vector corresponding to the object identifier in each of them forms a vector pair with the initial vector v_c corresponding to s_j. The behavior identifier in each first to-be-trained object operation sample corresponds to one w_mp, where w_mp is the first behavior weight value, that is, the training weight associated with that behavior identifier. The vector pair and the first behavior weight value corresponding to each first to-be-trained object operation sample are substituted into the 3rd term of formula (1) and the results are summed. The vector pair substituted into the 4th term of formula (1) consists of the initial vector v_c corresponding to the object identifier s_j and the initial vector v_l corresponding to the object identifier in a second to-be-trained object operation sample; that is, v_l in the 4th term is the initial vector corresponding to the object identifier in a second to-be-trained object operation sample selected from the auxiliary negative sample set D_mn. When there are multiple second to-be-trained object operation samples, the initial vector corresponding to the object identifier in each of them forms a vector pair with the initial vector v_c corresponding to s_j.
The behavior identifier in each second to-be-trained object operation sample corresponds to one w_mn, where w_mn is the second behavior weight value, that is, the training weight associated with that behavior identifier. The vector pair and the second behavior weight value corresponding to each second to-be-trained object operation sample are substituted into the 4th term of formula (1) and the results are summed.
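Formula (1) itself is not reproduced in this text. One form consistent with the description above (four summation terms over D_p, D_n, D_mp and D_mn, with w_mp and w_mn weighting the 3rd and 4th terms) is sketched below. The log-sigmoid terms are borrowed from skip-gram with negative sampling and are an assumption; the text does not confirm the functional form of the individual terms, and θ here simply stands for the set of vectors being learned.

```latex
\underset{\theta}{\arg\max}\;
  \sum_{(c,l)\in D_{p}} \log \sigma\!\left(v_l^{\top} v_c\right)
+ \sum_{(c,l)\in D_{n}} \log \sigma\!\left(-v_l^{\top} v_c\right)
+ \sum_{(c,l)\in D_{mp}} w_{mp}\, \log \sigma\!\left(v_l^{\top} v_c\right)
+ \sum_{(c,l)\in D_{mn}} w_{mn}\, \log \sigma\!\left(-v_l^{\top} v_c\right),
\qquad \sigma(x) = \frac{1}{1+e^{-x}}
```

Under this assumed form, the positive terms pull v_l toward v_c and the negative terms push them apart, with the auxiliary terms scaled by the training weight of the evaluation operation behavior involved.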

Through the above process, training on the 1st neighbor object identifier of the object identifier s_j is completed. The other neighbor object identifiers of s_j can then be trained by the same process until all neighbor object identifiers of s_j have been trained, which indicates that training for this object identifier s_j is complete. Similarly, every other object identifier in the target positive sample set can be taken as the object identifier s_j and trained in the same way, until each object identifier in the target positive sample set has served as s_j (i.e., the center object identifier), which indicates that training for this target user is complete. If there are multiple target users, each target user can be trained in the same way; the training order among the target users does not affect the training result, and when all target users have been trained, training of the word vector model is complete, that is, the word vector model has finished updating the initial vector corresponding to each service object in the service object set. It should be noted that, during training, the next neighbor object identifier is trained on the basis of the training result of the previous neighbor object identifier, the next center object identifier is trained on the basis of the training result of the previous center object identifier, and the next target user is trained on the basis of the training result of the previous target user.
Training here refers to updating, in the word vector model, the vectors corresponding to the business objects in the business object set. The first update is applied to the initial vectors; once they have been updated, subsequent training continues to update the already updated vectors.
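The nested training order described above (center object identifier, then neighbor, then sampled negatives and auxiliary samples) can be sketched as follows. This is an illustration under stated assumptions rather than the application's implementation: the gradient step assumes log-sigmoid terms as in skip-gram with negative sampling, and the names `update_pair` and `train_user`, the learning rate, the vector dimension, the sample count `k` and all weights other than 0.3 and 0.5 are hypothetical.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def update_pair(vectors, center, other, label, weight, lr):
    """One gradient-ascent step on weight * log sigmoid(+/- v_c . v_l).

    label is 1 for positive pairs and 0 for negative pairs (assumed form).
    """
    v_c, v_l = vectors[center], vectors[other]
    score = sum(a * b for a, b in zip(v_c, v_l))
    grad = weight * lr * (label - sigmoid(score))
    new_c = [a + grad * b for a, b in zip(v_c, v_l)]
    new_l = [b + grad * a for a, b in zip(v_c, v_l)]
    vectors[center], vectors[other] = new_c, new_l

def train_user(vectors, positive_sequence, negatives, aux_pos, aux_neg,
               behavior_weight, step=1, k=1, lr=0.05, rng=None):
    """Train on one target user's four sample sets, updating vectors in place."""
    rng = rng or random.Random(0)
    n = len(positive_sequence)
    for j, center in enumerate(positive_sequence):
        for i in range(max(0, j - step), min(n, j + step + 1)):
            if i == j:
                continue
            # Term 1: center with the selected neighbor.
            update_pair(vectors, center, positive_sequence[i], 1, 1.0, lr)
            # Term 2: center with randomly drawn target negatives.
            for neg in rng.sample(negatives, min(k, len(negatives))):
                update_pair(vectors, center, neg, 0, 1.0, lr)
            # Term 3: weighted auxiliary positive samples.
            for obj, beh in rng.sample(aux_pos, min(k, len(aux_pos))):
                update_pair(vectors, center, obj, 1, behavior_weight[beh], lr)
            # Term 4: weighted auxiliary negative samples.
            for obj, beh in rng.sample(aux_neg, min(k, len(aux_neg))):
                update_pair(vectors, center, obj, 0, behavior_weight[beh], lr)

rng = random.Random(0)
object_ids = list("ABCDEFGH")
# Gaussian-initialized vectors, one per business object.
vectors = {oid: [rng.gauss(0.0, 0.1) for _ in range(8)] for oid in object_ids}
initial = {oid: list(v) for oid, v in vectors.items()}
weights = {"P1": 0.3, "P2": 0.5, "P3": 0.3, "P4": 0.4, "P5": 0.4}
train_user(vectors, list("ABCD"), list("EFGH"),
           [("C", "P1"), ("D", "P2"), ("G", "P3")],
           [("A", "P4"), ("B", "P5")], weights, rng=rng)
```

Calling `train_user` once per target user on the same `vectors` dictionary reproduces the stated behavior that each user continues from the result left by the previous one.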

Please refer to fig. 6, which is a schematic view of a sample selection scenario provided in the present application. As shown in fig. 6, assume that the set 100h is the target positive sample set corresponding to the target user R, and that A, B, C and D are all object identifiers of the corresponding business objects; that is, the set 100h contains, in order, object identifier A, object identifier B, object identifier C and object identifier D. The window 101h is a traversal window with a target step size of 1. In step 1 there is no object identifier to the left of the center object identifier A (object identifier A is the 1st object identifier in the set 100h), and in step 4 there is no object identifier to the right of the center object identifier D (object identifier D is the last object identifier in the set 100h), so only the right part of the window 101h exists in step 1 and only the left part exists in step 4. Again, the center object identifier is the object identifier s_j obtained from the target positive sample set, and each object identifier in the target positive sample set is selected as the center object identifier in turn. Object identifier A, object identifier B, object identifier C and object identifier D in the set 100h may be taken as the center object identifier in sequence, specifically as follows. Step 1: with object identifier A as the center object identifier, the neighbor object identifier obtained through the window 101h is object identifier B, which gives the identifier pair 102h (A, B).
It should be noted that, because the corresponding vector (the initial vector at first, and the updated vector after training) can be obtained through the object identifier, the identifier pairs obtained here are equivalent to the above-mentioned vector pairs; the 1st object identifier in an identifier pair is the center object identifier, and the 2nd object identifier is the object identifier paired with the center object identifier. Step 2: with the object identifier B as the center object identifier, the neighbor object identifiers of the center object identifier acquired through the window 101h are the object identifier A and the object identifier C, and thus the identifier pair 103h (B, A) and the identifier pair 104h (B, C) can be obtained. Step 3: with the object identifier C as the center object identifier, the neighbor object identifiers of the center object identifier acquired through the window 101h are the object identifier B and the object identifier D, and thus the identifier pair 105h (C, B) and the identifier pair 106h (C, D) can be obtained. Step 4: with the object identifier D as the center object identifier, the neighbor object identifier of the center object identifier acquired through the window 101h is the object identifier C, and thus the identifier pair 107h (D, C) can be obtained. Then each identifier pair obtained above can be trained; the training process is the same for every identifier pair, and each subsequent identifier pair continues to be trained on the basis of the training of the previous identifier pair. Here, the training of the identifier pair 102h (A, B) is taken as an example (i.e., the center object identifier is the object identifier A, and the selected neighbor object identifier is the object identifier B): the server may randomly obtain a first object identifier to be trained from the target negative sample set 108h (including the object identifier E, the object identifier F, the object identifier G and the object identifier H); if the object identifier E is obtained as the first object identifier to be trained, the identifier pair 115h (A, E) may be obtained. As shown in fig. 6, the auxiliary positive sample set 118h includes an object operation sample 110h (including an object identifier C and a behavior identifier P1), an object operation sample 111h (including an object identifier D and a behavior identifier P2) and an object operation sample 112h (including an object identifier G and a behavior identifier P3). The server may randomly obtain a first object operation sample to be trained from the auxiliary positive sample set 118h; here, the object operation sample 111h is obtained as the first object operation sample to be trained, so the identifier pair 116h (A, D) is obtained (the object identifier D in this identifier pair is the object identifier D in the object operation sample 111h), together with the first behavior weight value (i.e., a training weight) corresponding to the behavior identifier P2 in the object operation sample 111h. As shown in fig. 6, the auxiliary negative sample set 109h includes an object operation sample 113h (including an object identifier A and a behavior identifier P4) and an object operation sample 114h (including an object identifier B and a behavior identifier P5). 
The server may obtain a second object operation sample to be trained from the auxiliary negative sample set 109h; here, the object operation sample 114h is obtained as the second object operation sample to be trained, so the identifier pair 117h (A, B) is obtained (the object identifier B in this identifier pair is the object identifier B in the object operation sample 114h), together with the second behavior weight value (i.e., a training weight) corresponding to the behavior identifier P5 in the object operation sample 114h. Through the word vector model, the server may substitute into the objective function (i.e., formula (1)) the vectors (the initial vectors at first, and the updated vectors once updated) respectively corresponding to the identifier pair 102h (A, B), the identifier pair 115h (A, E), the identifier pair 116h (A, D) and the identifier pair 117h (A, B), together with the training weight corresponding to the behavior identifier P2 and the training weight corresponding to the behavior identifier P5, thereby completing the 1st training for the target user R. Specifically: the server may, through the word vector model, substitute the vector corresponding to the identifier pair 102h (A, B) into the 1st term of formula (1), substitute the vector corresponding to the identifier pair 115h (A, E) into the 2nd term of formula (1), substitute the vector corresponding to the identifier pair 116h (A, D) and the training weight corresponding to the behavior identifier P2 into the 3rd term of formula (1), and substitute the vector corresponding to the identifier pair 117h (A, B) and the training weight corresponding to the behavior identifier P5 into the 4th term of formula (1). 
At this point, the training for the identifier pair 102h (A, B) is completed, and the identifier pair 103h (B, A), the identifier pair 104h (B, C), the identifier pair 105h (C, B), the identifier pair 106h (C, D) and the identifier pair 107h (D, C) may continue to be trained in sequence in the same manner as the identifier pair 102h (A, B). When training has been completed for the identifier pair 102h (A, B), the identifier pair 103h (B, A), the identifier pair 104h (B, C), the identifier pair 105h (C, B), the identifier pair 106h (C, D) and the identifier pair 107h (D, C), the training for the target user R is completed. When there are multiple target users, each target user can be trained in sequence in the same manner as the target user R, with the training results superimposed. Again, the training mentioned in this application refers to updating, in the word vector model, the vector corresponding to each business object in the business object set (the initial vector at first, and the updated vector once updated).
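
The sample selection of fig. 6 can be illustrated with a minimal Python sketch. The function names `window_pairs` and `sample_training_tuple`, the dictionary layout and the numeric behavior weights are assumptions made for illustration only; the sketch only assembles the four identifier pairs that are substituted into the four terms of formula (1) and does not evaluate the objective function itself.

```python
import random

def window_pairs(positive_ids, step=1):
    """Return (center, neighbor) identifier pairs for a traversal window of the given step."""
    pairs = []
    for i, center in enumerate(positive_ids):
        lo = max(0, i - step)                       # the window is clipped at the left edge
        hi = min(len(positive_ids), i + step + 1)   # and at the right edge
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, positive_ids[j]))
    return pairs

def sample_training_tuple(center, neighbor, negative_ids, aux_pos, aux_neg, weights, rng=random):
    """Assemble the inputs for one training step on the pair (center, neighbor).

    aux_pos and aux_neg are lists of (object identifier, behavior identifier)
    object operation samples; weights maps a behavior identifier to its
    training weight (hypothetical values).
    """
    neg_id = rng.choice(negative_ids)       # first object identifier to be trained
    pos_obj, pos_beh = rng.choice(aux_pos)  # first object operation sample to be trained
    neg_obj, neg_beh = rng.choice(aux_neg)  # second object operation sample to be trained
    return {
        "term1": (center, neighbor),
        "term2": (center, neg_id),
        "term3": ((center, pos_obj), weights[pos_beh]),
        "term4": ((center, neg_obj), weights[neg_beh]),
    }

# Sets taken from fig. 6; the weight values are made up for this example.
pairs = window_pairs(["A", "B", "C", "D"], step=1)
weights = {"P1": 0.2, "P2": 0.3, "P3": 0.5, "P4": 0.3, "P5": 0.4}
first_step = sample_training_tuple(
    "A", "B",
    ["E", "F", "G", "H"],
    [("C", "P1"), ("D", "P2"), ("G", "P3")],
    [("A", "P4"), ("B", "P5")],
    weights,
)
```

With a step size of 1, `pairs` contains the six identifier pairs 102h to 107h in the order in which they are described above.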

When there are multiple target users, the objective function of the above word vector model may be formula (2):

formula (2):

where Q represents the total number of target users, and each target user corresponds to a target positive sample set, a target negative sample set, an auxiliary positive sample set and an auxiliary negative sample set. The target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set of a given target user relate only to that target user and not to other target users. The word vector model can be trained through the target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set respectively corresponding to each target user. The training effects of the target users are superimposed, that is, the training for the next target user is performed on the training result of the previous target user. The index y starts from 1 and is incremented by 1 each time until it reaches Q, which indicates that the training for the Q target users is completed.
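
As a notational sketch only, the accumulation over target users described above can be written as follows, where $L_y$ is a symbol introduced here (an assumption, not taken from the original) for the objective of formula (1) evaluated on the four sample sets of the y-th target user:

```latex
L = \sum_{y=1}^{Q} L_y
```

Because the terms are processed in the order y = 1, 2, ..., Q, the vectors updated while training on user y are the starting point for user y + 1.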

Optionally, a target user may also correspond to a plurality of target positive sample sets, a plurality of target negative sample sets, a plurality of auxiliary positive sample sets and a plurality of auxiliary negative sample sets. For example, a plurality of target time periods may be set, and 1 target positive sample set, 1 target negative sample set, 1 auxiliary positive sample set and 1 auxiliary negative sample set of the target user may be acquired within each target time period. The word vector model may be trained in sequence on the four sample sets acquired in each target time period, and the training process for each target time period is the same as the process described above. When the training on the target positive sample set, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set acquired in every target time period for a certain target user is completed, the training for that target user is completed.

Step S208, determining the updated initial vectors in the word vector model as the object attribute vectors respectively corresponding to the business objects in the business object set;

Specifically, after the word vector model is trained, the updated initial vector corresponding to each service object in the word vector model may be used as the object attribute vector corresponding to that service object, that is, each service object in the service object set corresponds to 1 object attribute vector. The object attribute vector is obtained by training on the user operations (including click browsing operations and evaluation operation behaviors) of the target user on the business objects. The object attribute vector of each service object is the final vector representation of that service object obtained by training the word vector model, i.e., the result of vectorizing the service object. The trained word vector model can output a vector matrix; the vector matrix comprises a plurality of rows of vectors, and each row represents the object attribute vector of one service object. The server may obtain the object attribute vector corresponding to each service object in the service object set from the vector matrix output by the word vector model.

The word vector model may be a Word2vec model. The Word2vec model can map each business object to a vector; the mapped vectors are multidimensional, and each dimension of a vector represents a feature of the business object on that dimension. In other words, the Word2vec model can distribute the various features of a business object over multiple dimensions for representation. Since the target positive sample set is a sequence of object identifiers generated according to the click sequence of the user, the training process of the Word2vec model takes into account the influence that the target user's consecutive clicks on multiple business objects have on the object attribute vectors to be generated. Specifically: the center object identifier and its neighbor object identifiers are obtained from the target user's consecutive clicks on business objects, so that the Word2vec model can predict the business objects that the target user is likely to click according to the center object identifier and its neighbor object identifiers. In addition, because the present application adds the identifier pairs formed by the center object identifier and the object identifiers in the auxiliary positive sample set to train the Word2vec model, the positive prediction of the Word2vec model in the prediction behavior is enhanced (that is, the prediction of the business objects that the target user is likely to click is positively strengthened). 
Meanwhile, the present application also adds the identifier pairs formed by the center object identifier and the object identifiers in the auxiliary negative sample set to train the Word2vec model, which enhances the negative prediction of the Word2vec model in the prediction behavior (that is, the prediction of the business objects that the target user is unlikely to click is negatively strengthened). Therefore, the mapped vector of each business object can be updated continuously, accurately and effectively through the prediction behavior, and the object attribute vector corresponding to each business object (i.e., the updated vector corresponding to that business object) is finally obtained.

The application provides a model learning method: when obtaining the object attribute vector corresponding to each service object in the service object set, not only conventional user operations such as the target user's clicking and browsing of service objects are considered, but also multi-sample evaluation operation behaviors of the user for the service objects (including praise, forwarding, comment, purchase, collection, negative feedback and the like). A different training weight is set for each evaluation operation behavior during model training, that is, the degree of influence of each evaluation operation behavior on model training is taken into account, so that the object attribute vector of each service object finally obtained through training (i.e., through model learning) is more accurate.

Furthermore, the following process is described for 1 target user. The server can obtain the browsed business objects and the evaluation business objects corresponding to the target user in the business object set. The browsing users associated with a browsed business object include the target user, that is, a browsed business object is a business object that the target user has clicked and browsed within the target time period, and the object identifier corresponding to that business object is in the target positive sample set corresponding to the target user. The evaluation users associated with an evaluation business object include the target user, that is, an evaluation business object is a business object on which the target user has executed an evaluation operation behavior within the target time period, and the object identifier corresponding to that business object is in the auxiliary positive sample set or the auxiliary negative sample set corresponding to the target user. The server can calculate the behavior vector mean corresponding to the target user from the obtained browsed business objects and evaluation business objects, where the behavior vector mean is the result of vectorizing the target user. 
Specifically: the server may obtain the object attribute vector and the browsed weight value corresponding to each browsed service object (the browsed weight value is the same for every browsed service object), and may also obtain the object attribute vector corresponding to each evaluation service object and the evaluation operation weight array corresponding to all the evaluation service objects. The evaluation operation weight array includes the weight value corresponding to each evaluation service object; the weight value of an evaluation service object in the evaluation operation weight array is determined by the type of the evaluation operation behavior executed on it, and 1 evaluation service object may correspond to multiple weight values. For example, if an evaluation service object is forwarded 1 time by the target user and receives negative feedback 1 time from the target user, the evaluation service object corresponds to 1 weight value for forwarding (a positive number, e.g., 0.3, because forwarding is an evaluation operation behavior of the positive evaluation type) and 1 weight value for negative feedback (a negative number, e.g., -0.3, because negative feedback is an evaluation operation behavior of the negative evaluation type). The object attribute vector corresponding to a browsed business object may be referred to as a first object attribute vector, and the object attribute vector corresponding to an evaluation business object may be referred to as a second object attribute vector. The server may multiply each first object attribute vector by the browsed weight value to obtain the first vector corresponding to that first object attribute vector, and may multiply each second object attribute vector by the weight value of the corresponding evaluation service object in the evaluation operation weight array to obtain the second vector corresponding to that second object attribute vector. 
It should be noted that, when the target user clicks and browses a certain service object 2 times within the target time period, the object attribute vector of the service object, weighted by the browsed weight value, needs to be superimposed 2 times, that is, the service object corresponds to 2 first vectors. Similarly, when the target user executes the same evaluation operation behavior on a certain service object 2 times within the target time period (for example, 2 praises), the object attribute vector of the service object, weighted by the corresponding weight value, needs to be superimposed 2 times, that is, the service object corresponds to 2 second vectors. Similarly, when the target user executes a first evaluation operation behavior 1 time (for example, 1 praise) and a second evaluation operation behavior 1 time (for example, 1 collection) on a certain service object within the target time period, the object attribute vector of the service object needs to be superimposed 1 time with the weight value corresponding to the first evaluation operation behavior and 1 time with the weight value corresponding to the second evaluation operation behavior. 1 business object may be both a browsed business object and an evaluation business object. Within the target time period, each click-and-browse of a business object corresponds to 1 first vector, and each execution of an evaluation operation behavior corresponds to 1 second vector. The server may sum all the obtained first vectors and all the obtained second vectors to obtain a target vector, and average the target vector to obtain the behavior vector mean corresponding to the target user. 
Averaging the target vector means dividing the target vector by the number of all the weighted vectors that were superimposed (i.e., the total number of first vectors and second vectors); since the object attribute vector corresponding to 1 service object may be superimposed multiple times, 1 service object may correspond to multiple superimposed vectors.

The process of obtaining the behavior vector mean corresponding to the target user can be expressed by the following formula (3):

$$v_{user} = \frac{1}{n}\sum_{i=1}^{n} w_i v_i$$

where $v_{user}$ represents the behavior vector mean of the target user, n represents the total number of superimposed vectors, $w_i$ represents the weight value of the vector $v_i$ (which may be the browsed weight value or a weight value in the evaluation operation weight array), and $v_i$ represents the i-th superimposed vector, which may be the object attribute vector corresponding to a browsed business object or the object attribute vector corresponding to an evaluation business object.
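
The weighted averaging of formula (3) can be sketched in a few lines of Python. The function name `behavior_vector_mean`, the two-dimensional vectors and the weight values are assumptions chosen for illustration; each list entry stands for one click-and-browse or one executed evaluation operation behavior, so an object acted on several times appears several times.

```python
def behavior_vector_mean(weighted_vectors):
    """Average the weighted object attribute vectors of one target user.

    weighted_vectors is a list of (weight, vector) pairs, one pair per
    click-and-browse or per executed evaluation operation behavior.
    """
    if not weighted_vectors:
        raise ValueError("at least one superimposed vector is required")
    n = len(weighted_vectors)              # total number of superimposed vectors
    dim = len(weighted_vectors[0][1])
    target = [0.0] * dim                   # the target vector (sum of weighted vectors)
    for weight, vector in weighted_vectors:
        for d in range(dim):
            target[d] += weight * vector[d]
    return [component / n for component in target]

# Hypothetical example: two browsed objects (weight 1.0), one forwarded
# object (positive weight) and one object with negative feedback (negative weight).
mean = behavior_vector_mean([
    (1.0, [1.0, 0.0]),
    (1.0, [0.0, 1.0]),
    (0.5, [2.0, 2.0]),
    (-0.5, [0.0, 4.0]),
])
```

Note that the divisor is the number of superimposed vectors, not the sum of the weights, which follows the description of the averaging given above.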

Please refer to fig. 7, which is a scene diagram for obtaining a behavior vector mean provided in the present application. As shown in fig. 7, the evaluation operation behaviors include praise and forwarding. Since the target user has clicked the business object 100b and the business object 101b, the business object 100b and the business object 101b are browsed business objects; since the target user has executed evaluation operation behaviors on the business object 102b and the business object 103b, the business object 102b and the business object 103b are evaluation business objects. The browsed weight value corresponding to the service object 100b is weight value 1, the browsed weight value corresponding to the service object 101b is also weight value 1, and the weight value 2 corresponding to the service object 102b and the weight value 3 corresponding to the service object 103b constitute the evaluation operation weight array. The product of the object attribute vector of the service object 100b and weight value 1 is 1 first vector, the product of the object attribute vector of the service object 101b and weight value 1 is another first vector, the product of the object attribute vector of the service object 102b and weight value 2 is 1 second vector, and the product of the object attribute vector of the service object 103b and weight value 3 is another second vector. 
The first vector corresponding to the service object 100b, the first vector corresponding to the service object 101b, the second vector corresponding to the service object 102b and the second vector corresponding to the service object 103b may be summed to obtain a target vector, and the target vector is divided by 4 (because 4 vectors are superimposed, namely the vectors corresponding to the service object 100b, the service object 101b, the service object 102b and the service object 103b respectively) to obtain the behavior vector mean 104b corresponding to the target user.

The server can calculate the vector distance (namely the cosine distance) between the object attribute vector corresponding to each service object in the service object set and the behavior vector mean corresponding to the target user. The smaller the vector distance between the object attribute vector of a service object and the behavior vector mean corresponding to the target user, the more the target user is interested in that service object. Therefore, the service object whose object attribute vector has the smallest vector distance to the behavior vector mean corresponding to the target user can be used as the target service object corresponding to the target user, and the target service object can be recommended to the target user. The server can send the recommendation column of the target service object to the terminal corresponding to the target user, and the terminal can display the recommendation column in the recommendation page, thereby recommending the target service object to the target user.
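
The selection of the target service object can be sketched as follows. The sketch assumes that the cosine distance is taken as 1 minus the cosine similarity, which is a common convention but is not stated in the original; the names `cosine_distance` and `nearest_object` and the example vectors are likewise hypothetical.

```python
import math

def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def nearest_object(object_vectors, user_mean):
    """Return the identifier of the object whose attribute vector is closest to user_mean."""
    return min(object_vectors, key=lambda oid: cosine_distance(object_vectors[oid], user_mean))

# Hypothetical object attribute vectors and behavior vector mean.
object_vectors = {"obj1": [1.0, 0.0], "obj2": [0.0, 1.0], "obj3": [1.0, 1.0]}
target_object = nearest_object(object_vectors, [0.9, 0.1])
```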

Furthermore, the server may use the calculated vector distance between the object attribute vector corresponding to each service object in the service object set and the behavior vector mean corresponding to the target user as a cross feature, and use the cross feature as a feature for training a recommendation model. The training features of the recommendation model may include other features in addition to the cross feature, and the recommendation model is used for recommending service objects to the target user. With the cross feature participating in the training of the recommendation model, the trained recommendation model can recommend suitable service objects to the target user more accurately. Optionally, only the several smallest vector distances may be selected as the cross features. Please refer to fig. 8, which is a scene diagram for acquiring the cross features provided in the present application. As shown in fig. 8, the object attribute vectors corresponding to the business objects in the business object set include a vector 100d, a vector 101d, a vector 102d, a vector 103d and a vector 104d, and the vector 105d is the behavior vector mean corresponding to the target user. By calculating the vector distance between each of the vector 100d, the vector 101d, the vector 102d, the vector 103d and the vector 104d and the vector 105d, it is found that the vector 101d, the vector 102d and the vector 103d are closest to the vector 105d. The vector distance between the vector 101d and the vector 105d is distance 1, the vector distance between the vector 102d and the vector 105d is distance 2, and the vector distance between the vector 103d and the vector 105d is distance 3, where distance 1 is greater than distance 2 and distance 2 is greater than distance 3. Distance 1, distance 2 and distance 3 may then be used as the cross features. 
Optionally, when there are multiple target users, a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean corresponding to each target user may be used as a cross feature to participate in training of a recommendation model, where the recommendation model is used to recommend a service object to all target users.
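
A minimal sketch of building the cross features for one target user is given below, assuming again that the cosine distance is 1 minus the cosine similarity. The function name `cross_features`, the parameter `k` (how many of the smallest distances to keep) and the example vectors are assumptions for illustration; how the features are fed to the recommendation model is not shown.

```python
import heapq
import math

def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def cross_features(object_vectors, user_mean, k=None):
    """Map object identifier -> distance to user_mean; keep only the k smallest if k is given."""
    distances = {oid: cosine_distance(vec, user_mean) for oid, vec in object_vectors.items()}
    if k is None:
        return distances
    smallest = heapq.nsmallest(k, distances.items(), key=lambda item: item[1])
    return dict(smallest)

# Hypothetical vectors; keep the two smallest distances as cross features.
object_vectors = {"obj1": [1.0, 0.0], "obj2": [0.0, 1.0], "obj3": [1.0, 1.0]}
features = cross_features(object_vectors, [0.9, 0.1], k=2)
```

For multiple target users, the same function could be called once per behavior vector mean and the results concatenated, which is one possible reading of the optional variant described above.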

The server may obtain the behavior vector mean corresponding to a user to be matched in the same manner as the behavior vector mean of the target user is obtained; the user to be matched may be understood as another target user. The server may calculate the vector distance between the behavior vector mean corresponding to the target user and the behavior vector mean corresponding to the user to be matched, and if the vector distance is smaller than a first vector distance threshold (which may be set as required), it is determined that user similarity exists between the target user and the user to be matched. If user similarity exists between the target user and the user to be matched, then when a service object is subsequently recommended to the target user, a service object liked by the user to be matched may be recommended to the target user (the collection or approval of a service object by the user to be matched can indicate that the user to be matched likes that service object), and when a service object is recommended to the user to be matched, a service object liked by the target user may be recommended to the user to be matched. Alternatively, a service object may be recommended to the target user according to the historical browsing record of the user to be matched, in other words, a service object browsed by the user to be matched may be recommended to the target user. Please refer to fig. 9, which is a scene diagram of recommending a service object provided in the present application. As shown in fig. 9, the terminal 200a is the terminal corresponding to the target user 105k, and there is a "recommend more" button 100k in the recommendation page 112k of the terminal 200a. The terminal 200a may generate a recommendation instruction in response to the click operation of the target user 105k on the button 100k, and may transmit the recommendation instruction to the server 100. 
After receiving the recommendation instruction, the server 100 may obtain the behavior vector mean corresponding to the target user 105k. Here, the user 106k to be matched, the user 107k to be matched and the user 108k to be matched are taken as examples. The server 100 may obtain the behavior vector mean corresponding to the user 106k to be matched, and calculate the vector distance 1 between the behavior vector mean corresponding to the target user 105k and the behavior vector mean corresponding to the user 106k to be matched; the server 100 may obtain the behavior vector mean corresponding to the user 107k to be matched, and calculate the vector distance 2 between the behavior vector mean corresponding to the target user 105k and the behavior vector mean corresponding to the user 107k to be matched; the server 100 may also obtain the behavior vector mean corresponding to the user 108k to be matched, and calculate the vector distance 3 between the behavior vector mean corresponding to the target user 105k and the behavior vector mean corresponding to the user 108k to be matched. The server 100 may compare the calculated vector distance 1, vector distance 2 and vector distance 3 with the first vector distance threshold (which may be preset as required); when only the vector distance 1 is smaller than the first vector distance threshold, only the user 106k to be matched has user similarity with the target user 105k. The server 100 may retrieve the business object 5 marked as liked by the user 106k to be matched. The server 100 may send the cover and title of the business object 5 to the terminal 200a, and the terminal 200a may display the received cover 109k and title 110k of the business object 5 in the recommendation page 111k, so as to recommend the business object 5 to the target user 105k. 
Subsequently, the terminal 200a may jump to the detail page 118k of the business object 5 in response to the click operation of the target user on the cover 109k or the title 110k. Here, taking the business object 5 being a basketball as an example, the name of the business object 5 (i.e., the commodity name: basketball), the brand name of the business object 5 (i.e., the brand name: invincibility) and the price of the business object 5 (i.e., the commodity price: 1000) are displayed on the detail page 118k.
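
The threshold comparison of fig. 9 can be sketched as below. The function name `similar_users`, the example behavior vector means and the threshold value are assumptions for illustration, and the cosine distance is again assumed to be 1 minus the cosine similarity.

```python
import math

def cosine_distance(u, v):
    """Cosine distance, assumed here to be 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_u * norm_v)

def similar_users(target_mean, candidate_means, threshold):
    """Return the users to be matched whose distance to the target user is below the threshold."""
    return [
        user_id
        for user_id, mean in candidate_means.items()
        if cosine_distance(target_mean, mean) < threshold
    ]

# Hypothetical behavior vector means for the users to be matched in fig. 9.
candidates = {"106k": [0.9, 0.1], "107k": [0.0, 1.0], "108k": [-1.0, 0.0]}
matched = similar_users([1.0, 0.0], candidates, threshold=0.1)
```

The business objects liked or browsed by the users in `matched` would then be the candidates for recommendation to the target user.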

The service object set comprises a first service object and a second service object; each of the first service object and the second service object can be any service object in the service object set, and the first service object and the second service object are not the same service object. The server may obtain the vector distance between the object attribute vector corresponding to the first service object and the object attribute vector corresponding to the second service object, and when the vector distance is smaller than a second vector distance threshold (which may be set as required), it is determined that object similarity exists between the first service object and the second service object. If object similarity exists between the first service object and the second service object, the second service object can be recommended to the target user when the target user likes the first service object (behaviors of the target user such as collection or approval of a service object can indicate that the target user likes that service object). Alternatively, when the historical browsing record of the target user includes the first business object (i.e., the target user has browsed the first business object), the second business object may be recommended to the target user. Please refer to fig. 10, which is a schematic view of another scenario for recommending a business object provided in the present application. As shown in fig. 10, the terminal 200a is the terminal corresponding to the target user, and there is a "recommend more" button 116k in the recommendation page 115k of the terminal 200a. The terminal 200a may generate a recommendation instruction in response to a click operation of the target user on the button 116k, and may transmit the recommendation instruction to the server 100. After receiving the recommendation instruction, the server 100 may obtain the service object 1 marked as liked by the target user. 
In addition to the business object 1, the business object set includes the business object 2, the business object 3 and the business object 4. The server may obtain the object attribute vectors respectively corresponding to the service object 1, the service object 2, the service object 3 and the service object 4. The server 100 may calculate the vector distance 4 between the object attribute vector corresponding to the service object 1 and the object attribute vector corresponding to the service object 2, the vector distance 5 between the object attribute vector corresponding to the service object 1 and the object attribute vector corresponding to the service object 3, and the vector distance 6 between the object attribute vector corresponding to the service object 1 and the object attribute vector corresponding to the service object 4. The server 100 may compare the vector distance 4, the vector distance 5 and the vector distance 6 with the second vector distance threshold (which may be preset); when only the vector distance 4 is smaller than the second vector distance threshold, the server 100 may take the service object 2 as the service object 114k to be recommended, and may send the cover and title of the service object 114k to be recommended to the terminal 200a. As shown in fig. 10, the terminal 200a may display the cover 101k and the title 102k of the acquired service object 114k to be recommended (i.e., the service object 2) in the recommendation page 113k, thereby recommending the service object 114k to be recommended to the target user. Subsequently, the terminal 200a may jump to the detail page 117k of the service object 114k to be recommended in response to the click operation of the target user on the cover 101k or the title 102k. 
Here, taking the service object 114k to be recommended as a football as an example, the name of the service object 114k to be recommended (i.e., the name of the product: football), the price of the service object 114k to be recommended before being discounted (i.e., the price before being discounted: 999), and the price of the service object 114k to be recommended after being discounted (i.e., the price after being discounted: 666) are displayed on the detail page 117 k.
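The threshold comparison in the fig. 10 scenario can be illustrated with a minimal sketch. The application does not fix a particular distance metric, so the Euclidean distance used here is an assumption, and the function names, the two-dimensional vectors, and the threshold value are hypothetical and chosen only for illustration.

```python
import math


def euclidean_distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def similar_objects(liked_id, object_vectors, distance_threshold):
    """Return ids of objects whose vector distance to the liked object is below the threshold."""
    liked_vector = object_vectors[liked_id]
    return [
        object_id
        for object_id, vector in object_vectors.items()
        if object_id != liked_id
        and euclidean_distance(liked_vector, vector) < distance_threshold
    ]


# Hypothetical object attribute vectors for service objects 1 to 4.
object_vectors = {
    1: [0.0, 0.0],
    2: [0.3, 0.4],  # close to service object 1
    3: [3.0, 4.0],  # far from service object 1
    4: [6.0, 8.0],  # far from service object 1
}

# Only service object 2 falls below the (hypothetical) second vector distance threshold.
recommended = similar_objects(1, object_vectors, distance_threshold=1.0)
```

With these values only service object 2 is selected, which mirrors the case in which only vector distance 4 is smaller than the second vector distance threshold.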

Please refer to fig. 11, which is a schematic structural diagram of a data processing apparatus provided in the present application. As shown in fig. 11, the data processing apparatus 1 may include: a first obtaining module 101, a second obtaining module 102, a third obtaining module 103, a fourth obtaining module 104 and a generating module 105;

a first obtaining module 101, configured to obtain a service object set, where the service object set includes multiple service objects;

a second obtaining module 102, configured to obtain browsing statuses of a target user for the multiple service objects, and determine a target positive sample set and a target negative sample set corresponding to the target user according to the browsing statuses and the service object sets;

a third obtaining module 103, configured to obtain a user behavior set corresponding to the target user, where the user behavior set includes evaluation operation behaviors of the target user for the multiple service objects;

a fourth obtaining module 104, configured to obtain, according to the evaluation type of the evaluation operation behavior, an auxiliary positive sample set and an auxiliary negative sample set corresponding to the target user in the user behavior set;

a generating module 105, configured to generate an object attribute vector corresponding to each business object in the business object set based on the target positive sample set, the target negative sample set, the auxiliary positive sample set, the auxiliary negative sample set, and a word vector model.

For specific implementation of functions of the first obtaining module 101, the second obtaining module 102, the third obtaining module 103, the fourth obtaining module 104, and the generating module 105, please refer to steps S101 to S105 in the embodiment corresponding to fig. 2, which is not described herein again.

The browsing status includes a browsed status and a non-browsing status; the second obtaining module 102 includes a first generating unit 1021 and a second generating unit 1022;

a first generating unit 1021, configured to generate the target positive sample set according to an object identifier corresponding to the business object whose browsing status is the browsed status;

a second generating unit 1022, configured to generate the target negative sample set according to the object identifier corresponding to the service object whose browsing status is the non-browsing status.

For a specific implementation manner of the functions of the first generating unit 1021 and the second generating unit 1022, please refer to step S202 in the corresponding embodiment of fig. 3, which is not described herein again.

The first generating unit 1021 includes a time obtaining subunit 10211 and a first adding subunit 10212;

a time obtaining subunit 10211, configured to obtain a browsing timestamp corresponding to each service object whose browsing status is the browsed status, and determine each service object whose browsing timestamp falls within a target time period as a positive sample service object, where one positive sample service object corresponds to at least one browsing timestamp;

a first adding subunit 10212, configured to generate a positive sample sequence according to at least one browsing timestamp and an object identifier respectively corresponding to each positive sample service object, and add the positive sample sequence to the target positive sample set, where the positive sample sequence includes an object identifier respectively corresponding to each positive sample service object.

For a specific implementation manner of the functions of the time obtaining subunit 10211 and the first adding subunit 10212, please refer to step S202 in the corresponding embodiment of fig. 3, which is not described herein again.
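As a rough illustration of how the two subunits above could cooperate, the following sketch filters browsing events by a target time period and orders the surviving object identifiers by browsing timestamp. The event layout `(object_id, timestamp)`, the half-open period, and the choice to keep one sequence entry per browsing timestamp are assumptions made for this example, not details stated in the application.

```python
def build_positive_sequence(browse_events, period_start, period_end):
    """Order the object ids browsed within [period_start, period_end) by browsing timestamp.

    browse_events: iterable of (object_id, timestamp) pairs for browsed objects.
    An object browsed several times in the period appears once per timestamp.
    """
    in_period = [
        (timestamp, object_id)
        for object_id, timestamp in browse_events
        if period_start <= timestamp < period_end
    ]
    # Sort by timestamp only, so ties keep their original relative order.
    in_period.sort(key=lambda pair: pair[0])
    return [object_id for _, object_id in in_period]


# Hypothetical browsing events; object "c" falls outside the target time period.
events = [("b", 30), ("a", 10), ("c", 200), ("a", 50)]
positive_sequence = build_positive_sequence(events, period_start=0, period_end=100)
```

The resulting sequence would then be added to the target positive sample set.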

The second generating unit 1022 includes a multiple obtaining subunit 10221, an extracting subunit 10222, and a second adding subunit 10223;

a multiple obtaining subunit 10221, configured to determine the number of business objects in the positive sample sequence as a target number, and obtain a negative sample extraction multiple for the target number;

an extracting subunit 10222, configured to extract, according to the target number and the negative sample extraction multiple, service objects from the service objects whose browsing status is the non-browsing status as negative sample service objects, where the number of the negative sample service objects is equal to the product of the target number and the negative sample extraction multiple;

a second adding subunit 10223, configured to add the object identifiers corresponding to the negative sample service objects to the target negative sample set.

For a specific implementation manner of the functions of the multiple obtaining subunit 10221, the extracting subunit 10222 and the second adding subunit 10223, please refer to step S202 in the corresponding embodiment of fig. 3, which is not described herein again.
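The relationship between the target number, the negative sample extraction multiple, and the number of extracted negative samples can be sketched as follows. Uniform random sampling without replacement is an assumption here (the application only fixes the number of negative samples), and the function and parameter names are hypothetical.

```python
import random


def sample_negatives(positive_sequence, unbrowsed_ids, extraction_multiple, seed=None):
    """Draw target_number * extraction_multiple object ids from the unbrowsed objects."""
    target_number = len(positive_sequence)
    wanted = target_number * extraction_multiple
    candidates = list(unbrowsed_ids)
    if wanted > len(candidates):
        raise ValueError("not enough unbrowsed objects to sample from")
    # Assumed strategy: uniform sampling without replacement.
    rng = random.Random(seed)
    return rng.sample(candidates, wanted)


# Two positive samples and a multiple of 3 give six negative samples.
negatives = sample_negatives(
    ["a", "b"],
    ["n1", "n2", "n3", "n4", "n5", "n6", "n7"],
    extraction_multiple=3,
    seed=1,
)
```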

The fourth obtaining module 104 includes a first adding unit 1041 and a second adding unit 1042;

a first adding unit 1041, configured to determine, as a first object operation sample, an object operation sample that includes an evaluation operation behavior with the positive evaluation type in the user behavior set, and add the first object operation sample to the auxiliary positive sample set;

a second adding unit 1042, configured to determine, as a second object operation sample, an object operation sample that includes the evaluation operation behavior with the negative evaluation type in the user behavior set, and add the second object operation sample to the auxiliary negative sample set.

For a specific implementation manner of functions of the first adding unit 1041 and the second adding unit 1042, please refer to step S204 in the corresponding embodiment of fig. 3, which is not described herein again.
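A minimal sketch of the split performed by the two adding units is given below. It assumes that each object operation sample is an `(object_id, behavior_id)` pair and that behavior identifiers map to evaluation types through two lookup sets; the behavior names `favorite`, `like`, and `dislike` are hypothetical.

```python
# Hypothetical mapping from behavior identifiers to evaluation types.
POSITIVE_BEHAVIORS = {"favorite", "like"}
NEGATIVE_BEHAVIORS = {"dislike"}


def split_by_evaluation_type(user_behavior_set):
    """Split object operation samples into auxiliary positive and negative sample sets."""
    auxiliary_positive, auxiliary_negative = [], []
    for sample in user_behavior_set:
        _object_id, behavior_id = sample
        if behavior_id in POSITIVE_BEHAVIORS:
            auxiliary_positive.append(sample)
        elif behavior_id in NEGATIVE_BEHAVIORS:
            auxiliary_negative.append(sample)
        # Samples with an unknown behavior identifier are ignored in this sketch.
    return auxiliary_positive, auxiliary_negative


behaviors = [("a", "like"), ("b", "dislike"), ("c", "favorite"), ("d", "share")]
aux_pos, aux_neg = split_by_evaluation_type(behaviors)
```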

The generating module 105 includes a first identifier obtaining unit 1051, a second identifier obtaining unit 1052, an updating unit 1053 and a vector determining unit 1054;

a first identifier obtaining unit 1051, configured to obtain an object identifier s_j in the target positive sample set, where j is a positive integer less than or equal to N, and N is the number of object identifiers in the target positive sample set;

a second identifier obtaining unit 1052, configured to obtain, based on a traversal window with a target step size, the neighbor object identifier corresponding to the object identifier s_j in the target positive sample set;

an updating unit 1053, configured to update, in the word vector model, an initial vector corresponding to each business object in the business object set based on the object identifier s_j, the neighbor object identifier, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set;

a vector determining unit 1054, configured to determine the initial vector updated in the word vector model as an object attribute vector corresponding to each service object in the service object set.

For specific functional implementation manners of the first identifier obtaining unit 1051, the second identifier obtaining unit 1052, the updating unit 1053 and the vector determining unit 1054, please refer to step S205-step S208 in the embodiment corresponding to fig. 3, which is not described herein again.
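The neighbor lookup performed by the second identifier obtaining unit can be pictured with the following sketch. It interprets the target step size as the number of positions taken on each side of s_j, which is an assumption, and it uses a 0-based index where the text uses a 1-based j.

```python
def neighbor_identifiers(positive_sequence, index, window):
    """Return the identifiers within `window` positions of positive_sequence[index].

    The identifier at `index` itself (s_j) is excluded; the window is clipped
    at both ends of the sequence.
    """
    start = max(0, index - window)
    end = min(len(positive_sequence), index + window + 1)
    return positive_sequence[start:index] + positive_sequence[index + 1:end]


sequence = ["a", "b", "c", "d", "e"]
middle_neighbors = neighbor_identifiers(sequence, index=2, window=1)
edge_neighbors = neighbor_identifiers(sequence, index=0, window=2)
```

Sliding the index over the whole positive sample sequence yields one (s_j, neighbor) group per position, in the manner of context windows in word vector training.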

The updating unit 1053 includes a vector generation subunit 10531, a sample acquisition subunit 10532, a weight acquisition subunit 10533, a vector determination subunit 10534, and an update subunit 10535;

a vector generation subunit 10531, configured to generate, based on a Gaussian distribution, an initial vector corresponding to each service object in the service object set, and associate each initial vector with the object identifier of the corresponding service object;

a sample acquisition subunit 10532, configured to obtain a first object identifier to be trained from the target negative sample set, obtain a first object operation sample to be trained from the auxiliary positive sample set, and obtain a second object operation sample to be trained from the auxiliary negative sample set;

a weight acquisition subunit 10533, configured to obtain a first behavior weight value corresponding to the behavior identifier in the first object operation sample to be trained, and obtain a second behavior weight value corresponding to the behavior identifier in the second object operation sample to be trained;

a vector determination subunit 10534, configured to determine, as initial vectors to be trained, the initial vectors respectively associated with the object identifier s_j, the neighbor object identifier, the first object identifier to be trained, the object identifier in the first object operation sample to be trained, and the object identifier in the second object operation sample to be trained;

an update subunit 10535, configured to update, in the word vector model, the initial vector corresponding to each business object in the business object set based on the initial vectors to be trained, the first behavior weight value, and the second behavior weight value.

For specific implementation of functions of the vector generation subunit 10531, the sample acquisition subunit 10532, the weight acquisition subunit 10533, the vector determination subunit 10534, and the update subunit 10535, please refer to step S207 in the embodiment corresponding to fig. 3, which is not described herein again.
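The following sketch shows one way the subunits above could fit together. It is not the update rule of the application, which is described in step S207: it assumes a single embedding table, a logistic loss of the kind used in skip-gram training with negative sampling, and behavior weight values that simply scale the gradient step. All function names, the standard deviation, and the learning rate are hypothetical.

```python
import math
import random


def init_vectors(object_ids, dim, seed=0):
    """Draw one initial vector per object identifier from a Gaussian distribution."""
    rng = random.Random(seed)
    return {oid: [rng.gauss(0.0, 0.1) for _ in range(dim)] for oid in object_ids}


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def update_pair(vectors, center_id, other_id, label, weight=1.0, learning_rate=0.05):
    """One gradient step on a pair of distinct objects.

    label=1 pulls the two vectors together (neighbor or auxiliary positive sample);
    label=0 pushes them apart (target negative or auxiliary negative sample).
    `weight` stands in for the behavior weight value (assumed to scale the step).
    """
    center, other = vectors[center_id], vectors[other_id]
    step = learning_rate * weight * (label - sigmoid(dot(center, other)))
    vectors[center_id] = [c + step * o for c, o in zip(center, other)]
    vectors[other_id] = [o + step * c for c, o in zip(center, other)]


vectors = init_vectors(["s_j", "neighbor", "negative", "liked", "disliked"], dim=4)
update_pair(vectors, "s_j", "neighbor", label=1)              # neighbor object identifier
update_pair(vectors, "s_j", "negative", label=0)              # target negative sample
update_pair(vectors, "s_j", "liked", label=1, weight=2.0)     # first behavior weight value
update_pair(vectors, "s_j", "disliked", label=0, weight=2.0)  # second behavior weight value
```

After training over all positions of the positive sample sequence, the updated vectors would play the role of the object attribute vectors.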

The data processing apparatus 1 further includes an object obtaining module 106, a first determining module 107, and a second determining module 108;

an object obtaining module 106, configured to obtain a browsed service object and an evaluation service object corresponding to a target user in the service object set, where a browsed user associated with the browsed service object includes the target user, an evaluation user associated with the evaluation service object includes the target user, and the evaluation user is a user performing an evaluation operation on the service object;

a first determining module 107, configured to determine a behavior vector mean value corresponding to the target user according to the browsed service object and the evaluation service object;

a second determining module 108, configured to determine a target service object for the target user according to the behavior vector mean corresponding to the target user and an object attribute vector corresponding to each service object in the service object set, and recommend the target service object to the target user.

For a specific implementation manner of functions of the object obtaining module 106, the first determining module 107, and the second determining module 108, please refer to step S208 in the corresponding embodiment of fig. 3, which is not described herein again.

Wherein the first determining module 107 includes an obtaining unit 1071, a first determining unit 1072, a first product unit 1073, a second product unit 1074, a summing unit 1075, and a second determining unit 1076;

an obtaining unit 1071, configured to obtain an object attribute vector and a browsed weight value corresponding to the browsed service object, and obtain an object attribute vector and an evaluation operation weight array corresponding to the evaluation service object;

a first determining unit 1072, configured to determine an object attribute vector corresponding to the browsed service object as a first object attribute vector, and determine an object attribute vector corresponding to the evaluated service object as a second object attribute vector;

a first product unit 1073, configured to multiply each first object attribute vector by the browsed weight value to obtain a first vector corresponding to each first object attribute vector;

a second product unit 1074, configured to multiply each second object attribute vector by the corresponding weight value in the evaluation operation weight array to obtain a second vector corresponding to each second object attribute vector;

a summing unit 1075, configured to sum the first vectors and the second vectors to obtain a target vector, and add the number of first vectors to the number of second vectors to obtain a target vector number;

a second determining unit 1076, configured to determine the ratio of the target vector to the target vector number as the behavior vector mean corresponding to the target user.

For a specific implementation manner of functions of the obtaining unit 1071, the first determining unit 1072, the first product unit 1073, the second product unit 1074, the summing unit 1075, and the second determining unit 1076, please refer to step S208 in the corresponding embodiment of fig. 3, which is not described herein again.
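The computation carried out by units 1071 to 1076 can be sketched directly. The sketch assumes that the evaluation operation weight array holds one weight per evaluated service object, aligned by position; the function name and the sample values are hypothetical.

```python
def behavior_vector_mean(browsed_vectors, browsed_weight, evaluated_vectors, evaluation_weights):
    """Weighted sum of browsed and evaluated object vectors divided by the vector count."""
    # First vectors: every browsed object vector scaled by the single browsed weight.
    first_vectors = [[browsed_weight * x for x in vec] for vec in browsed_vectors]
    # Second vectors: every evaluated object vector scaled by its own weight.
    second_vectors = [
        [weight * x for x in vec]
        for vec, weight in zip(evaluated_vectors, evaluation_weights)
    ]
    weighted = first_vectors + second_vectors
    if not weighted:
        raise ValueError("at least one browsed or evaluated object is required")
    target_vector_number = len(weighted)
    dim = len(weighted[0])
    target_vector = [sum(vec[i] for vec in weighted) for i in range(dim)]
    return [x / target_vector_number for x in target_vector]


mean = behavior_vector_mean(
    browsed_vectors=[[1.0, 2.0], [3.0, 4.0]],
    browsed_weight=1.0,
    evaluated_vectors=[[1.0, 1.5]],
    evaluation_weights=[2.0],
)
```

Note that the divisor is the number of vectors rather than the sum of the weights, following the description of the second determining unit 1076.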

The second determining module 108 includes a distance obtaining unit 1081 and a third determining unit 1082;

a distance obtaining unit 1081, configured to obtain a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean corresponding to the target user, respectively;

a third determining unit 1082, configured to determine, as the target service object corresponding to the target user, the service object whose object attribute vector has the minimum vector distance from the behavior vector mean corresponding to the target user, and recommend the target service object to the target user.

For a specific implementation manner of the functions of the distance obtaining unit 1081 and the third determining unit 1082, please refer to step S208 in the corresponding embodiment of fig. 3, which is not described herein again.
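Selecting the target service object then reduces to a nearest-neighbor lookup, sketched below. The Euclidean distance is again an assumed metric, and the identifiers and vectors are hypothetical.

```python
import math


def vector_distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def nearest_object(behavior_mean, object_vectors):
    """Return the id of the object whose attribute vector is closest to the behavior mean."""
    return min(
        object_vectors,
        key=lambda object_id: vector_distance(object_vectors[object_id], behavior_mean),
    )


object_vectors = {"shoes": [2.0, 3.5], "ball": [2.0, 3.1], "book": [9.0, 9.0]}
target_object = nearest_object([2.0, 3.0], object_vectors)
```

The same per-object distances could also serve as the cross features mentioned for the recommendation model.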

The data processing apparatus 1 further includes a third determining module 109 and a training module 110;

a third determining module 109, configured to determine a vector distance between an object attribute vector corresponding to each service object in the service object set and a behavior vector mean corresponding to the target user as a cross feature;

a training module 110, configured to train a recommendation model based on the cross features, where the recommendation model is used to recommend a business object for the target user.

Please refer to step S208 in the embodiment corresponding to fig. 3 for a specific implementation manner of functions of the third determining module 109 and the training module 110, which is not described herein again.

The data processing apparatus 1 further includes a mean value obtaining module 111, a first similarity module 112, and a first recommending module 113;

the mean value obtaining module 111 is configured to obtain a mean value of the behavior vectors corresponding to the target user, and obtain a mean value of the behavior vectors corresponding to the user to be matched;

a first similarity module 112, configured to determine that user similarity exists between the target user and the user to be matched if a vector distance between the behavior vector mean value corresponding to the target user and the behavior vector mean value corresponding to the user to be matched is smaller than a first vector distance threshold;

a first recommending module 113, configured to recommend a service object to the target user according to a historical browsing record of the user to be matched for the service object if the target user and the user to be matched have the user similarity.

For a specific implementation manner of functions of the mean value obtaining module 111, the first similarity module 112, and the first recommending module 113, please refer to step S208 in the corresponding embodiment of fig. 3, which is not described herein again.
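A minimal sketch of the user-similarity path handled by modules 111 to 113 follows. The Euclidean distance and the choice to skip objects the target user has already browsed are assumptions made for the example; the application only states that recommendations are based on the historical browsing record of the user to be matched.

```python
import math


def vector_distance(u, v):
    """Euclidean distance between two equal-length vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def recommend_from_similar_user(target_mean, matched_mean, matched_history,
                                target_history, distance_threshold):
    """Recommend from the matched user's browsing history if the two users are similar."""
    if vector_distance(target_mean, matched_mean) >= distance_threshold:
        return []  # no user similarity, so nothing is recommended on this path
    already_seen = set(target_history)
    # Assumed refinement: do not recommend what the target user has already browsed.
    return [object_id for object_id in matched_history if object_id not in already_seen]


recommendations = recommend_from_similar_user(
    target_mean=[1.0, 1.0],
    matched_mean=[1.0, 1.2],
    matched_history=["a", "b", "c"],
    target_history=["b"],
    distance_threshold=0.5,
)
```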

The business objects in the business object set comprise a third business object and a fourth business object; the data processing apparatus 1 further comprises a vector obtaining module 114, a second similarity module 115 and a second recommendation module 116;

a vector obtaining module 114, configured to obtain an object attribute vector corresponding to the third service object, and obtain an object attribute vector corresponding to the fourth service object;

a second similarity module 115, configured to determine that there is object similarity between the third service object and the fourth service object if a vector distance between an object attribute vector corresponding to the third service object and an object attribute vector corresponding to the fourth service object is smaller than a second vector distance threshold;

a second recommending module 116, configured to recommend the fourth service object to the target user if the third service object and the fourth service object have the object similarity and the historical browsing record of the target user for the service object includes the third service object.

For a specific implementation manner of the functions of the vector obtaining module 114, the second similarity module 115, and the second recommendation module 116, please refer to step S208 in the corresponding embodiment of fig. 3, which is not described herein again.

Please refer to fig. 12, which is a schematic structural diagram of a computer device provided in the present application. As shown in fig. 12, the computer device 1000 may include a processor 1001, a network interface 1004, and a memory 1005; in addition, the computer device 1000 may further include a user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and a keyboard, and optionally may also include a standard wired interface and a standard wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM or a non-volatile memory, such as at least one disk memory. Optionally, the memory 1005 may also be at least one storage device located remotely from the processor 1001. As shown in fig. 12, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.

In the computer device 1000 shown in fig. 12, the network interface 1004 may provide a network communication function; the user interface 1003 is mainly used to provide an input interface for a user; and the processor 1001 may be configured to call the device control application program stored in the memory 1005 to implement the data processing method described in the embodiment corresponding to either fig. 2 or fig. 3. It should be understood that the computer device 1000 described in this application can also implement the functions of the data processing apparatus 1 described in the embodiment corresponding to fig. 11, which are not described herein again. In addition, the beneficial effects of the same method are not described again.

It should further be noted that the present application also provides a computer-readable storage medium, which stores the computer program executed by the aforementioned data processing apparatus 1. The computer program includes program instructions; when a processor executes the program instructions, the data processing method described in the embodiment corresponding to either fig. 2 or fig. 3 can be performed, so details are not repeated here. In addition, the beneficial effects of the same method are not described again. For technical details not disclosed in the embodiment of the computer storage medium of the present application, refer to the description of the method embodiments of the present application.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above disclosure is only for the purpose of illustrating the preferred embodiments of the present application and is not to be construed as limiting the scope of the present application, so that the present application is not limited thereto but rather by the claims appended hereto.

Claims

1. A data processing method, comprising:

2. The method of claim 1, wherein the browsing state comprises a browsed state and an unviewed state; determining a target positive sample set and a target negative sample set corresponding to the target user according to the browsing state and the service object set, including:

3. The method according to claim 2, wherein the generating the target positive sample set according to the object identifier corresponding to the business object whose browsing status is the browsed status comprises:

4. The method according to claim 2, wherein the generating the target negative sample set according to the object identifier corresponding to the business object whose browsing status is the non-browsing status comprises:

5. The method of claim 1, wherein the evaluation types include a positive evaluation type and a negative evaluation type; the user behavior set comprises a plurality of object operation samples, wherein one object operation sample comprises an object identifier of a business object and a behavior identifier of an evaluation operation behavior of the target user aiming at the business object;

6. The method of claim 5, wherein each business object in the set of target positive samples has an object identification;

7. The method of claim 6, wherein each business object in the target negative sample set has an object identity;

the updating, in the word vector model, an initial vector corresponding to each business object in the business object set based on the object identifier s_j, the neighbor object identifier, the target negative sample set, the auxiliary positive sample set and the auxiliary negative sample set comprises:

8. The method of claim 1, further comprising:

determining a behavior vector mean value corresponding to the target user according to the browsed service object and the evaluation service object;

9. The method according to claim 8, wherein the determining the behavior vector mean value corresponding to the target user according to the browsed service object and the evaluation service object comprises:

10. The method according to claim 8, wherein the determining a target business object for the target user according to the behavior vector mean corresponding to the target user and the object attribute vector corresponding to each business object in the business object set, and recommending the target business object to the target user comprises:

11. The method of claim 10, further comprising:

12. The method of claim 8, further comprising:

13. The method of claim 1, wherein the business objects in the set of business objects comprise a first business object and a second business object; further comprising:

acquiring an object attribute vector corresponding to the first service object, and acquiring an object attribute vector corresponding to the second service object;

if the vector distance between the object attribute vector corresponding to the first business object and the object attribute vector corresponding to the second business object is smaller than a second vector distance threshold value, determining that the first business object and the second business object have object similarity;

14. A data processing apparatus, comprising:

15. A computer device comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of the method according to any one of claims 1-13.

16. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method according to any one of claims 1-13.