CN105608600A

CN105608600A - Method for evaluating and optimizing B2B seller performances

Info

Publication number: CN105608600A
Application number: CN201510955479.9A
Authority: CN
Inventors: 刘波; 李伟; 孟青; 卢晨旭; 黄建鹏; 房鹏展
Original assignee: Southeast University; Focus Technology Co Ltd
Current assignee: Southeast University; Focus Technology Co Ltd
Priority date: 2015-12-18
Filing date: 2015-12-18
Publication date: 2016-05-25

Abstract

The invention discloses a method for evaluating and optimizing B2B seller performance, comprising the following steps. Raw data is selected and preprocessed: seller data and seller product data are cleaned, integrated, reduced and transformed. The seller's sales indexes are clustered to build a model: the indexes that sellers care about most, namely exposure, click rate and inquiry amount, are selected as the sales indexes, clustering is performed on these three indexes, and the final clustering result is obtained for analysis. The evaluation model formed by clustering is then corrected interactively. A decision tree machine learning algorithm is used to find, level by level, the reasons why a seller's promotion effect is not ideal, and an optimization scheme is provided to the seller according to the evaluation result. The method can effectively increase the click rate, exposure and inquiry amount, improve the seller's transaction success rate and provide better e-commerce services for the seller. It can also monitor the effect before and after promotion optimization.

Description

Method for evaluating and optimizing B2B seller effect

Technical Field

The invention relates to a method for evaluating and optimizing the promotion effect of a B2B seller. The method evaluates the promotion effect using the indexes the seller cares about (exposure, click rate and inquiry amount), a clustering method and user feedback, and then recommends an optimized promotion strategy using a decision tree method combined with the seller's description indexes and service indexes. It relates to the fields of B2B, data mining and machine learning.

Background

Abbreviations used herein are defined first:

B2B (Business-to-Business): a business model in which enterprises market to one another. Enterprises are closely linked with their clients through the Internet or various business platforms, and the flexibility and convenience of the network allow them to provide better service to clients and to promote their own business development.

K-Means: a clustering algorithm that takes as input the number of clusters K and the data objects to be clustered, and outputs K clusters that satisfy a termination condition.

CART (Classification and Regression Tree): the classification and regression tree is a very effective nonparametric method for classification and regression that makes predictions by constructing a binary tree. The CART model is widely used in statistics and data mining. It builds its prediction criterion in a way entirely different from traditional statistics, presents it in the form of a binary tree, and is easy to understand, use and explain. In many cases the prediction tree built by the CART model is more accurate than the algebraic prediction criteria built by common statistical methods, and the more complex the data and the more variables there are, the more pronounced the advantage of the algorithm. By using this model with the evaluation indexes as decision nodes, the influence of each index on the evaluation result can be readily obtained.

At present, B2B e-commerce websites are developing rapidly: the number of users is multiplying and more and more product information needs to be displayed. To enhance its competitiveness, a website must analyze and process more and more user behavior data, and traditional processing methods can no longer meet the requirements of the website's development.

The information of a B2B e-commerce website comes from two sources. One is the website's database of user information, including basic information such as the user's registered name and password. The other is user information obtained from web logs. For analyzing and optimizing a B2B platform, the information held in a simple database is far from sufficient, so web log information must be relied on. For this reason, analysis of a B2B platform focuses mainly on web data: based on the mined information, seller websites are optimized and their performance indexes analyzed, so that user loyalty improves, website traffic increases, the website's external links expand, its link structure is optimized, and the overall competitiveness of the B2B website among its peers improves.

Early research mainly addressed the overall evaluation of B2B e-commerce websites. It applied holistic methods such as the analytic hierarchy process, or derived an evaluation system from interview surveys and expert scoring, but did not specifically address an effect evaluation system for sellers on an e-commerce platform.

In a method for evaluating and optimizing the promotion effect of a B2B seller, the foundations of the whole method are how to measure the seller's promotion effect accurately and comprehensively using indexes of different magnitudes, how to establish the importance of each index, and how to form a set of evaluation criteria.

Therefore, the main task of the project is to establish a promotion effect evaluation index set of the seller based on website data, solve the problem of seller promotion optimization in the B2B business platform from the aspects of the effect obtained by the seller, the value-added service purchased, the activity on the website, the completeness of seller product data and the like, and provide an effective optimization scheme and good economic benefit for the promotion of the B2B seller.

Disclosure of Invention

Aiming at a B2B off-line transaction website platform, the invention starts from the indexes generated by the seller's promotion and the indexes the seller cares about (exposure, click rate and inquiry amount), normalizes the data by the z-score method, considers both the geometric distance and the similarity between tuples to measure the distance relationship between them, and forms a seller promotion effect evaluation mechanism using the clustering idea of K-Means.

To solve the above technical problem, the invention provides an evaluation standard for the B2B seller effect, namely a seller effect evaluation and optimization mechanism, and a way of correcting that mechanism.

The technical scheme of the invention is that the B2B seller effect evaluation and optimization method comprises the following steps:

1) selecting and preprocessing raw data:

Incompleteness, inconsistency and noise are common characteristics of large real-world databases and data warehouses. The data here includes attributes of interest such as seller data and seller product data, but not all of it is useful, so it needs to be cleaned, integrated, reduced and transformed. Cleaning improves the completeness and consistency of the data by filling in default values, smoothing data, removing invalid data and so on, so that the input data is valid. The target industries and the useful feature information are then extracted from the original database. Because different evaluation indexes often have different dimensions and units, continuing to measure the data in its original units would distort the analysis. To eliminate this effect the data is standardized, so that the evaluation index data is on the same order of magnitude, and the next step is then used for comprehensive evaluation.

In the database, a data table is defined to store the merchants' index information and the cleaned data;

2) performing cluster analysis on the seller's sales indexes to build the promotion effect model:

the indexes that the seller cares about most (exposure, click rate and inquiry amount) are selected as the seller's sales indexes, clustering is performed on these three indexes, and the final clustering result is obtained for analysis, wherein the steps comprise:

2-1) selecting the processed data set and carrying out preliminary statistics: the data processed in the first step is selected from the database, and the maximum value, minimum value, sum and sum of squares of each index are computed. From this data the basic unit of clustering, the tuple, and the statistics of each attribute are established, where the statistics of each index comprise its maximum value, minimum value, average value, sum of squares and variance; the method then proceeds to the next step;

2-2) normalization of tuple data: the data in the original tuples is normalized by the z-score method, using the statistics of each index from the previous step. Suppose there are k samples and that a feature x takes the value x_i for the i-th sample (0 \le i \le k-1). First, the mean of the feature over the training samples is calculated:

\bar{x} = \frac{1}{k} \sum_{i=0}^{k-1} x_i

then the standard deviation of the samples is calculated:

\sigma = \sqrt{\frac{1}{k} \sum_{i=0}^{k-1} (x_i - \bar{x})^2}

and finally the normalized value of the feature is computed:

x'_i = \frac{x_i - \bar{x}}{\sigma}

so that x' has zero mean and unit standard deviation, which the method treats as conforming to a standard normal distribution;

2-3) performing heuristic clustering: the maximum number of classes is set to 9, and the data is clustered and evaluated using the K-means algorithm. In the clustering, the training samples are \{x^{(1)}, \ldots, x^{(m)}\}. To cluster them into k classes, the initial centroids of the k clusters, \mu_1, \mu_2, \ldots, \mu_k, are first chosen (randomly in general; here the first k samples are used). Assume each sample has l attributes attr_n (1 \le n \le l). For each sample x^{(i)}, the class to which it belongs is calculated:

c^{(i)} := \arg\min_{j} (A + B) ... (1)

wherein

A = \sqrt{\sum_{n=1}^{l} (x^{(i)}.attr_n - \mu_j.attr_n)^2}

B = \frac{\sum_{n=1}^{l} x^{(i)}.attr_n \cdot \mu_j.attr_n}{\sqrt{\sum_{n=1}^{l} (x^{(i)}.attr_n)^2} \cdot \sqrt{\sum_{n=1}^{l} (\mu_j.attr_n)^2}}

For each class j, its centroid is recalculated:

\mu_j := \frac{\sum_{i=1}^{m} 1\{c^{(i)} = j\} x^{(i)}}{\sum_{i=1}^{m} 1\{c^{(i)} = j\}} ... (2)

Processes (1) and (2) are repeated until the change between two successive iterations is less than 1, at which point the clustering process ends. The distribution of the result is then checked: if the amount of data in some class is 0, the data is considered unable to be divided into that many classes, so the number of classes is reduced and clustering is repeated until every class contains more than 0 items; the classes obtained at that point are taken as the final classes of the clustering;

3) performing interactive correction on the clustering evaluation model obtained in step 2): after the previous step is finished, a cluster evaluation model is formed with default parameters. By default, a cluster is considered to have a poor promotion effect when the normalized values of exposure and click rate in its centroid are both less than zero, or when the normalized value of inquiry amount is less than zero. The user can reselect which categories of the model have a good promotion effect (label 1); the remaining categories are considered to have a poor promotion effect (label 0). Guided by the visualization, namely the per-index averages shown on the left and the scatter diagram shown on the right, the user selects, according to the needs of the environment, the categories considered to have a good effect, thereby helping to correct the model parameters. The method then corrects the model according to the user's selection and uses the corrected model in the subsequent steps;

4) visualizing the parameters of the clustering evaluation model: when the above steps are finished (before any user interaction), the clustering result is rendered as a chart and presented to the user as a three-dimensional scatter diagram; after the user makes an interactive selection in the previous step, the chart is refreshed to present the result of the interaction.

5) Establishing an optimization model according to the clustering result: the value-added services purchased by the user, the average number of levels in the product catalogue, the number of products, the keyword richness, the total score of the featured products and the completeness of product attributes are used as classification indexes, the clustering result is used as the classification result, and a CART decision tree is generated through training and used as the optimization model.

Generating the CART decision tree: the Gini index is used to select the optimal feature and to determine the optimal binary split point of that feature. Given a sample set D, its Gini index is

Gini(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2

where C_k is the subset of samples in D that belong to the k-th class, and K is the number of classes. Starting from the root node, the following operations are performed recursively on each node according to the training data set to construct a binary decision tree:

(1) Let the training data set of the node be D, and calculate the Gini index of each existing feature with respect to the data set. For each feature A and each value a that it may take, D is divided into two parts D_1 and D_2 according to whether the test A = a on a sample point is yes or no, and the Gini index for A = a is calculated using the formula

Gini(D, A) = \frac{|D_1|}{|D|} Gini(D_1) + \frac{|D_2|}{|D|} Gini(D_2)

(2) Among all possible features A and all their possible split points a, the feature with the minimum Gini index and its corresponding split point are selected as the optimal feature and the optimal split point. Two child nodes are generated from the current node according to them, and the training data set is distributed to the two child nodes according to the feature.

(3) Steps (1) and (2) are called recursively on the two child nodes until a stop condition is met. In this method, the stop condition is that the number of samples in the node is less than 2, or the samples all belong to the same class, or no features remain.

Evaluating the promotion effect of the merchant: according to the merchant's exposure, click rate and inquiry amount, the promotion effect is evaluated using the evaluation model generated by clustering.

6) If the evaluation result is a poor promotion effect, the optimization model is used: the path along which the merchant is classified in the decision tree is found from the merchant's index values, which identifies the problematic indexes, and optimization suggestions are given for those indexes.
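The Gini-based split selection described in step 5) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the feature names `gold_member` and `keyword_richness` and the toy rows are hypothetical, and the split test is the equality test A = a described in the text.

```python
from collections import Counter

def gini(labels):
    """Gini(D) = 1 - sum_k (|C_k| / |D|)^2 for a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(rows, labels, feature, value):
    """Weighted Gini index of splitting D on the test `feature == value`."""
    d1 = [y for row, y in zip(rows, labels) if row[feature] == value]
    d2 = [y for row, y in zip(rows, labels) if row[feature] != value]
    n = len(labels)
    return len(d1) / n * gini(d1) + len(d2) / n * gini(d2)

def best_split(rows, labels):
    """Return the (feature, value) pair with the smallest split Gini index."""
    candidates = sorted({(f, row[f]) for row in rows for f in row})
    return min(candidates, key=lambda fv: gini_split(rows, labels, *fv))

# Hypothetical toy data: label 1 = good promotion effect, 0 = poor.
rows = [
    {"gold_member": 1, "keyword_richness": "high"},
    {"gold_member": 1, "keyword_richness": "low"},
    {"gold_member": 0, "keyword_richness": "high"},
    {"gold_member": 0, "keyword_richness": "low"},
]
labels = [1, 1, 0, 0]
feature, value = best_split(rows, labels)
```

In this toy data the value-added service separates the two classes perfectly, so it is chosen as the root split, while the keyword feature leaves the Gini index unchanged.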

In the step 1), the necessary conditions of the seller promotion effect evaluation model include:

1-1) in the original data source, the data to be processed must contain matching seller information, seller service information and seller promotion data for each month, and all of these must be complete at the same time;

1-2) the core complete data must be large enough, must not be biased toward particular industries, time periods or the like, and must contain basic service and seller information to support the subsequent statistical and quantification operations;

1-3) the data is preliminarily divided by industry, and training data is generated according to the user's selection; assuming the number of clustered categories is K, the maximum value of K must be given.

In the step 2-1), basic statistical methods are used to compute the maximum value, minimum value, average value and variance of the seller promotion effect indexes, and the validity of the statistics is checked.
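The per-index statistics of step 2-1) can be accumulated in one pass, as in the sketch below. The function name `index_statistics` is hypothetical, and the use of the population variance (dividing by the number of samples) is an assumption, since the text does not say which variance is meant.

```python
def index_statistics(values):
    """Return max, min, sum, sum of squares, mean and variance of one index."""
    n = len(values)
    if n == 0:
        raise ValueError("no data for this index")  # simple validity check
    total = sum(values)
    square_sum = sum(v * v for v in values)
    mean = total / n
    # Population variance from the running sums: E[x^2] - (E[x])^2.
    # Clamped at zero to absorb floating-point rounding.
    variance = max(square_sum / n - mean * mean, 0.0)
    return {
        "max": max(values),
        "min": min(values),
        "sum": total,
        "square_sum": square_sum,
        "mean": mean,
        "variance": variance,
    }

# Hypothetical exposure counts for eight sellers.
stats = index_statistics([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
```

Keeping the sum and the sum of squares means the mean and variance can be derived later without rescanning the data.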

In the step 2-2), the seller promotion effect indexes are normalized by the z-score method; if the standard deviation is 0, the normalized result is set to 0.
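A minimal sketch of this normalization, including the zero-standard-deviation case, is given below. The function name `z_score` is hypothetical and the population standard deviation is an assumption.

```python
import math

def z_score(values):
    """Normalize one index to zero mean and unit standard deviation."""
    k = len(values)
    if k == 0:
        return []
    mean = sum(values) / k
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / k)
    if std == 0:
        # All values are identical: treat every normalized value as 0.
        return [0.0] * k
    return [(v - mean) / std for v in values]

normalized = z_score([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
constant = z_score([3.0, 3.0, 3.0])
```

After this step a negative value simply means "below the average seller on this index", which is what the default good/poor rule for cluster centroids relies on.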

In the step 5), the number of categories as classification results must be 2, namely the two categories of good popularization effect and poor popularization effect; in the training termination condition of the CART tree, the minimum number of records of the node is set to 0.

In the step 6), within the seller promotion effect evaluation model formed by clustering, the category to which a seller belongs is the category whose centroid has the minimum defined distance from the click rate, exposure and inquiry amount of the seller to be evaluated.
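The assignment of a seller to the nearest centroid can be sketched as follows. The defined distance is taken to be the measure given in the description (Euclidean distance plus cosine similarity); the centroid values, the labels and the function names are hypothetical, and treating the cosine term as 0 for a zero vector is an assumption.

```python
import math

def defined_distance(x, mu):
    """Euclidean distance plus cosine similarity, as the text defines it."""
    euclid = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, mu)))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in mu))
    cosine = sum(p * q for p, q in zip(x, mu)) / norm if norm > 0 else 0.0
    return euclid + cosine

def evaluate_seller(seller, centroids, labels):
    """Return (cluster index, label) of the centroid nearest to the seller."""
    nearest = min(range(len(centroids)),
                  key=lambda j: defined_distance(seller, centroids[j]))
    return nearest, labels[nearest]

# Normalized (exposure, click rate, inquiry amount); values are illustrative.
centroids = [(1.05, 0.95, 1.0), (-1.05, -0.95, -1.0)]
labels = [1, 0]  # 1 = good promotion effect, 0 = poor
cluster, label = evaluate_seller((0.9, 1.2, 0.8), centroids, labels)
```

The seller's values must be normalized with the same mean and standard deviation that were used when the model was built.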

In the step 7), the path along which the merchant is classified in the decision tree is found from the merchant's index values, and each decision node on the path is analyzed to give an optimization strategy. If the index at a decision node is a value-added service and the path follows the branch below the threshold, purchasing that service is suggested; if the index at a decision node is one of the merchant's product description indexes, enriching the number of products, the product attributes, the product keywords and so on is suggested.
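The path walk and the two kinds of suggestion can be sketched as below. The tree is represented as nested dictionaries, which is an assumption made for illustration; the index names, thresholds and the rule that a description index only triggers a suggestion when the seller falls below the threshold are likewise hypothetical.

```python
# Hypothetical set of indexes that represent value-added services.
VALUE_ADDED = {"gold_member"}

def suggest(tree, seller):
    """Walk the seller's path through the tree and collect suggestions."""
    tips = []
    node = tree
    while "label" not in node:
        feature, threshold = node["feature"], node["threshold"]
        below = seller[feature] < threshold
        if below and feature in VALUE_ADDED:
            tips.append(f"consider purchasing the service '{feature}'")
        elif below:
            tips.append(f"enrich '{feature}' (value {seller[feature]}, "
                        f"threshold {threshold})")
        node = node["left"] if below else node["right"]
    return node["label"], tips

# Toy tree: internal nodes test `feature < threshold`, leaves hold the label.
tree = {
    "feature": "gold_member", "threshold": 0.5,
    "left": {
        "feature": "product_count", "threshold": 20,
        "left": {"label": 0},
        "right": {"label": 1},
    },
    "right": {"label": 1},
}
label, tips = suggest(tree, {"gold_member": 0, "product_count": 8})
```

For this seller the walk ends in a "poor" leaf and yields one suggestion per decision node on the path.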

In the method, the original data is normalized by the z-score method: the original seller promotion effect indexes are scaled proportionally so that they fall into a small specific range, and the normalized data is stored back in the original tuple, centered on zero with unit standard deviation.

The method forms a B2B seller promotion effect evaluation model based on K-Means clustering. Starting from the indexes the seller cares about (exposure, click rate and inquiry amount), it normalizes the data by the z-score method so that the seller promotion effect indexes can be compared on the same order of magnitude, considers both the geometric distance and the similarity between tuples to measure the distance relationship between them, and forms a seller promotion effect evaluation mechanism using the K-Means clustering idea.

The method uses a CART decision tree to generate the optimization model. The value-added services purchased, the average number of levels in the product catalogue, the number of products, the keyword richness, the total score of the featured products and the completeness of product attributes in the sample data are used as classification feature indexes, the clustering result is used as the classification result, and a CART decision tree is trained and used as the optimization model. For merchants whose evaluation result is a poor promotion effect, the optimization model is used to find the problematic indexes that affect the promotion effect and to make suggestions.

Beneficial effects: to obtain an optimization strategy, the method must determine whether and how each feature index influences the promotion effect. A CART decision tree algorithm is used to generate a tree structure in which each merchant's data follows a classification path determined by its index values. The decision nodes on that path are the indexes that influence the promotion effect, and optimization suggestions can be given according to the indexes and thresholds of those nodes. The method establishes a complete seller effect evaluation mechanism and adjusts the promotion result based on actual user feedback, which makes the system highly practical; the seller promotion result is displayed visually, combining the promotion result with the buyer indexes.

Drawings

FIG. 1 is a model diagram of the method for evaluating and optimizing the promotion effect of a B2B seller according to the present invention;

FIG. 2 is a flowchart illustrating an embodiment of a method for evaluating and optimizing the seller promotional effectiveness of B2B according to the present invention;

FIG. 3 is a structural diagram of a method for evaluating and optimizing the promotion effect of a B2B seller according to the present invention.

Detailed Description

The present invention will be described in further detail with reference to the accompanying drawings.

1. Selecting, quantizing and preprocessing raw data:

Incompleteness, inconsistency and noise are common characteristics of large real-world databases and data warehouses. The data here includes attributes of interest such as seller data and seller product data, but not all of it is useful, so it needs to be cleaned, integrated, reduced and transformed in light of the characteristics of the website. Cleaning improves the completeness and consistency of the data by filling in default values, smoothing data, removing invalid data and so on, so that the input data is valid. The target industries and the useful feature information are then extracted from the original database. Because different evaluation indexes often have different dimensions and units, continuing to measure the data in its original units would distort the analysis. To eliminate this effect the data is standardized, so that the evaluation index data is on the same order of magnitude, and the next step is then used for comprehensive evaluation. In the database, a data table is defined to store the merchants' index information and the cleaned data.

The specific table structure is as follows:

(1) seller effect dictionary table structure

(2) Seller service quantization table

2. Constructing a seller promotion effect evaluation model:

The standard for evaluating the seller's promotion effect takes the three indexes that sellers care about most (exposure, click rate and inquiry amount) as evaluation indexes. With these three quantities as the basis, their strengths and weaknesses are considered together. The evaluation criterion is that sellers with good and poor promotion effects should be as far apart as possible, while sellers with similar promotion effects should be as close together as possible, so that sellers whose effects differ little are clustered together as far as possible. The final clustering result is obtained and displayed to the user, and the system provides a default result.

The B2B seller effect evaluation selects the three important indexes the seller cares about (exposure, click rate and inquiry amount). First, the seller's exposure, click rate and inquiry amount are tallied; the maximum value, minimum value, sum of squares and average value of each index variable are computed, and the validity of the statistics is checked. The data is then normalized by the z-score method. The system sets the maximum number of classes to 9, that is, K = 9, and the normalized data undergoes initial heuristic clustering by the K-Means algorithm to obtain the optimal number of classes, as shown in FIG. 3. The model is built in two stages:

a) seller data preprocessing for selecting specific industries

Data is selected from the database according to the industry list entered by the user, and is stored in database tables by first-level, second-level and third-level industry. A list of tuples (attr1, attr2, attr3, attr4, attr5, attr6, time, origin_attr1, origin_attr2, origin_attr3, number, flag) is then built from these sets, where origin_attr1, origin_attr2 and origin_attr3 hold the original values of attr1, attr2 and attr3, which are themselves overwritten by the normalized data. In the tuple list, statistics are computed for each attribute separately, and the original data values origin_attr1, origin_attr2 and origin_attr3 are retained.

The following 3 conditions must be satisfied in the seller promotion effect evaluation model:

(1) in the original data source, the data to be processed must contain matching seller information, seller service information and seller promotion data for each month, and all of these must be complete at the same time;

(2) the core complete data must be large enough, must not be biased toward particular industries, time periods or the like, and must contain basic service and seller information to support the subsequent statistical and quantification operations;

(3) the data is preliminarily divided by industry, and training data is generated according to the user's selection; the maximum value of K must be given.

b) Establishing seller promotion effect evaluation model

First, the data for generating the model is selected from the database. Using the statistics from the previous step, the first K tuples in the tuple list are taken by default as the centroids of the K classes. Each tuple in the list is then traversed and its distance to the centroids is calculated. For two sample instances x^{(i)}, x^{(j)}, assume each sample has l attributes attr_n (1 \le n \le l); the distance measurement formula is:

\sqrt{\sum_{n=1}^{l} (x^{(i)}.attr_n - x^{(j)}.attr_n)^2} + \frac{\sum_{n=1}^{l} x^{(i)}.attr_n \cdot x^{(j)}.attr_n}{\sqrt{\sum_{n=1}^{l} (x^{(i)}.attr_n)^2} \cdot \sqrt{\sum_{n=1}^{l} (x^{(j)}.attr_n)^2}}

The calculation is iterated over the data in the tuple list, and terminates when the sum of the distances from the tuples in each cluster to its centroid changes by less than 1.
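The model-building stage can be sketched as follows: the first K tuples seed the centroids, tuples are assigned by the distance measure above, and iteration stops when the total distance changes by less than 1. The outer loop that lowers K when a class ends up empty follows the heuristic clustering described earlier in the method. All function names, the iteration cap and the handling of zero vectors are assumptions made for illustration; the distance is implemented exactly as written in the text, that is, Euclidean distance plus cosine similarity.

```python
import math

def defined_distance(x, mu):
    """Euclidean distance plus cosine similarity, as written in the text."""
    euclid = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, mu)))
    norm = math.sqrt(sum(p * p for p in x)) * math.sqrt(sum(q * q for q in mu))
    cosine = sum(p * q for p, q in zip(x, mu)) / norm if norm > 0 else 0.0
    return euclid + cosine

def k_means(samples, k, max_iter=100, tol=1.0):
    """Cluster samples into k classes; the first k samples seed the centroids."""
    centroids = list(samples[:k])
    assignment, prev_cost = [], None
    for _ in range(max_iter):
        # Assign every sample to the centroid with the smallest distance.
        assignment = [
            min(range(k), key=lambda j: defined_distance(x, centroids[j]))
            for x in samples
        ]
        # Recompute each centroid as the mean of its members.
        for j in range(k):
            members = [x for x, c in zip(samples, assignment) if c == j]
            if members:
                centroids[j] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
        # Stop when the total distance to the centroids changes by less than tol.
        cost = sum(defined_distance(x, centroids[c])
                   for x, c in zip(samples, assignment))
        if prev_cost is not None and abs(prev_cost - cost) < tol:
            break
        prev_cost = cost
    return centroids, assignment

def heuristic_clustering(samples, max_k=9):
    """Start from max_k classes and lower k until no class is empty."""
    for k in range(min(max_k, len(samples)), 0, -1):
        centroids, assignment = k_means(samples, k)
        if all(assignment.count(j) > 0 for j in range(k)):
            return centroids, assignment
    raise ValueError("no samples to cluster")

# Normalized (exposure, click rate, inquiry amount) tuples; values are illustrative.
samples = [(-1.0, -1.0, -1.0), (-1.1, -0.9, -1.0),
           (1.0, 1.0, 1.0), (1.1, 0.9, 1.0)]
centroids, assignment = heuristic_clustering(samples, max_k=2)
```

Even though both initial centroids come from the low-performing group in this example, the second iteration separates the two groups.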

3. Seller promotion effect evaluation model correction

After the system forms the seller promotion effect evaluation model, a default result is obtained. As the environment and the user's requirements change, this default may not fit the data or the user's detailed requirements well. The user can therefore make corrections based on the average effect of each category in the model and the distribution of the categories in space, reselecting the categories considered to have a good seller promotion effect, and the system corrects the model again according to the categories the user selects.

4. The visualized clustering evaluation model assists in correcting parameters: when the above steps are finished (before any user interaction), the clustering result is rendered as a chart and presented to the user as a three-dimensional scatter diagram; after the user makes an interactive selection in the previous step, the chart is refreshed to present the result of the interaction.

5. Establishing an optimization model according to the clustering result: the value-added services purchased by the user, the average number of levels in the product catalogue, the number of products, the keyword richness, the total score of the featured products and the completeness of product attributes are used as classification indexes. With all current sample data there are 42 quantified types of value-added service, giving 47 classification indexes in total. The clustering result is used as the classification result, and a Weka function is called to train a CART decision tree. Training terminates when the Gini index no longer changes or when one of the child nodes contains a single record, that is, classification continues until the existing feature indexes can no longer distinguish the sample classes at a node. The model can be output as a character string, and is also stored in the global variable Treeresult.
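The default labelling of clusters and the user's correction can be sketched as below. The default rule is the one stated in the method (a cluster is poor when the normalized exposure and click rate of its centroid are both negative, or when its normalized inquiry amount is negative); the dictionary keys, function names and centroid values are hypothetical.

```python
def default_label(centroid):
    """Return 1 (good promotion effect) or 0 (poor) by the default rule."""
    poor = ((centroid["exposure"] < 0 and centroid["clicks"] < 0)
            or centroid["inquiries"] < 0)
    return 0 if poor else 1

def label_clusters(centroids, overrides=None):
    """Apply the default rule, then any labels the user reselected."""
    labels = [default_label(c) for c in centroids]
    for index, label in (overrides or {}).items():
        labels[index] = label
    return labels

# Normalized centroid values; illustrative only.
centroids = [
    {"exposure": -0.4, "clicks": -0.2, "inquiries": 0.1},
    {"exposure": 0.8, "clicks": 0.5, "inquiries": 0.9},
    {"exposure": 0.3, "clicks": -0.1, "inquiries": -0.2},
]
default_labels = label_clusters(centroids)
corrected_labels = label_clusters(centroids, overrides={2: 1})
```

Here the user has decided that the third cluster, poor by default because of its inquiry amount, counts as good in their environment.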

6. Evaluating the promotion effect of the merchant and proposing optimization suggestions:

a) According to the merchant's exposure, click rate and inquiry amount, the promotion effect of the merchant is evaluated using the evaluation model generated by clustering.

b) If the evaluation result is a poor promotion effect, an optimization strategy is obtained: the values of all of the merchant's evaluation indexes are queried and compared, level by level, with the conditions of the decision points in the CART decision tree to find the path along which the merchant is classified, and each decision node on the path is analyzed to give an optimization strategy. If the index at a decision node is a value-added service and the path follows the branch below the threshold, purchasing that service is suggested; if the index at a decision node is one of the merchant's product description indexes, enriching the number of products, the product attributes, the product keywords and so on is suggested.

The present invention may be embodied in other specific forms without departing from the spirit or essential attributes thereof, and it is intended that all such changes and modifications as would be within the spirit and scope of the appended claims be considered as included herein.

Claims

A method for evaluating and optimizing the effect of a B2B seller, characterized by comprising the following steps:

1) selecting and preprocessing raw data:

selecting and preprocessing seller data information and seller commodity information data, namely, re-cleaning, integrating, stipulating and transforming the data; the completeness and consistency of data are increased by filling default values, smoothing data, removing invalid data and the like in the cleaning process; making the input data efficient; extracting target industry and useful characteristic information from an original database; carrying out standardization processing on the data to enable the obtained evaluation index data to be in the same order of magnitude, and then carrying out comprehensive evaluation by using the next step; in an original database, defining a data table to store index information of merchants and the cleaned data;

2) carrying out cluster analysis on the sales indexes of the sellers to establish a model:

the method comprises the following steps that important indexes concerned by a seller are selected by the sales indexes of the seller, including exposure, click rate and inquiry amount, clustering calculation is carried out according to the three amounts, and the final clustering result is obtained for analysis, wherein the steps comprise:

2-1) selecting the processed data set and carrying out preliminary statistics: the data processed in step 1) are selected from the database, and the maximum value, minimum value, sum and sum of squares of each index are computed; from these data, the basic unit tuples for clustering and the statistics of each attribute are established, the statistics of each index comprising its maximum value, minimum value, average value, sum of squares and variance; the next step is then carried out;

2-2) normalization of tuple data: the data in the original tuples are normalized with the z-score method, using the statistics of each index from the previous step, so that each index has zero mean and unit standard deviation;

2-3) performing heuristic clustering: the maximum number of classes is set to 9, and the data are clustered with the K-means algorithm; the distribution of the result is checked, and if any class in the result contains no data, the data are considered not to divide into that many classes, the number of classes is reduced and clustering is repeated until every class contains at least one data item; the classes obtained at that point are taken as the final classes of the clustering;
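By way of non-limiting illustration, steps 2-2) and 2-3) can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of the invention: the K-means routine is a plain Lloyd iteration with random initial centroids, the function names and the `seed` parameter are hypothetical, and an index whose standard deviation is 0 is mapped to 0, following the z-score rule stated in the dependent claim on step 2-2).

```python
import math
import random


def zscore_columns(rows):
    """Standardize each column; a column with zero standard deviation becomes all 0."""
    out_cols = []
    for col in zip(*rows):
        mean = sum(col) / len(col)
        std = math.sqrt(sum((v - mean) ** 2 for v in col) / len(col))
        out_cols.append([0.0 if std == 0 else (v - mean) / std for v in col])
    return [list(r) for r in zip(*out_cols)]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's K-means; returns (centroids, assignment)."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    assign = None
    for _ in range(iters):
        new_assign = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_assign == assign:        # converged: assignments no longer change
            break
        assign = new_assign
        for c in range(k):
            members = [p for p, a in zip(points, assign) if a == c]
            if members:                 # an empty class keeps its old centroid
                centroids[c] = [sum(v) / len(members) for v in zip(*members)]
    return centroids, assign


def heuristic_cluster(points, max_k=9):
    """Start at max_k classes and reduce the count until no class is empty."""
    for k in range(min(max_k, len(points)), 0, -1):
        centroids, assign = kmeans(points, k)
        if all(assign.count(c) > 0 for c in range(k)):
            return centroids, assign
    raise ValueError("no points to cluster")


# Hypothetical (exposure, clicks, inquiries) records for four sellers.
raw = [[10, 1, 0], [10, 1, 0], [900, 40, 7], [900, 40, 7]]
points = zscore_columns(raw)
centroids, assign = heuristic_cluster(points)
```

Because identical records always fall into the same class, the example data can never fill more than two classes, so the loop necessarily reduces the class count below the initial value.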

3) performing interactive correction on the evaluation model formed by clustering the data: after the previous step is finished, the method forms a default result, namely that the promotion effect is considered poor when, in the centroid of a cluster, the processed values of the exposure and click rate indexes are both less than zero, or the processed value of the inquiry amount index is less than zero; the user can relabel each class, with label 1 meaning that sellers in the class have a good promotion effect and label 0 meaning that they have a poor promotion effect; the user selects the result classes considered to have a good effect according to the requirements of the environment, that is, with reference to the index averages shown on the left and the scatter diagram shown on the right, and can thereby assist in correcting the model parameters; the model is then corrected according to the user's selection, and interpretation proceeds according to the subsequent steps;

4) visualizing the clustering results: when the above steps are completed without user interaction, a clustering chart is formed and presented to the user as a three-dimensional scatter diagram; after the user's interactive selection in the previous step, the chart is refreshed to present the result of the user interaction;

5) establishing an optimization model according to the clustering result: the value-added services purchased by the user, the average number of product catalog levels, the number of products, the keyword richness, the total score of featured products and the completeness of product attributes are used as classification indexes, and the clustering result is used as the classification result, to train a CART decision tree, which serves as the optimization model;

6) evaluating the promotion effect of the merchant: the promotion effect of the merchant is evaluated with the evaluation model generated by clustering, according to the merchant's exposure, click rate and inquiry amount;

7) if the evaluation result is that the promotion effect is poor, the optimization model is used: the path along which the merchant is classified in the decision tree is found according to the merchant's index values, that is, the problematic indexes are found, and optimization suggestions are given for those indexes.
2. The method for evaluating and optimizing the promotion effect of a B2B seller according to claim 1, wherein preprocessing the data in step 1) comprises the following steps:

1-1) in the original data source, the data to be processed must have matching seller information, seller service information and seller promotion data for each month, and all of these must be complete;

1-2) the core complete data set is sufficiently large, is free of bias toward any industry, time period or the like, and must contain basic service and seller information to support the subsequent statistical and quantitative operations;

1-3) the data are divided by industry, and training data are generated directly according to the user's selection; the maximum value of K must be given, and the method automatically reduces the number of classes to a suitable value according to the data.
3. The method for evaluating and optimizing the promotion effect of a B2B seller according to claim 1, wherein in step 2-1), basic mathematical statistics are used to compute the maximum value, minimum value, average value and variance of the seller's promotion effect indexes, and the validity of the statistics is checked.
4. The method for evaluating and optimizing the promotion effect of a B2B seller according to claim 1, wherein in step 2-2), the seller's promotion effect indexes are normalized with the z-score method, and when the standard deviation is 0, the normalized result is set to 0.
5. The method for obtaining an optimization strategy for the promotion effect according to claim 1, wherein in step 5), indexes such as the value-added services purchased by merchants and the commodity description conditions are used as input and the clustering result is used as output to train a CART decision tree, which serves as the optimization model; in step 7), the path along which the merchant is classified in the decision tree is found according to the merchant's index values, and each decision node on the path is analyzed to give an optimization strategy; if the index at a decision node is a value-added service and the path follows the branch below the threshold, purchasing that service is suggested; and if the index at a decision node is one of the merchant's product description indexes, enriching the product quantity, product attributes and product keywords is suggested.
6. The method for evaluating the promotion effect of the seller according to claim 1, wherein in step 6), the distances between the seller's click quantity, exposure and inquiry amount and each centroid in the model are computed, and the class of the centroid at the minimum distance is taken as the class of the seller.
7. The method for evaluating the promotion effect of the seller according to claim 1, wherein in step 5), an optimization model is established according to the clustering result: the value-added services purchased by the user, the average number of product catalog levels, the number of products, the keyword richness, the total score of featured products and the completeness of product attributes are used as classification indexes, and the clustering result is used as the classification result, to train a CART decision tree, which serves as the optimization model;

generating the CART decision tree: the optimal feature is selected using the Gini index, and the optimal binary segmentation point of that feature is determined; given a sample set D, the Gini index is $\mathrm{Gini}(D) = 1 - \sum_{k=1}^{K} \left( \frac{|C_k|}{|D|} \right)^2$, where $C_k$ is the subset of samples in D belonging to the k-th class and K is the number of classes; starting from the root node, the following operations are carried out recursively on each node according to the training data set to construct a binary decision tree:

(1) letting the training data set of the node be D, and calculating the Gini index of each existing feature with respect to the data set; for each feature A and each value a that it may take, D is divided into two parts $D_1$ and $D_2$ according to whether the test A = a on a sample point is answered yes or no, and the Gini index for A = a is calculated with the formula $\mathrm{Gini}(D, A) = \frac{|D_1|}{|D|}\mathrm{Gini}(D_1) + \frac{|D_2|}{|D|}\mathrm{Gini}(D_2)$;

(2) selecting the feature with the minimum Gini index and the corresponding segmentation point as the optimal feature and the optimal segmentation point from all the possible features A and all the possible segmentation points a thereof; generating two sub-nodes from the current node according to the optimal characteristics and the optimal segmentation points, and distributing the training data set to the two sub-nodes according to the characteristics;

(3) recursively calling (1) and (2) for the two sub-nodes until a stop condition is met; in the method, the stop condition is that the number of samples in the node is less than 2, or the sample set belongs to the same class, or no more features exist;
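By way of non-limiting illustration, the Gini computation of operation (1) and the selection of operation (2) can be sketched as follows. The sketch uses the equality test A = a described in operation (1); the function names, the feature names and the sample data are hypothetical, and the recursion and stop conditions of operation (3) are omitted.

```python
from collections import Counter


def gini(labels):
    """Gini(D) = 1 - sum over classes k of (|C_k| / |D|)^2."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_split(rows, labels, feature, value):
    """Weighted Gini index of dividing D by the test row[feature] == value."""
    yes = [y for r, y in zip(rows, labels) if r[feature] == value]
    no = [y for r, y in zip(rows, labels) if r[feature] != value]
    n = len(labels)
    return len(yes) / n * gini(yes) + len(no) / n * gini(no)


def best_split(rows, labels):
    """Return (gini, feature, value) for the candidate with the smallest Gini index."""
    candidates = []
    for feature in rows[0]:
        for value in sorted({r[feature] for r in rows}):
            candidates.append((gini_split(rows, labels, feature, value), feature, value))
    return min(candidates)


# Hypothetical merchants: two binary indexes and a good (1) / poor (0) label.
rows = [
    {"service": 0, "many_products": 0},
    {"service": 0, "many_products": 1},
    {"service": 1, "many_products": 0},
    {"service": 1, "many_products": 1},
]
labels = [0, 0, 1, 1]
root_choice = best_split(rows, labels)
```

In the example data the label coincides with the `service` index, so splitting on that index yields two pure sub-nodes and a weighted Gini index of 0, whereas splitting on `many_products` leaves both sub-nodes mixed.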

evaluating the promotion effect of the merchant: according to the merchant's exposure, click rate and inquiry amount, the promotion effect of the merchant is evaluated with the evaluation model generated by clustering.
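By way of non-limiting illustration, evaluation with the clustering model can be sketched as a nearest-centroid lookup combined with the default labelling rule of step 3) of claim 1. The Euclidean distance is an assumption, since the claims speak only of a defined distance; the function names and the centroid values are hypothetical, and the seller vector is assumed to be standardized in the same way as the training data.

```python
import math


def default_label(centroid):
    """Default rule: 0 (poor) if exposure and clicks are both below zero,
    or if inquiries are below zero; otherwise 1 (good)."""
    exposure, clicks, inquiries = centroid
    poor = (exposure < 0 and clicks < 0) or inquiries < 0
    return 0 if poor else 1


def evaluate(seller, centroids, labels=None):
    """Assign the seller to the nearest centroid and return (class index, label).

    `labels` lets a user override the default label of each class.
    """
    if labels is None:
        labels = [default_label(c) for c in centroids]
    idx = min(range(len(centroids)), key=lambda i: math.dist(seller, centroids[i]))
    return idx, labels[idx]


# Hypothetical standardized centroids: (exposure, clicks, inquiries).
centroids = [(-0.8, -0.6, -0.7), (0.9, 0.7, 0.8)]
weak_seller = evaluate((-0.5, -0.4, -0.2), centroids)
strong_seller = evaluate((1.0, 0.5, 0.9), centroids)
relabelled = evaluate((-0.5, -0.4, -0.2), centroids, labels=[1, 1])
```

Passing `labels` explicitly models the interactive correction, in which the user's choice of good classes replaces the default labels without changing the centroids.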