CN106339502A - Modeling recommendation method based on user behavior data fragmentation cluster - Google Patents
- Publication number
- CN106339502A, CN201610828355.9A
- Authority
- CN
- China
- Prior art keywords
- interest
- user
- point
- behavioral data
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q30/00—Commerce
- G06Q30/06—Buying, selling or leasing transactions
- G06Q30/0601—Electronic shopping [e-shopping]
- G06Q30/0631—Item recommendations
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Business, Economics & Management (AREA)
- General Physics & Mathematics (AREA)
- Accounting & Taxation (AREA)
- Finance (AREA)
- General Engineering & Computer Science (AREA)
- Probability & Statistics with Applications (AREA)
- Bioinformatics & Computational Biology (AREA)
- Strategic Management (AREA)
- Marketing (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Business, Economics & Management (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Economics (AREA)
- Development Economics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to Internet personalized recommendation technology, and in particular to a modeling recommendation method based on fragmentation and clustering of user behavior data. The method fragments and clusters user behavior data and builds a dynamic user interest model on which personalized recommendation is based. Existing personalized recommendation methods consider only the time-varying nature of user interest, whereas this method also mines multi-dimensional, discrete interest points from the behavior data, so that the user interest model is characterized more accurately. For the multi-dimensional, discrete interest topics of a target user, the method produces preliminary recommendations for the user's interest points concurrently, then weights the predicted score of each preliminary recommendation by the weight and memory degree of its interest point to produce the final recommendation, improving both the accuracy and the processing performance of personalized recommendation.
Description
Technical field
The present invention relates to Internet technology, and in particular to a modeling recommendation method based on fragmentation and clustering of user behavior data.
Background
With the growth of Internet applications, information overload has become an increasingly prominent problem. Finding the information one is interested in among massive amounts of information is extremely difficult for users. Personalized recommendation technology analyzes large volumes of user behavior data to mine users' interest preferences and potential needs, and a personalized recommendation system then recommends the services, goods, or content the user is likely to be interested in. Personalized recommendation is now widely used in e-commerce, social networks, location-based services, search services, advertising services, and other fields. The best-known examples are e-commerce platforms such as Amazon, Taobao, and JD.com, whose recommendation systems increase sales by roughly 20% to 30% and bring substantial profit. Search engines are an information retrieval tool that people use in daily life, and a user's search activity also reveals the content topics the user is interested in. Feeding the interest topics and access behavior captured by a search engine into the recommendation system allows the user interest model to be characterized more accurately.
Analysis of user behavior data on e-commerce platforms shows that a user's interests are not fixed. They usually change with time and place, and exhibit dynamic characteristics such as time variation, multi-dimensionality, and discreteness. For example, when traveling, a user is interested in local transport, hotels, dining, and local customs; at work, the user is interested in information about the trade they work in; during leisure time, the user is interested in entertainment such as film and television, music, news, and sports. These interests also drift over time, which is the dynamic change of interest points, and the drift is often discrete: a user who enjoys adventurous, thrilling recreation and sports when young may prefer gentle, relaxing leisure activities after reaching middle age. Therefore, when inferring interest points from user behavior data, the time variation, multi-dimensionality, and discreteness of user interest must all be taken into account in order to characterize the user's current interest model precisely.
Existing methods for modeling dynamic user interest fall mainly into two groups: those based on a sliding window and those based on a time parameter. Sliding-window methods set a time window of fixed size that moves forward continuously as time passes. Interest mining considers only the data inside the current window; data that falls outside the window is regarded as the user's earlier interest and is ignored. This approach is easy to implement, but the size of the observation window is hard to set because it differs from user to user, and using a sliding window means ignoring user data outside the window, which causes the user's broader interests to be missed. Time-parameter methods come in many variants, the most representative of which are based on the forgetting curve: the user's historical ratings are adjusted by the forgetting curve, that is, assigned corresponding weights, and a rating is discarded once its adjusted value falls below a threshold. Time-parameter methods rest on the assumption that, when building a user interest model, a user's recent behavior data is more important than historical behavior data because it better reflects the user's current interest; the further data lies in the past, the less it reflects the user's current interest.
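The two families of existing methods described above can be contrasted with a short Python sketch. It is only an illustration: the exponential decay form, the window length, the threshold, the sample data, and all function names are assumptions, since the text specifies none of them.

```python
import math

def sliding_window(records, now, window):
    """Keep only records whose age falls inside the fixed-size window."""
    return [(item, rating, t) for item, rating, t in records if now - t <= window]

def forgetting_decay(records, now, threshold):
    """Down-weight each rating by its age and drop it below the threshold.

    ASSUMPTION: exponential decay exp(-age); the text only says "forgetting curve".
    """
    kept = []
    for item, rating, t in records:
        adjusted = rating * math.exp(-(now - t))
        if adjusted >= threshold:
            kept.append((item, adjusted, t))
    return kept

# (item, rating, time in months): made-up data for illustration
records = [("tent", 5.0, 0.0), ("novel", 4.0, 5.0), ("headphones", 3.0, 6.0)]
recent = sliding_window(records, now=6.0, window=2.0)
decayed = forgetting_decay(records, now=6.0, threshold=1.0)
```

Both variants discard the oldest record here, which is the behavior the text criticizes: an interest that is old but still relevant disappears from the model.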
These methods generally suit only cases where interest changes gradually. They are poorly suited to abrupt jumps in user interest, where the span from one interest point to another is large. In particular, on a large comprehensive e-commerce platform with a search engine, user interest can change greatly with time, place, intent, and other factors, and shows a degree of discreteness. A time parameter alone cannot accurately characterize the dynamic change of the user interest model.
In summary, when a user accesses a large comprehensive e-commerce platform, the user's interest points are dynamically time-varying, multi-dimensional, and discrete. Dynamic time variation means that the user's interest topics change over time. For example, a user may be interested in work information during the day and in lifestyle and entertainment in the evening. Multi-dimensionality means that the user has multiple different preferences in different aspects. For example, a user may like several different subjects in study and several different activities in leisure. Discreteness means that the span between a user's interest topics is large. For example, a user's interest in travel and interest in work are discrete from each other. Traditional personalized recommendation methods generally use collaborative filtering, whose principle is to find users whose behavior is similar to the target user's and recommend the items they like to the target user. However, interest topics differ widely between users and are complex, which makes computing interest-topic similarity between users difficult. Suppose users a, b, and c each have a work interest and a life interest. The work-interest similarity of a to b and of a to c is 50% and 10% respectively, and the life-interest similarity is 50% and 85% respectively. Clearly, one cannot simply average these values and conclude that a is more similar to b than to c. Traditional personalized recommendation methods therefore do not fully account for the multi-dimensionality and discreteness of user interest, and cannot adequately solve the problem of recommendation accuracy under dynamically changing user interest. In addition, users on an e-commerce platform often use the platform's search engine to obtain the information they need. If the recommendation system does not cluster the search keywords and the associated browsing data, it is hard to narrow the range of interest points in the user behavior data, which hinders improvement of the system's processing performance and recommendation accuracy.
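The a, b, c example above can be worked through numerically. The following Python sketch uses only the percentages given in the text; the dictionary layout and function name are illustrative assumptions.

```python
# Per-dimension interest similarity of user a to users b and c (values from the text).
similarity = {
    "b": {"work": 0.50, "life": 0.50},
    "c": {"work": 0.10, "life": 0.85},
}

def average_similarity(per_dimension):
    """Collapse all interest dimensions into a single averaged score."""
    return sum(per_dimension.values()) / len(per_dimension)

averaged = {user: average_similarity(dims) for user, dims in similarity.items()}

# Most similar neighbour of a, chosen separately for each interest dimension.
best_per_dimension = {
    dim: max(similarity, key=lambda user: similarity[user][dim])
    for dim in ("work", "life")
}
```

Averaging gives 0.50 for b and 0.475 for c, so a single averaged score ranks b first, yet for the life interest c is by far the closer neighbour. Treating each interest dimension separately keeps that information.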
Summary of the invention
To address the technical limitations of existing personalized recommendation methods in scenarios where user interest is dynamic, the present invention proposes a modeling recommendation method based on fragmentation and clustering of user behavior data.
Technical solution of the invention: a modeling recommendation method based on fragmentation and clustering of user behavior data, characterized by comprising:
A. Customized processing of user behavior data, specifically including:
A1. User behavior data collection. The user behavior data is the data collected by the e-commerce platform when users access it over the Internet, and includes at least data of the login, search, browse, purchase, and review types. Each user behavior record also carries basic attribute information assigned by the e-commerce platform, including at least session ID, user ID, behavior type, behavior content, user IP, login device, and time.
A2. User behavior data fragmentation. The behavior data collected in step A1 is organized by user. Then, taking each of the user's transaction sessions on the e-commerce platform as a unit, the user's behavior data is divided by transaction session so that each resulting behavior data fragment contains only one transaction topic, and the user's behavior data fragments that contain similar topic words are merged. A transaction session is the period that begins when the session is created at user login to the e-commerce platform and ends when the session is destroyed after the user finishes the visit.
B. Building a public user interest model through cluster analysis of user behavior data, specifically including:
B1. After the behavior data of the different users has been fragmented in step A, treat each behavior data fragment as one category and compute the similarity between all categories. Specifically, given two behavior data fragments $u_i$ and $u_j$, the similarity $s(u_i, u_j)$ of their topic word sets is computed according to Formula 1 (the equation itself is missing from this text).
Here $s(u_i, u_j)$ denotes the similarity between behavior data fragments $u_i$ and $u_j$, and $v(u_i)$ and $v(u_j)$ denote the topic word sets of $u_i$ and $u_j$ respectively. When computing the intersection of the topic word sets, two topic words are considered identical only if they are the same word and have the same part of speech.
B2. Merge the two categories with the highest similarity into one category, and take the average similarity of the two categories as the similarity of the new category. Repeat step B2 until the specified number of categories is obtained.
B3. Extract topic words from each category finally obtained in step B2 as interest topics, and build the public user interest model.
C. The e-commerce platform makes recommendations to the user, as follows:
The e-commerce platform analyzes the behavior data of the target user and finds each of the target user's interest points in the public user interest model obtained in step B. It runs collaborative filtering on each interest point separately to produce preliminary recommendations, and then produces the final recommendation by combining the weight of each interest point, its memory degree, and the predicted score of its preliminary recommendation result. Let $\lambda_i$ be the weight of the $i$-th interest point within the target user's interests. To compute it, let $s_i$ be the target user's $i$-th interest point and $len(s_i)$ the number of the target user's behavior records contained in $s_i$; the weight $\lambda_i$ is then computed according to Formula 2 (the equation itself is missing from this text).
Following the way users forget interest points, a forgetting function $h(t)$ is applied to the interest point weight $\lambda_i$. Let $t$ be the time interval from the last behavior record in a given interest point to the time of recommendation. The memory degree of the interest point is then given by Formula 3:
$$h(t) = e^{-t} \quad \text{(Formula 3)}$$
where $t$ is measured in months. When the last behavior record occurs at the time of recommendation, the interval is 0 and $h(0) = 1$, meaning the user has not yet begun to forget this interest. Finally, the preliminary recommendation results of the user's interest points are weighted and sorted to obtain the sorted interest point list $P$. Suppose the target user has $n$ interest points and the predicted score of the recommendation result for the $i$-th interest point is $p_i$; then $P$ is given by Formula 4:
$$P = \mathrm{sort}\big(p_1 \lambda_1 h(t_1),\; p_2 \lambda_2 h(t_2),\; p_3 \lambda_3 h(t_3),\; \dots,\; p_i \lambda_i h(t_i),\; \dots,\; p_n \lambda_n h(t_n)\big) \quad \text{(Formula 4)}$$
Here the sort() function weights the predicted score of each interest point's preliminary recommendation result by the interest weight and memory degree and sorts the resulting values. $p_i$ is the predicted score of the preliminary recommendation result for the user's interest point $i$, $t_i$ is the interval from the last behavior record of that interest point to the recommendation time, and $h(t_i)$ is the user's memory degree for interest point $i$. Finally, according to the computed values in the sorted list, the recommendation result of the interest point with the highest value is provided to the target user, realizing personalized recommendation that accounts for the time variation, multi-dimensionality, and discreteness of dynamic user interest.
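The final ranking in step C can be sketched in Python as follows. The memory degree follows Formula 3 and the ranking follows Formula 4. Formula 2 does not appear in this text, so the weight used here (each interest point's share of the user's behavior records) is an assumption, as are the class name, field names, and sample values.

```python
import math
from dataclasses import dataclass

@dataclass
class InterestPoint:
    name: str
    record_count: int          # len(s_i): the user's behavior records in this interest point
    months_since_last: float   # t_i, in months
    predicted_score: float     # p_i, from the preliminary collaborative filtering step

def memory_degree(t):
    """Formula 3: h(t) = e^{-t}, with t in months."""
    return math.exp(-t)

def interest_weights(points):
    """ASSUMPTION: lambda_i = len(s_i) / sum_j len(s_j); Formula 2 is not in the text."""
    total = sum(p.record_count for p in points)
    return [p.record_count / total for p in points]

def rank_interest_points(points):
    """Formula 4: sort interest points by p_i * lambda_i * h(t_i), highest first."""
    weights = interest_weights(points)
    scored = [
        (p.name, p.predicted_score * w * memory_degree(p.months_since_last))
        for p, w in zip(points, weights)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Hypothetical target user with three interest points.
points = [
    InterestPoint("travel", record_count=6, months_since_last=2.0, predicted_score=4.8),
    InterestPoint("work", record_count=3, months_since_last=0.0, predicted_score=4.0),
    InterestPoint("music", record_count=1, months_since_last=0.5, predicted_score=4.5),
]
ranking = rank_interest_points(points)
```

In this made-up case "travel" has the most records and the highest predicted score, but its last record is two months old, so the memory degree pulls it below "work", whose last record coincides with the recommendation time.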
By fragmenting user behavior data, the method organizes disordered behavior data into fragments by transaction session, and solves the problems of extracting topic words from behavior data and merging similar behavior data. It then clusters the user behavior data fragments, classifying the fragments of all users by the interest topics they contain, mining the sets of interest points shared by users with similar behavior, and building the public user interest model, which improves the accuracy with which dynamic user interest is characterized. By analyzing the target user's behavior data, the method obtains the user's interest point set from the public user dynamic interest model and applies collaborative filtering to each of the target user's interest points separately to produce preliminary recommendations. It then weights the predicted score of each interest point's preliminary recommendation by the interest weight and memory degree, sorts the resulting values, and provides the recommendation result of the interest point with the highest value to the target user, thereby addressing the difficulty of personalized recommendation when user interest is time-varying, multi-dimensional, and discrete.
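The clustering of behavior data fragments in steps B1 and B2 can be sketched in Python as below. Formula 1 is not reproduced in this text, so the Jaccard ratio over (word, part of speech) pairs is an assumption, chosen because the text refers to the intersection of topic word sets and requires matching parts of speech. Reading "average similarity" as the mean of the two merged categories' similarities to each remaining category is also an interpretation, and all names and sample data are hypothetical.

```python
from itertools import combinations

def topic_similarity(a, b):
    """ASSUMPTION: Jaccard ratio |a & b| / |a | b| over (word, part_of_speech) pairs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_fragments(fragments, target_count):
    """Repeatedly merge the most similar pair until target_count categories remain."""
    clusters = {i: [i] for i in range(len(fragments))}
    sim = {
        frozenset((i, j)): topic_similarity(fragments[i], fragments[j])
        for i, j in combinations(clusters, 2)
    }
    next_id = len(fragments)
    while len(clusters) > max(target_count, 1):
        a, b = tuple(max(sim, key=sim.get))
        merged = clusters.pop(a) + clusters.pop(b)
        # Similarity of the new category to each remaining category:
        # the mean of the two merged categories' similarities to it.
        new_sims = {
            k: (sim[frozenset((a, k))] + sim[frozenset((b, k))]) / 2
            for k in clusters
        }
        sim = {pair: s for pair, s in sim.items() if a not in pair and b not in pair}
        for k, s in new_sims.items():
            sim[frozenset((next_id, k))] = s
        clusters[next_id] = merged
        next_id += 1
    return [sorted(members) for members in clusters.values()]

# Hypothetical topic word sets of four behavior data fragments.
fragments = [
    {("tent", "n"), ("hiking", "n")},
    {("tent", "n"), ("stove", "n")},
    {("laptop", "n"), ("monitor", "n")},
    {("laptop", "n"), ("keyboard", "n")},
]
categories = cluster_fragments(fragments, target_count=2)
```

With two target categories the camping fragments (0 and 1) end up together and the computer fragments (2 and 3) end up together. Topic words for step B3 would then be drawn from each resulting category.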
Like traditional personalized recommendation methods for dynamic user interest, the present invention mines users' dynamic interest points by analyzing user behavior data and builds a dynamic user interest model, and it uses collaborative filtering to recommend against user interest, producing personalized recommendations based on the user's dynamic interest. The difference is that existing methods consider only the dynamic time variation of user interest, whereas the present invention also mines multi-dimensional, discrete user interest points from the behavior data, and so characterizes the user's dynamic interest model more accurately. For the multi-dimensional, discrete interest topics of the target user, the invention also produces preliminary recommendations for the user's interest points concurrently, then weights the predicted score of each preliminary recommendation by the weight and memory degree of its interest point and recommends from the interest point with the highest computed value, improving the accuracy and processing performance of personalized recommendation.
The beneficial effects of the invention are as follows. By applying fragmentation and cluster analysis to the behavior data of users accessing the e-commerce platform, the method handles the time variation, multi-dimensionality, and discreteness of the interest topics contained in that data and can accurately characterize the user's dynamic interest model, providing a basis for accurate personalized recommendation. For the multi-dimensional interest points extracted by cluster analysis, the method produces preliminary recommendations separately and then combines the target user's current interest point weights, memory degrees, and predicted scores of the preliminary recommendation results into a combined recommendation, making the results more accurate. Moreover, compared with existing dynamic-interest personalized recommendation methods, the invention fragments and merges user behavior data, which reduces the cost of the subsequent cluster analysis. Likewise, running personalized recommendation in parallel for each of the user's multi-dimensional interest points extracted by cluster analysis can improve the processing performance of the recommendation system.
Brief description of the drawings
Fig. 1 is a system structure diagram of the method model of the invention;
Fig. 2 is an overall flow chart of the processing performed by the method model of the invention;
Fig. 3 is a flow chart of user behavior data fragmentation;
Fig. 4 is a flow chart of user behavior data cluster analysis;
Fig. 5 is a flow chart of recommendation for the target user;
Fig. 6 is a schematic diagram of the user behavior data collection process;
Fig. 7 is a schematic diagram of the user behavior data fragmentation principle;
Fig. 8 is a schematic diagram of the cluster analysis process for user behavior data fragments.
Detailed description
The invention is described in detail below with reference to the drawings and embodiments.
To improve the recommendation accuracy of a personalized recommendation system, the time variation, multi-dimensionality, and discreteness of dynamic user interest must be considered together. To collect large volumes of user behavior data effectively and make them convenient to analyze, the invention takes each transaction session in which a user accesses the e-commerce platform as a unit and organizes the access operations the user performs in that session into one behavior data fragment; every user's behavior data is fragmented in this way. Because each of a user's transaction behavior data fragments embodies some interest or intent, the invention extracts the topic words of each behavior data fragment so that the large collection of fragments can be analyzed. To classify a user's behavior data fragments along different dimensions, the invention merges the user's fragments by similar topic words and retains only the fragments with distinct topics. After the behavior data fragments of each individual user have been obtained, the fragments of all users must also be clustered to extract the sets of fragments that share similar interest topics across users. The behavior data in each set indicates that those users share the same interest topic. In this way the multi-dimensional interest topics that all users have are mined, and the public user interest model is built from these interest topics to support personalized recommendation. In addition, by continually analyzing user behavior data, new interest topics are added to the user interest model so that the model is updated dynamically. When performing personalized recommendation for a target user, the set of topics the target user is interested in is found first; collaborative filtering is then run on the purchase data and rating data of the users covered by all behavior fragments in each interest topic, producing a personalized recommendation for each of the target user's interest topics separately. Finally, the target user's current interest point weights, memory degrees, and predicted scores of the preliminary recommendation results are combined by weighting and sorting, and the recommendation result of the interest point with the highest value is provided to the target user. The specific processing steps are as follows:
1. Fragment the user behavior data. The invention defines one complete transaction session of a user as one behavior data fragment. Fragments are bounded by the creation and destruction of a session, and the user's operation data within that period forms one behavior data fragment.
2. Merge each user's many behavior data fragments by similar topic. First, extract the topic word set (one or more words) of each behavior data fragment and tag each topic word with its part of speech based on the content the user browsed, which resolves the merging problem caused by polysemy. Second, compare the similarity between the topic words of the behavior data fragments and merge highly similar fragments, which reduces the cost of the subsequent cluster analysis. The result is the user's set of behavior data fragments with multi-dimensional topics.
3. Mine users' potential interest points by cluster analysis and build the public user dynamic interest model. Because behavior data fragments with similar topic words necessarily embody similar interest topics, the invention clusters the feature vectors of all users' behavior data fragments to find the sets of fragments with similar topic words, thereby extracting users' multi-dimensional, discrete interest points and building the user interest model from them. First, extract the TF-IDF value (topic word weight) and part-of-speech information of the topic words in each behavior data fragment and generate a feature vector for each fragment. Second, compute the similarity between fragments from the feature vectors and cluster the fragments iteratively with a bottom-up hierarchical clustering algorithm to obtain the sets of fragments with similar interests. Then, extract the topic words with higher TF-IDF values across all fragments in each set to obtain the interest topics of all users, and build the public user interest model from them.
4. Run the personalized recommendation algorithm for each of the target user's interest points. First, analyze the target user's behavior data fragments and find all of the user's interest points in the public user interest model. Then run collaborative filtering for each interest point in parallel, producing a preliminary personalized recommendation result for each of the user's interest points.
5. Weight and sort by the target user's current interest point weights, memory degrees, and scores, and provide the recommendation result of the interest point with the highest computed value to the target user. First, compute the weight, the memory degree, and the predicted score of the recommendation result for each of the target user's interest points, and combine them by weighting. Then sort the weighted values of the interest points and select the recommendation result of the interest point with the highest weighted value as the final recommendation.
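Steps 1 and 2 above can be sketched in Python as follows. The record fields mirror the basic attributes listed in step A1 (session ID, user ID, behavior type, behavior content, time), but the dictionary layout, the Jaccard overlap measure, the greedy merge order, and the threshold are all illustrative assumptions rather than details given in the text.

```python
from collections import defaultdict

def shard_by_session(records):
    """Group behavior records into fragments keyed by (user_id, session_id)."""
    fragments = defaultdict(list)
    for record in records:
        fragments[(record["user_id"], record["session_id"])].append(record)
    return dict(fragments)

def merge_similar(topic_sets, threshold):
    """Greedily merge one user's fragments whose topic word sets overlap enough.

    ASSUMPTIONS: Jaccard overlap and the threshold value are illustrative choices.
    """
    merged = []
    for topics in topic_sets:
        for existing in merged:
            union = existing | topics
            if union and len(existing & topics) / len(union) >= threshold:
                existing |= topics  # in-place update of the already kept fragment
                break
        else:
            merged.append(set(topics))
    return merged

# Hypothetical behavior records for one user across two transaction sessions.
records = [
    {"session_id": "s1", "user_id": "u1", "behavior_type": "search", "content": "tent", "time": 1},
    {"session_id": "s1", "user_id": "u1", "behavior_type": "browse", "content": "tent", "time": 2},
    {"session_id": "s2", "user_id": "u1", "behavior_type": "search", "content": "laptop", "time": 3},
]
fragments = shard_by_session(records)
merged = merge_similar(
    [{"tent", "stove"}, {"tent", "stove", "lamp"}, {"laptop"}], threshold=0.5
)
```

Here the two camping-related topic sets collapse into one fragment and the laptop fragment stays separate, so the later cluster analysis has fewer fragments to compare.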
As shown in Fig. 1, the method model involves four parts: the e-commerce platform, the behavior data collection module, the user interest mining module, and the system recommendation module. The e-commerce platform is the application base of the recommendation system. Besides providing e-commerce services to customers, it records in its databases and log system the operations users perform on the platform, such as searching for, browsing, purchasing, and reviewing goods. The behavior data collection module collects the relevant user behavior data and user rating data from the customer database, the log system, and the product database. The user interest mining module fragments the user behavior data, extracts the feature vector of each behavior data fragment, and clusters on these vectors to mine users' multi-dimensional, discrete interest points, thereby building the public user interest model of the e-commerce platform. The recommendation module analyzes the target user's behavior data, extracts the target user's interest point set from the public user interest model, and uses collaborative filtering to provide personalized recommendations to the target user on behalf of the e-commerce platform.
In the method model, the user interest mining module consists mainly of processing units for user behavior data fragmentation, behavior data fragment feature vector extraction, behavior data fragment similarity computation, and behavior data fragment cluster analysis. The fragmentation unit fragments behavior data by user transaction session and merges each user's fragments by similar topic words, yielding a set of behavior data fragments with distinct topics. The feature vector extraction unit extracts the TF-IDF value of the topic words in each behavior data fragment and arranges the topic words and their TF-IDF values in Chinese vocabulary order to generate the fragment's feature vector. The feature vector represents the features of a behavior data fragment and is used to compute the similarity between fragments. The similarity computation unit performs two kinds of computation. The first computes similarity among all behavior data fragments of a single user, and is used to merge fragments with similar topic words. The second computes similarity between the fragment feature vectors of all users on the platform, and supplies the similarity measure between fragments to the cluster analysis unit. The cluster analysis unit clusters the behavior data fragments of all users and mines a group of fragment sets with distinct topics. The fragments in each topic's set share similar interest points, and the public user interest model of the e-commerce platform is built from them. The system recommendation module analyzes the target user's behavior data and extracts the target user's interest point set from the public user interest model. It then runs the collaborative filtering recommendation algorithm for each of the target user's interest points to generate preliminary recommendation results. Finally, it weights the predicted score of each preliminary recommendation result by the weight and memory degree of the corresponding interest point and selects the recommendation result of the interest point with the highest computed value as the final personalized recommendation.
As shown in Fig. 2, the personalized recommendation method of the present invention is divided into two subprocesses: building the public user interest model and recommending to the target user. The public user interest model is built in four steps. First, while users access the e-commerce platform, the platform system automatically records each user's operation data. Next, the user behavior acquisition module collects each user's behavior data from the operational database and the log database and organizes it into a per-user behavior data list. Then each user's behavior data is sliced by transaction session into a number of behavior data fragments carrying topic words, and fragments of the same user that share the same topic words are merged. Finally, the behavior data fragments of all users are clustered into a number of fragment sets with distinct topic words; each fragment set embodies a similar interest topic, and the interest model of the public users is built from these sets.
Personalized recommendation for the target user is completed in three steps. First, the target user's behavior data fragments are analyzed, and all of the target user's interest points are found from the public user interest model. Next, a collaborative filtering algorithm generates preliminary recommendation results for each of the user's interest points. Finally, each interest point's weight, memory retention, and the predicted score of its preliminary recommendation results are combined by weighting, and the recommendation results of the interest point with the highest computed value are provided to the target user, completing the personalized recommendation.
The processing methods of the key modules of the present invention are described below.
1. User behavior data collection
User behavior data is the data foundation of personalized recommendation. The method of the invention collects not only the user's retrieval and browsing behavior data but also purchase and rating behavior data. Retrieval and browsing data are mainly used to mine the user's dynamic, multidimensional, and discrete interest points, while purchase and rating data are used for product recommendation in collaborative filtering. The data is collected mainly from the user database, the product database, and the log system, and covers five behavior types: login, retrieval, browsing, purchase, and rating. Each collected behavior record must contain the session ID, user ID, product ID, behavior type, and behavior content, as well as attributes such as timestamp, browsing terminal, and location. The records are sorted by session ID to produce a user behavior data list, which simplifies the subsequent slicing and cluster analysis.
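A minimal sketch of one collected behavior record and the session-ordered list, written in Python. The class name `BehaviorRecord`, its field names, and the behavior-type labels are hypothetical; the text only lists which attributes a record must carry. Ordering by timestamp within a session is an added assumption.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class BehaviorRecord:
    """One collected behavior record (field names are illustrative assumptions)."""
    session_id: str
    user_id: str
    product_id: str
    behavior_type: str  # e.g. "login", "retrieval", "browse", "purchase", "rating"
    content: str
    timestamp: datetime
    terminal: str
    location: str


def build_behavior_list(records):
    """Return the records ordered by session ID, then by time within a session."""
    return sorted(records, key=lambda r: (r.session_id, r.timestamp))
```

Keeping the records of one session adjacent means the later slicing step only has to cut the list where the session ID changes.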
2. User behavior data slicing
On an e-commerce platform, each user session has a fairly clear purpose, so the user's operations within a session embody a particular interest topic. The invention therefore slices user behavior data with one transaction session as the unit. To support efficient cluster analysis of the fragments, this unit also extracts topic words from the content of each fragment and then merges fragments that have similar topic words. The process is shown in Fig. 3.
As Fig. 3 shows, user behavior data slicing consists of the following main steps:
1) Read the behavior data of a single user from the acquisition module's database, including retrieval, browsing, purchase, rating, and login behavior data.
2) Define each complete transaction session of the user as one behavior data fragment. Specifically, the sequence of operations performed between the creation and the destruction of a user session forms one behavior data fragment.
3) Extract the topic words of the user's retrieval and browsing information from the content of each fragment to generate the fragment's topic word set. Each topic word is assigned a part of speech according to the browsed content, which resolves polysemy in natural language.
4) Merge the fragments of a single user that have similar topic word sets. The topic word sets of the fragments are compared, and fragments with high similarity are merged, giving a group of fragments for this user with distinct topic words.
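Steps 1) and 2) can be sketched as follows, assuming each record has been reduced to a hypothetical `(user_id, session_id, action)` tuple and that one session ID covers exactly one transaction session from creation to destruction. The function name and the tuple layout are illustrative only.

```python
from collections import defaultdict


def slice_by_session(records):
    """Group (user_id, session_id, action) tuples into per-user, per-session fragments."""
    fragments = defaultdict(lambda: defaultdict(list))
    for user_id, session_id, action in records:
        fragments[user_id][session_id].append(action)
    # Convert the nested defaultdicts into plain dicts for the caller.
    return {user: dict(sessions) for user, sessions in fragments.items()}
```

Each inner list is one behavior data fragment; topic word extraction and merging then operate on these lists.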
3. User behavior cluster analysis
User behavior cluster analysis processes the behavior data fragments of all users and mines the interest topics of the public users from them. It comprises two processing units: fragment feature vector extraction and fragment cluster analysis. Because fragments with similar topic words necessarily embody similar interest topics, this module computes the similarity of the topic words of the fragments and uses cluster analysis to mine sets of fragments with similar topic words, thereby extracting the users' multidimensional, discrete interest points and building the user interest model from them. The process is shown in Fig. 4:
1) Extract the feature vector of each behavior data fragment. From the topic word sets produced by slicing, compute the tf-idf value of each topic word; this value measures the topic word's importance. Arranging the topic words and their tf-idf values in Chinese vocabulary order yields the fragment's feature vector.
2) Compute the similarity between fragments. A vector space model is built with each topic word of the feature vectors as one dimension. When two feature vectors are orthogonal, the fragment similarity is 0; when they coincide, it is 100%. The similarity between fragments can therefore be computed with the cosine formula, and the resulting cosine value is the similarity between the two fragments.
3) Run the hierarchical clustering program on the fragments. The invention uses a bottom-up hierarchical clustering algorithm that repeatedly merges the two most similar fragment classes until all fragments have been clustered. Initially, each fragment is its own class. The two classes with the highest similarity are then merged into one, and this is iterated until the specified number of classes is reached.
4) Select the best clustering level to determine the clustering result. Bottom-up hierarchical clustering ultimately produces a tree-shaped result. The optimal clustering is determined by finding the level at which the inter-class similarity changes most, giving sets of user behavior data fragments with multidimensional, discrete interest topics.
5) Build the public user interest model from the clustering result. From each fragment set in the clustering result, extract the topic word with the highest tf-idf value among the fragments it contains to obtain a user interest topic (interest point). Organizing these multidimensional, discrete interest topics together with their fragment sets yields the public user interest model.
4. Personalized recommendation for the target user
When users access an e-commerce platform, their interests are time-varying, multidimensional, and discrete; that is, a user's different interests can be far apart. Personalized recommendation precision can therefore be improved effectively only by recommending separately for each of the user's interest points. First, personalized recommendation is run separately for each interest point of the target user to generate preliminary recommendation results. Then each interest point's weight, memory retention, and the predicted score of its preliminary results are combined by weighting, and the results of the interest point with the highest computed value are provided to the target user. The processing flow is shown in Fig. 5.
Fig. 5 is the flow chart of personalized recommendation for the target user. Its steps are as follows:
1) Find the target user's interest points. Based on the public user interest model, analyze the target user's behavior data fragments to find the user's set of interest points.
2) For each interest point of the target user, run the collaborative filtering recommendation algorithm separately to generate that interest point's preliminary recommendation results.
3) Rank the target user's interest points. Combine each interest point's weight, memory retention, and predicted score of the preliminary results by weighting, and sort the resulting values. The weighted values and their order reflect how interested the target user currently is in each interest point.
4) Generate the personalized recommendation result. From the list of weighted values, choose the recommendation results of the interest point with the highest value as the final recommendation.
Embodiment:
1. User behavior data collection
Unlike traditional personalized recommendation systems, which collect only purchase and rating data, this example also collects the user's retrieval and browsing behavior data. Each collected behavior record must contain the session ID, user ID, product ID, behavior type, and behavior content, as well as attributes such as timestamp, browsing terminal, and location. These basic attributes support the slicing of user behavior data in the next step. The collection process is shown in Fig. 6.
As shown in Fig. 6, the user behavior data acquisition module first collects login, retrieval, browsing, purchase, and rating behavior data from the e-commerce platform's user database, product database, and log system. Every behavior record contains basic attribute information (such as session ID, user ID, behavior type, behavior content, user IP, login device, and time). To keep the collected behavior data complete, the platform creates a session when the user starts accessing the system and destroys the session information after the user exits. After the user logs in to the platform, the session ID is associated with that user's behavior data list. The login behavior data carries the user's session information and can be used for slicing the behavior data. Retrieval and browsing data are mainly used to mine the user's dynamic, multidimensional, and discrete interest points in order to build the public user interest model, while purchase and rating data are used for personalized recommendation.
2. User behavior data slicing
Each transactional visit a user makes to the e-commerce platform mostly has a purpose driven by an interest; that is, all operations within the transaction share the same interest topic. Slicing user behavior data with each transaction session as the unit therefore makes each fragment embody one topic. All fragments of each user are then merged according to the similar topic words they contain, which improves the performance of subsequent interest mining. The principle of slicing is shown in Fig. 7.
1) Read each user's behavior data from the user behavior database and organize it into a behavior data list by user ID. Besides the basic attribute data, the data read also includes the operation data related to each behavior.
2) With a single transaction session as the unit, group the behavior data within the transaction into one behavior data fragment. Specifically, the creation and destruction of the user's session mark the boundaries, and all of the user's behavior data within that period forms one fragment. Invalid data (for example, logging in and then exiting immediately) is filtered out to reduce noise in the behavior data.
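The noise filter in step 2) might look like the sketch below. It assumes a fragment is a list of hypothetical `(behavior_type, content)` pairs and treats a fragment as invalid when it contains nothing but login or logout actions; the labels in `NOISE_ACTIONS` are assumptions, not names from the text.

```python
NOISE_ACTIONS = {"login", "logout"}  # hypothetical behavior-type labels


def drop_invalid_fragments(fragments):
    """Keep only fragments with at least one action other than login/logout."""
    return [
        fragment
        for fragment in fragments
        if any(behavior_type not in NOISE_ACTIONS for behavior_type, _ in fragment)
    ]
```

An empty fragment is dropped as well, since it carries no topic information.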
3) Extract the topic words of each behavior data fragment. For retrieval behavior, the topic words of the fragment are the search keywords. For browsing and purchase behavior, the invention extracts the latent topic words of the behavior content as follows. First, a Chinese word segmentation module segments the text of the behavior content and filters out meaningless function words, giving the set of words the content contains. Then the tf-idf algorithm computes the importance of each word. Let $t_i$ be the number of occurrences of word $i$ and $t$ the total number of occurrences of all words; the tf-idf value of word $i$ is then given by Formula 5:
$\mathrm{tfidf}_i = \frac{t_i}{t} \times \log\frac{d}{d_i}$ (Formula 5)
Here $t_i/t$ is the term frequency (tf) of word $i$ in the detailed description of the browsed content, and $\log(d/d_i)$ is the inverse document frequency (idf) of word $i$, where $d$ is the total number of products and $d_i$ is the number of product descriptions in which word $i$ appears. The product of tf and idf gives the importance of each word, and the several words with the highest importance are selected as the latent topic word set of the browsed or purchased content. To further improve the precision of the extracted topic words, each topic word is also tagged according to the attributes (category, use, and so on) of the content the user browsed or purchased, which resolves polysemy. Let $k$ be a topic word of a behavior data fragment and $s$ the browsed-content information of that topic word; a fragment topic word set with $n$ topic words can then be written as $(k_1\langle s_1\rangle, k_2\langle s_2\rangle, \ldots, k_i\langle s_i\rangle, \ldots, k_n\langle s_n\rangle)$.
4) Merge fragments that have similar topic words. The similarity between the topic word sets of the fragments is computed, and two fragments are merged when their similarity exceeds a threshold (for example, 80%). The similarity between fragments can be computed with the set cosine similarity. Given two behavior data fragments $u_i$ and $u_j$, the similarity $s(u_i,u_j)$ of their topic word sets is computed by Formula 6.
Here $s(u_i,u_j)$ is the similarity between fragments $u_i$ and $u_j$, and $v(u_i)$ and $v(u_j)$ are their topic word sets. When the intersection of the topic word sets is computed, two retrieved topic words are considered identical only if the words are the same and have the same part of speech. Merging fragments whose topic word sets are highly similar in this way reduces the volume of data for subsequent analysis and improves the execution performance of interest mining.
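The merge in step 4) can be sketched as follows. Formula 6 is not reproduced in this text, so the sketch assumes the usual set cosine form $|v(u_i)\cap v(u_j)| / \sqrt{|v(u_i)|\cdot|v(u_j)|}$. Topic words are modeled as hypothetical `(word, tag)` tuples so that two entries match only when both the word and its tag agree. The greedy single pass and the function names are illustrative choices, not the patent's stated procedure.

```python
import math


def set_cosine(a, b):
    """Set cosine similarity of two topic word sets (assumed form of Formula 6)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))


def merge_similar(fragments, threshold=0.8):
    """Greedily merge topic word sets whose similarity exceeds the threshold."""
    merged = []
    for topics in fragments:
        for index, existing in enumerate(merged):
            if set_cosine(existing, topics) > threshold:
                merged[index] = existing | topics  # union of the two topic word sets
                break
        else:
            merged.append(set(topics))
    return merged
```

Because the tag is part of each tuple, `("apple", "fruit")` and `("apple", "phone")` never count toward the intersection, which is how the part-of-speech condition is enforced here.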
3. Implementation of user behavior data cluster analysis
Because fragments with similar topic word sets embody similar interest points, the purpose of this module is to analyze the topic word information of all user behavior data fragments with clustering, group the fragments with similar topic word feature vectors into fragment sets that share similar interest points, extract the users' multidimensional, discrete set of interest points, and build the public user interest model from it.
Before the hierarchical clustering algorithm is run, the feature vector of each behavior data fragment must be computed. First, the invention takes the topic word set of each fragment and its part-of-speech information from the slicing process. Second, it computes the frequency (tf) with which each topic word occurs in each fragment, in the same way as the tf term of Formula 5. Third, it computes the inverse document frequency (idf) of each topic word as $\log(d/d_i)$, where $d$ is the number of behavior data fragments of all users and $d_i$ is the number of times topic word $i$ occurs across all behavior data fragments. Multiplying each topic word's tf by its idf gives its tf-idf value. Finally, the topic words and their tf-idf values are arranged in the order of a common vocabulary table, giving the feature vector of each fragment. This feature vector reflects the interest features of the behavior data fragment.
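A minimal sketch of the feature vector construction, under three stated assumptions: each fragment is a list of topic words, $d_i$ is taken as the number of fragments that contain topic word $i$, and Python's default string ordering stands in for the vocabulary table order. The natural logarithm is used because the text does not fix a base.

```python
import math
from collections import Counter


def fragment_vectors(fragments):
    """Return (vocabulary, vectors): one tf-idf vector per fragment on a shared vocabulary."""
    vocabulary = sorted({word for fragment in fragments for word in fragment})
    d = len(fragments)
    # Assumption: d_i counts the fragments in which the topic word appears.
    doc_freq = Counter(word for fragment in fragments for word in set(fragment))
    vectors = []
    for fragment in fragments:
        counts = Counter(fragment)
        total = len(fragment)
        vectors.append([
            (counts[word] / total) * math.log(d / doc_freq[word]) if word in counts else 0.0
            for word in vocabulary
        ])
    return vocabulary, vectors
```

Aligning every vector on one shared vocabulary pads missing topic words with zeros, so all vectors have equal length before similarity is computed.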
Once the feature vector of each fragment has been obtained, the bottom-up hierarchical clustering algorithm is run to complete the cluster analysis. The process is illustrated in Fig. 8.
First, each behavior data fragment is treated as its own class; in Fig. 8, for example, there are 30 fragments and each is one class. Then the similarity between classes is computed from the fragments' feature vectors, and the two classes with the highest similarity are merged into one. When the two classes contain multiple fragments, the average similarity between the fragments of the two classes is used as the similarity of the two classes. This is iterated until the specified number of classes is reached, finally producing the tree-shaped clustering result in Fig. 8.
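The bottom-up procedure with average similarity between classes can be sketched as below. The sketch assumes a precomputed symmetric similarity matrix `sim` (a list of lists, indexed by fragment) and a caller-supplied target class count; the function names are illustrative, and the exhaustive pair search is chosen for clarity rather than speed.

```python
from itertools import combinations


def average_link(sim, a, b):
    """Average pairwise similarity between the fragments of classes a and b."""
    return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))


def agglomerate(sim, target_classes):
    """Merge the two most similar classes until target_classes remain."""
    clusters = [[i] for i in range(len(sim))]  # each fragment starts as its own class
    while len(clusters) > max(target_classes, 1):
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_link(sim, clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[a] = clusters[a] + clusters[b]  # a < b, so deleting b keeps index a valid
        del clusters[b]
    return clusters
```

For four fragments where 0 and 1 are similar and 2 and 3 are similar, asking for two classes groups them as `[[0, 1], [2, 3]]`.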
The invention measures the similarity between fragment feature vectors with the cosine formula. Let $(x_1,x_2,\ldots,x_n)$ and $(y_1,y_2,\ldots,y_n)$ be the feature vectors of fragments $x$ and $y$ (note: missing topic words can be padded with zeros to resolve feature vectors of unequal length); the similarity $\cos\theta$ of $x$ and $y$ is then given by Formula 7:
$\cos\theta = \frac{\sum_{k=1}^{n} x_k y_k}{\sqrt{\sum_{k=1}^{n} x_k^2}\,\sqrt{\sum_{k=1}^{n} y_k^2}}$ (Formula 7)
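A short sketch of the cosine measure with zero padding, assuming each feature vector is stored sparsely as a hypothetical `{topic_word: tf-idf}` dictionary. A topic word missing from one vector contributes zero, which has the same effect as padding the shorter vector.

```python
import math


def cosine_similarity(x, y):
    """Cosine of two sparse vectors given as {topic_word: tf-idf} dicts."""
    dot = sum(value * y.get(word, 0.0) for word, value in x.items())
    norm_x = math.sqrt(sum(value * value for value in x.values()))
    norm_y = math.sqrt(sum(value * value for value in y.values()))
    if norm_x == 0.0 or norm_y == 0.0:
        return 0.0  # an all-zero vector is treated as similar to nothing
    return dot / (norm_x * norm_y)
```

Vectors with no topic word in common are orthogonal and score 0, while identical vectors score 1.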
After the bottom-up hierarchical clustering algorithm has produced the tree-shaped result, the level to take as the final clustering result must still be chosen. Research has found that after two classes with the same topic are merged, the similarity of the merged class to the other classes does not change much, whereas merging two classes with different topics markedly lowers the similarity to the other classes. The level just before the merge that causes the largest change in inter-class similarity is therefore taken as the best clustering result.
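One possible way to locate that level is sketched below. It is an interpretation, not the patent's stated procedure: it assumes the similarity at which each successive merge happened has been recorded, and it uses the largest drop between consecutive merge similarities as a proxy for the largest change in inter-class similarity. The function name `best_cut` is hypothetical.

```python
def best_cut(merge_similarities):
    """Return how many merges to keep, stopping before the largest similarity drop."""
    if len(merge_similarities) < 2:
        return len(merge_similarities)
    drops = [
        merge_similarities[k - 1] - merge_similarities[k]
        for k in range(1, len(merge_similarities))
    ]
    return drops.index(max(drops)) + 1
```

With merge similarities `[0.9, 0.85, 0.8, 0.3, 0.1]` the sharpest drop comes before the fourth merge, so the first three merges are kept; starting from six fragments, that leaves three classes.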
This cluster analysis yields a group of user behavior data fragment sets with similar interests. The topic word with the highest tf-idf value in each set is extracted as its interest topic, giving the interest topic set of the public users. Organizing these interest topics together with their fragment sets establishes the interest model of the public users.
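A minimal sketch of assembling the model from the clustering result, assuming each cluster is a list of fragments and each fragment is a hypothetical `{topic_word: tf-idf}` dictionary. Representing the model as a plain dictionary from interest topic to fragment list is an illustrative choice.

```python
def build_interest_model(clusters):
    """Map each cluster's highest-tf-idf topic word to the fragments of that cluster."""
    model = {}
    for fragments in clusters:
        candidates = [
            (word, score) for fragment in fragments for word, score in fragment.items()
        ]
        if not candidates:
            continue  # a cluster without topic words yields no interest topic
        best_word, _ = max(candidates, key=lambda pair: pair[1])
        # Clusters that share the same top topic word are pooled under one entry.
        model.setdefault(best_word, []).extend(fragments)
    return model
```

Looking up a target user's fragments against the keys of this dictionary is then one way to find that user's interest points.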
4. Implementation of personalized recommendation for the target user
When users access an e-commerce platform, their interests are time-varying, multidimensional, and discrete. For this situation, the method recommends separately for each interest point of the target user, which gives higher precision than personalized recommendation based on all of the user's behavior data together. However, because the user likes each interest point to a different degree, the weight of each interest point's recommendation results also differs. For example, suppose interest point $s_a$ of a user contains 1000 user behaviors while another interest point $s_b$ contains only 20. Even if the predicted score of recommendation result a in $s_a$ is slightly lower than that of recommendation result b in $s_b$, the user is likely to like a far more than product b. Conversely, if the latest behavior record in $s_a$ is much older than that in $s_b$, the user may well like product b more than product a, because the user's interest may already have changed. How to compute the weight of each interest point, weight each interest point's recommendation results proportionally, and sort them into the final personalized recommendation list is therefore the key to accurate personalized recommendation.
Starting from the user interest model, the invention finds the target user's latent interest points and extracts all user rating data related to these interest topics. It then runs a user-based collaborative filtering algorithm to obtain the preliminary recommendation results of each interest point and their predicted scores $p_i$. The more interested a user is in an interest point, the more operations related to it the user performs, so the invention computes the weight $\lambda_i$ of each interest point of the target user from the number of user behavior records in that interest point. Let $s_i$ be the $i$-th interest point of the target user and $\mathrm{len}(s_i)$ the number of behavior records it contains; the weight $\lambda_i$ of interest point $s_i$ is then computed as shown in Formula 8.
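Formula 8 is not reproduced in this text. The sketch below assumes the natural reading of the description, namely that each weight is the interest point's share of all behavior records, $\lambda_i = \mathrm{len}(s_i) / \sum_{j=1}^{n} \mathrm{len}(s_j)$. Both this form and the function name are assumptions.

```python
def interest_weights(record_counts):
    """Weight of each interest point as its share of all behavior records (assumed form)."""
    total = sum(record_counts.values())
    if total == 0:
        return {name: 0.0 for name in record_counts}
    return {name: count / total for name, count in record_counts.items()}
```

For the example in the text, an interest point with 1000 records against one with 20 receives a weight of about 0.98, and the weights sum to 1.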
This method gives the weights of the target user's interest points, but user interest is also dynamic and time-varying: the older an interest point is, the lower the user's interest in it. With reference to the forgetting curve obtained by the German psychologist Hermann Ebbinghaus, user interest points are found to follow the same forgetting-curve pattern. The invention therefore applies a forgetting function $h(t)$ to the interest point weight $\lambda_i$ according to the forgetting pattern of user interest points. Let $t$ be the time interval from the last behavior record in an interest point to the time of recommendation; the memory retention of the interest point is then computed as shown in Formula 9.
$h(t) = e^{-t}$ (Formula 9)
Here $t$ is measured in months. When the last behavior record occurs at the time of recommendation, the interval is 0 and $h(0)=1$, meaning the user has not yet begun to forget this interest. Finally, the preliminary recommendation results of the user's interest points are weighted and sorted to obtain the ranked interest point list $P$. If the target user has $n$ interest points, the ranked list $P$ is computed by Formula 10.
$P = \mathrm{sort}\big(p_1\lambda_1 h(t_1),\ p_2\lambda_2 h(t_2),\ p_3\lambda_3 h(t_3),\ \ldots,\ p_i\lambda_i h(t_i),\ \ldots,\ p_n\lambda_n h(t_n)\big)$ (Formula 10)
In Formula 10, the sort() function weights the predicted score of the preliminary recommendation results, the interest weight, and the memory retention of each interest point of the target user and sorts the resulting values. $p_i$ is the predicted score of the preliminary recommendation results of the user's $i$-th interest point, $t_i$ is the interval from that interest point's last behavior record to the time of recommendation, and $h(t_i)$ is the user's memory retention for interest point $i$. Finally, according to the computed values in the ranked list, the recommendation results of the interest point with the highest value are provided to the target user, realizing personalized recommendation that accounts for the time-varying, multidimensional, and discrete nature of user interest.
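Formulas 9 and 10 can be sketched together as follows. The tuple layout `(name, predicted_score, weight, months_since_last_record)` and the function names are assumptions made for illustration; the retention function and the product being sorted follow the formulas as stated.

```python
import math


def retention(months):
    """Forgetting function h(t) = e^(-t), with t in months (Formula 9)."""
    return math.exp(-months)


def rank_interest_points(points):
    """Sort interest points by p_i * lambda_i * h(t_i), highest first (Formula 10).

    points: list of (name, predicted_score, weight, months_since_last_record).
    """
    scored = [(name, p * lam * retention(t)) for name, p, lam, t in points]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

The first entry of the returned list identifies the interest point whose recommendation results are shown to the target user. A heavily used interest point can still lose first place once its last record is several months old, because the exponential decay shrinks its score quickly.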
Claims (1)
1. A modeling recommendation method based on user behavior data slicing and clustering, characterized in that it comprises:
A. Customized processing of user behavior data, specifically comprising:
A1. User behavior data collection; the user behavior data is the data that the e-commerce platform collects when a user accesses the platform over the Internet, and includes at least login, retrieval, browsing, purchase, and evaluation data; each kind of user behavior data also includes the basic attribute information assigned by the e-commerce platform, the basic attribute information including at least session ID, user ID, behavior type, behavior content, user IP, login device, and time;
A2. User behavior data slicing; specifically, the behavior data collected in step A1 is organized by user, and then, with each of the user's transaction sessions on the e-commerce platform as the unit, the user behavior data is divided by transaction session so that each resulting behavior data fragment contains only one transaction topic, and the user's behavior data fragments that contain similar topic words are merged; a transaction session is the time period that is created when the user logs in to the e-commerce platform and destroyed after the user ends the visit;
B. Building the public user interest model through cluster analysis of user behavior data, specifically comprising:
B1. After the behavior data of different users has been sliced in step A, each behavior data fragment is taken as one class and the similarity between all classes is computed; specifically, given two behavior data fragments $u_i$ and $u_j$, the similarity $s(u_i,u_j)$ of their topic word sets is computed by the following formula:
wherein $s(u_i,u_j)$ is the similarity between behavior data fragments $u_i$ and $u_j$, and $v(u_i)$ and $v(u_j)$ are the topic word sets of fragments $u_i$ and $u_j$ respectively; when the intersection of the topic word sets is computed, two retrieved topic words are considered identical only if the words are the same and have the same part of speech;
B2. The two classes with the highest similarity are merged into one class, and the average similarity of the two classes is used as the similarity of the new class; step B2 is repeated until the specified number of classes is obtained;
B3. Topic words are extracted from each class finally obtained in step B2 as interest topics, and the public user interest model is built;
C. The e-commerce platform recommends to the user, specifically:
The e-commerce platform analyzes the target user's behavior data and finds each interest point of the target user in the public user interest model obtained in step B; a collaborative filtering algorithm makes a preliminary recommendation for each interest point separately, and the final recommendation is then made by combining the weight and memory retention of each interest point of the target user with the predicted score of the preliminary recommendation results; let the weight that the $i$-th interest point accounts for in the target user's interest be $\lambda_i$, computed as follows: let $s_i$ be the $i$-th interest point of the target user and $\mathrm{len}(s_i)$ the number of the target user's behavior records contained in interest point $s_i$; the weight $\lambda_i$ that interest point $s_i$ accounts for in the target user's interest is then computed by the following formula:
According to the forgetting pattern of user interest points, a forgetting function $h(t)$ is applied to the interest point weight $\lambda_i$; let $t$ be the time interval from the last behavior record in a user interest point to the time of recommendation; the memory retention of the user interest point is then computed by the following formula:
$h(t) = e^{-t}$
wherein $t$ is measured in months; when the last behavior record occurs at the time of recommendation, the interval is 0 and $h(0)=1$, meaning the user has not yet begun to forget this interest; finally, the preliminary recommendation results of the user's interest points are weighted and sorted to obtain the ranked interest point list $P$; assuming the target user has $n$ interest points and the predicted score of the recommendation results of the $i$-th interest point is $p_i$, the ranked interest point list $P$ is computed as follows:
$P = \mathrm{sort}\big(p_1\lambda_1 h(t_1),\ p_2\lambda_2 h(t_2),\ p_3\lambda_3 h(t_3),\ \ldots,\ p_i\lambda_i h(t_i),\ \ldots,\ p_n\lambda_n h(t_n)\big)$
wherein the sort() function weights the predicted score of the preliminary recommendation results, the interest weight, and the memory retention of each interest point of the target user and sorts the resulting values; $p_i$ is the predicted score of the preliminary recommendation results of the user's interest point $i$, $t_i$ is the interval from that interest point's last behavior record to the time of recommendation, and $h(t_i)$ is the user's memory retention for interest point $i$; finally, according to the computed values in the ranked interest point list, the recommendation results of the interest point with the highest value are provided to the target user, realizing personalized recommendation that accounts for the time-varying, multidimensional, and discrete nature of user interest.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610828355.9A CN106339502A (en) | 2016-09-18 | 2016-09-18 | Modeling recommendation method based on user behavior data fragmentation cluster |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610828355.9A CN106339502A (en) | 2016-09-18 | 2016-09-18 | Modeling recommendation method based on user behavior data fragmentation cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106339502A true CN106339502A (en) | 2017-01-18 |
Family
ID=57840108
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610828355.9A Pending CN106339502A (en) | 2016-09-18 | 2016-09-18 | Modeling recommendation method based on user behavior data fragmentation cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106339502A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102402766A (en) * | 2011-12-27 | 2012-04-04 | 纽海信息技术(上海)有限公司 | User interest modeling method based on webpage browsing |
CN102542489A (en) * | 2011-12-27 | 2012-07-04 | 纽海信息技术(上海)有限公司 | Recommendation method based on user interest association |
CN103106208A (en) * | 2011-11-11 | 2013-05-15 | 中国移动通信集团公司 | Streaming media content recommendation method and system in mobile internet |
CN103399883A (en) * | 2013-07-19 | 2013-11-20 | 百度在线网络技术(北京)有限公司 | Method and system for performing personalized recommendation according to user interest points/concerns |
CN103678710A (en) * | 2013-12-31 | 2014-03-26 | 同济大学 | Information recommendation method based on user behaviors |
CN103927347A (en) * | 2014-04-01 | 2014-07-16 | 复旦大学 | Collaborative filtering recommendation algorithm based on user behavior models and ant colony clustering |
CN104809243A (en) * | 2015-05-15 | 2015-07-29 | 东南大学 | Mixed recommendation method based on excavation of user behavior compositing factor |
CN105426548A (en) * | 2015-12-29 | 2016-03-23 | 海信集团有限公司 | Video recommendation method and device based on multiple users |
CN105512326A (en) * | 2015-12-23 | 2016-04-20 | 成都品果科技有限公司 | Picture recommending method and system |
Non-Patent Citations (3)
Title |
---|
HYUNG JUN AHN: ""A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem"", 《ELSEVIER》 * |
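胡旭 et al.: ""K-means item clustering recommendation algorithm with optimized initial cluster centers"", 《Journal of Air Force Early Warning Academy》 * |
胡畔: ""Cross-platform personalized recommendation algorithm based on association rules and its implementation"", 《China Master's Theses Full-text Database》 * |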
Cited By (68)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106919653A (en) * | 2017-01-24 | 2017-07-04 | 广西师范学院 | Log filtering method based on user behavior |
CN106919653B (en) * | 2017-01-24 | 2020-12-15 | 南宁师范大学 | Log filtering method based on user behavior |
CN106952130B (en) * | 2017-02-27 | 2020-10-27 | 华南理工大学 | General article recommendation method based on collaborative filtering |
CN106952130A (en) * | 2017-02-27 | 2017-07-14 | 华南理工大学 | General article recommendation method based on collaborative filtering |
CN106980662A (en) * | 2017-03-21 | 2017-07-25 | 上海星红桉数据科技有限公司 | User tag classification method based on massive cross-screen viewing behavior data |
CN106851349A (en) * | 2017-03-21 | 2017-06-13 | 上海星红桉数据科技有限公司 | Live broadcast recommendation method based on massive cross-screen viewing behavior data |
CN108733684A (en) * | 2017-04-17 | 2018-11-02 | 合信息技术(北京)有限公司 | Recommendation method and device for multimedia resources |
CN107169821A (en) * | 2017-05-02 | 2017-09-15 | 杭州泰指尚科技有限公司 | Big data query recommendation method and system |
CN107169821B (en) * | 2017-05-02 | 2020-12-15 | 杭州泰一指尚科技有限公司 | Big data query recommendation method and system |
CN107194769A (en) * | 2017-05-17 | 2017-09-22 | 东莞市华睿电子科技有限公司 | Commodity recommendation method based on user search content |
WO2018218403A1 (en) * | 2017-05-27 | 2018-12-06 | 深圳大学 | Content pushing method and device |
CN107357835A (en) * | 2017-06-22 | 2017-11-17 | 电子科技大学 | Interest prediction mining method and system based on topic model and forgetting rule |
CN107357835B (en) * | 2017-06-22 | 2020-11-03 | 电子科技大学 | Interest prediction mining method and system based on topic model and forgetting rule |
CN107391687A (en) * | 2017-07-24 | 2017-11-24 | 华中师范大学 | Hybrid recommendation system for local chronicle websites |
CN107886357A (en) * | 2017-11-06 | 2018-04-06 | 北京希格斯科技发展有限公司 | Method and system for judging content value based on user behavior data |
CN107944485A (en) * | 2017-11-17 | 2018-04-20 | 西安电子科技大学 | Recommendation system and method based on cluster group discovery and personalized recommendation system |
CN107944485B (en) * | 2017-11-17 | 2020-03-06 | 西安电子科技大学 | Recommendation system and method based on cluster group discovery and personalized recommendation system |
CN108717654B (en) * | 2018-05-17 | 2022-03-25 | 南京大学 | Multi-provider cross recommendation method based on clustering feature migration |
CN108717654A (en) * | 2018-05-17 | 2018-10-30 | 南京大学 | Multi-provider cross recommendation method based on clustering feature migration |
CN108874959A (en) * | 2018-06-06 | 2018-11-23 | 电子科技大学 | User dynamic interest model building method based on big data technology |
CN108874959B (en) * | 2018-06-06 | 2022-03-29 | 电子科技大学 | User dynamic interest model building method based on big data technology |
CN108846698A (en) * | 2018-06-14 | 2018-11-20 | 安徽鼎龙网络传媒有限公司 | WeChat store cloud processing compression system for micro-scene management backend |
CN108921670B (en) * | 2018-07-04 | 2022-06-14 | 重庆大学 | Drug transaction recommendation method fusing potential interest, spatio-temporal data and category popularity of user |
CN108921670A (en) * | 2018-07-04 | 2018-11-30 | 重庆大学 | Drug transaction recommendation method fusing potential interest, spatio-temporal data and category popularity of user |
CN109727056A (en) * | 2018-07-06 | 2019-05-07 | 平安科技(深圳)有限公司 | Financial institution recommendation method, device, storage medium and device |
CN109727056B (en) * | 2018-07-06 | 2023-04-18 | 平安科技(深圳)有限公司 | Financial institution recommendation method, device, storage medium and device |
CN109034248B (en) * | 2018-07-27 | 2022-04-05 | 电子科技大学 | Deep learning-based classification method for noise-containing label images |
CN109034248A (en) * | 2018-07-27 | 2018-12-18 | 电子科技大学 | Deep learning-based classification method for noise-containing label images |
WO2020088058A1 (en) * | 2018-10-31 | 2020-05-07 | 北京字节跳动网络技术有限公司 | Information generating method and device |
CN109543109A (en) * | 2018-11-27 | 2019-03-29 | 山东建筑大学 | Recommendation algorithm integrating time window technology and scoring prediction model |
CN109543109B (en) * | 2018-11-27 | 2021-06-22 | 山东建筑大学 | Recommendation algorithm integrating time window technology and scoring prediction model |
CN109684552A (en) * | 2018-12-26 | 2019-04-26 | 云南宾飞科技有限公司 | Intelligent information recommendation system |
CN110135463A (en) * | 2019-04-18 | 2019-08-16 | 微梦创科网络科技(中国)有限公司 | Commodity pushing method and device |
CN110060129A (en) * | 2019-04-22 | 2019-07-26 | 深圳市活力天汇科技股份有限公司 | Intelligent air ticket recommendation method |
CN111861526A (en) * | 2019-04-30 | 2020-10-30 | 京东城市(南京)科技有限公司 | Method and device for analyzing object source |
CN111861526B (en) * | 2019-04-30 | 2024-05-21 | 京东城市(南京)科技有限公司 | Method and device for analyzing object source |
CN110807052B (en) * | 2019-11-05 | 2022-08-02 | 佳都科技集团股份有限公司 | User group classification method, device, equipment and storage medium |
CN110807052A (en) * | 2019-11-05 | 2020-02-18 | 佳都新太科技股份有限公司 | User group classification method, device, equipment and storage medium |
CN110852846A (en) * | 2019-11-11 | 2020-02-28 | 京东数字科技控股有限公司 | Processing method and device for recommended object, electronic equipment and storage medium |
CN111209486B (en) * | 2019-12-19 | 2023-04-11 | 杭州安恒信息技术股份有限公司 | Management platform data recommendation method based on mixed recommendation rule |
CN111209486A (en) * | 2019-12-19 | 2020-05-29 | 杭州安恒信息技术股份有限公司 | Management platform data recommendation method based on mixed recommendation rule |
CN111209474A (en) * | 2019-12-27 | 2020-05-29 | 广东德诚科教有限公司 | Online course recommendation method and device, computer equipment and storage medium |
CN111400591B (en) * | 2020-03-11 | 2023-04-07 | 深圳市雅阅科技有限公司 | Information recommendation method and device, electronic equipment and storage medium |
CN111400591A (en) * | 2020-03-11 | 2020-07-10 | 腾讯科技(北京)有限公司 | Information recommendation method and device, electronic equipment and storage medium |
CN111506813A (en) * | 2020-04-08 | 2020-08-07 | 中国电子科技集团公司第五十四研究所 | Remote sensing information accurate recommendation method based on user portrait |
CN111984874A (en) * | 2020-08-26 | 2020-11-24 | 河南科技大学 | Parallel recommendation method integrating emotion calculation and network crowdsourcing |
CN111984874B (en) * | 2020-08-26 | 2022-07-22 | 河南科技大学 | Parallel recommendation method integrating emotion calculation and network crowdsourcing |
CN112036951A (en) * | 2020-09-03 | 2020-12-04 | 猪八戒股份有限公司 | Business opportunity recommendation method, system, electronic device and medium based on CNN model |
CN112199455A (en) * | 2020-09-14 | 2021-01-08 | 汉海信息技术(上海)有限公司 | Method and device for sorting geographic information points, electronic equipment and computer medium |
CN112765400B (en) * | 2020-12-31 | 2024-04-23 | 上海众源网络有限公司 | Weight updating method, content recommending method, device and equipment for interest labels |
CN112765400A (en) * | 2020-12-31 | 2021-05-07 | 上海众源网络有限公司 | Weight updating method of interest tag, content recommendation method, device and equipment |
CN112988845A (en) * | 2021-04-01 | 2021-06-18 | 毕延杰 | Data information processing method and information service platform in big data service scene |
CN112988845B (en) * | 2021-04-01 | 2021-11-16 | 湖南机械之家信息科技有限公司 | Data information processing method and information service platform in big data service scene |
CN113378065A (en) * | 2021-07-09 | 2021-09-10 | 小红书科技有限公司 | Method for determining content diversity based on sliding spectrum decomposition and method for selecting content |
CN113378065B (en) * | 2021-07-09 | 2023-07-04 | 小红书科技有限公司 | Method for determining content diversity based on sliding spectrum decomposition and method for selecting content |
CN114201680A (en) * | 2021-12-13 | 2022-03-18 | 中数通信息有限公司 | Method for recommending marketing product content to user |
CN114780606A (en) * | 2022-03-30 | 2022-07-22 | 欧阳安安 | Big data mining method and system |
CN114817774A (en) * | 2022-05-12 | 2022-07-29 | 中国人民解放军国防科技大学 | Method for determining social behavior relationship among space-time co-occurrence area, non-public place and user |
CN114817774B (en) * | 2022-05-12 | 2023-08-22 | 中国人民解放军国防科技大学 | Method for determining social behavior relationship among space-time co-occurrence area, non-public place and user |
CN115659046B (en) * | 2022-11-10 | 2023-03-10 | 果子(青岛)数字技术有限公司 | AI big data based technical transaction recommendation system and method |
CN115659046A (en) * | 2022-11-10 | 2023-01-31 | 果子(青岛)数字技术有限公司 | AI big data based technical transaction recommendation system and method |
CN116541731A (en) * | 2023-05-26 | 2023-08-04 | 北京百度网讯科技有限公司 | Processing method, device and equipment of network behavior data |
CN116541731B (en) * | 2023-05-26 | 2024-07-23 | 北京百度网讯科技有限公司 | Processing method, device and equipment of network behavior data |
CN116821228A (en) * | 2023-06-01 | 2023-09-29 | 成都亚保科技有限公司 | Visual configuration method for insurance products based on data analysis |
CN117078362A (en) * | 2023-10-17 | 2023-11-17 | 北京铭洋商务服务有限公司 | Personalized travel route recommendation method and system |
CN117078362B (en) * | 2023-10-17 | 2023-12-29 | 北京铭洋商务服务有限公司 | Personalized travel route recommendation method and system |
CN118396684A (en) * | 2024-06-26 | 2024-07-26 | 广东省广告集团股份有限公司 | User advertisement recommendation method, device and storage medium based on converged neural network |
CN118396684B (en) * | 2024-06-26 | 2024-09-20 | 广东省广告集团股份有限公司 | User advertisement recommendation method and device based on fused neural network and model construction method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106339502A (en) | Modeling recommendation method based on user behavior data fragmentation cluster | |
CN107577759B (en) | Automatic recommendation method for user comments | |
KR102075833B1 (en) | Curation method and system for recommending of art contents | |
CN104199822B (en) | Method and system for identifying the demand category corresponding to a search | |
CN103914478B (en) | Webpage training method and system, webpage prediction method and system | |
CN101246499B (en) | Network information search method and system | |
CN111708740A (en) | Mass search query log calculation analysis system based on cloud platform | |
EP2560111A2 (en) | Systems and methods for facilitating the gathering of open source intelligence | |
US9031944B2 (en) | System and method for providing multi-core and multi-level topical organization in social indexes | |
US20060004753A1 (en) | System and method for document analysis, processing and information extraction | |
CN103455487B (en) | Search term extraction method and device | |
CN105426514A (en) | Personalized mobile APP recommendation method | |
CN103425799A (en) | Personalized research direction recommending system and method based on themes | |
WO2001025947A1 (en) | Method of dynamically recommending web sites and answering user queries based upon affinity groups | |
CN101408886A (en) | Selecting tags for a document by analyzing paragraphs of the document | |
CN102609523A (en) | Collaborative filtering recommendation algorithm based on article sorting and user sorting | |
CN103186550A (en) | Method and system for generating video-related video list | |
CN102236646A (en) | Personalized item-level vertical pagerank algorithm iRank | |
JP2011154668A (en) | Method for recommending the most appropriate information in real time by properly recognizing main idea of web page and preference of user | |
CN104484431A (en) | Multi-source individualized news webpage recommending method based on field body | |
Allisio et al. | Felicittà: Visualizing and estimating happiness in italian cities from geotagged tweets | |
CN103678710A (en) | Information recommendation method based on user behaviors | |
CN110717089A (en) | User behavior analysis system and method based on weblog | |
CN108021715A (en) | Isomery tag fusion system based on semantic structure signature analysis | |
CN102567392A (en) | Control method for interest subject excavation based on time window |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170118 |