
CN106339502A - Modeling recommendation method based on user behavior data fragmentation cluster - Google Patents


Info

Publication number
CN106339502A
Authority
CN
China
Prior art keywords
interest
user
point
behavioral data
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610828355.9A
Other languages
Chinese (zh)
Inventor
陆鑫
邓玉林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201610828355.9A priority Critical patent/CN106339502A/en
Publication of CN106339502A publication Critical patent/CN106339502A/en
Pending legal-status Critical Current


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G06F18/232 Non-hierarchical techniques
    • G06F18/2321 Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0631 Item recommendations

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • General Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to internet personalized recommendation technology, and in particular to a modeling recommendation method based on fragmentation and clustering of user behavior data. The method fragments and clusters user behavior data and builds a dynamic user interest model, on which personalized recommendation is based. It differs from existing personalized recommendation methods as follows: existing methods consider only the time-varying character of user interest, whereas this method also mines multi-dimensional, discrete interest points from the behavior data, so that the user interest model is described more accurately. For the multi-dimensional, discrete interest topics of a target user, the method generates preliminary recommendations for each of the user's interest points concurrently, and then combines the weight and memory degree of each interest point with the predicted score of its preliminary recommendation result to produce the final recommendation, thereby improving the accuracy and processing performance of personalized recommendation.

Description

A modeling recommendation method based on fragmentation and clustering of user behavior data
Technical field
The present invention relates to Internet technology, and in particular to a modeling recommendation method based on fragmentation and clustering of user behavior data.
Background
With the development of Internet applications, information overload has become an increasingly prominent problem. It is extremely difficult for users to find the information that interests them in a massive volume of information. Personalized recommendation technology analyzes large amounts of user behavior data to mine users' interest preferences and potential needs, and a personalized recommendation system then recommends services, goods, or information content likely to interest each user. Personalized recommendation is now widely used in e-commerce, social networks, location-based services, search services, advertising services, and other fields. The best-known examples are e-commerce platforms such as Amazon, Taobao, and JD.com, whose recommendation systems increase their sales by roughly 20% to 30% and bring in substantial revenue. Search engines are an information retrieval tool that people use routinely, and a user's search activity also reveals the content topics that interest the user. Feeding the interest topics and access behavior captured by a search engine into a recommendation system allows the user interest model to be described more accurately.
Analysis of user behavior data on e-commerce platforms shows that a user's interests are not fixed. They generally change with time and place, and exhibit dynamic characteristics such as time variation, multi-dimensionality, and discreteness. For example, when traveling, a user is interested in local transport, hotels, dining, and local customs; at work, the user is interested in information about his or her industry; at leisure, the user is interested in entertainment such as film and television, music, news, and sports. These interests also drift over time, that is, interest points change dynamically, and such transitions are usually discrete as well. For instance, a user who enjoys adventurous, thrilling recreational sports when young may prefer gentle, relaxing leisure activities in middle age. Therefore, when a user's interest points are analyzed from behavior data, the time variation, multi-dimensionality, and discreteness of user interest must all be taken into account in order to describe the user's current interest model precisely.
Existing methods for modeling dynamic user interest fall mainly into two groups: those based on a sliding window and those based on a time parameter. A sliding-window method sets a time window of fixed size that moves forward continuously as time passes. Interest mining considers only the data inside the current window; data that falls outside the window is treated as belonging to the user's earlier interests and is ignored. This method is easy to implement, but a suitable window size is hard to choose because it differs from user to user, and discarding the user data outside the window means that the user's wider range of interests is missed. Time-parameter methods come in many variants, the most representative being the model based on the forgetting curve. It uses the forgetting curve to adjust a user's historical ratings, that is, to assign each rating a corresponding weight, and discards a rating once its adjusted value falls below a threshold. Time-parameter methods rest on the assumption that, when a user interest model is built, the user's recent behavior data matters more than older behavior data because it better reflects the user's current interest; the further in the past a piece of data lies, the less it reflects the user's current interest.
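The two existing approaches described above can be contrasted with a minimal sketch. The record layout, the exponential form of the decay, and the function names are assumptions made for illustration; the text does not specify which forgetting curve the prior methods use.

```python
import math

def sliding_window(records, now, window):
    """Keep only records whose timestamp lies within `window` time units of `now`."""
    return [r for r in records if now - r["t"] <= window]

def forgetting_weighted(records, now, threshold, decay=1.0):
    """Scale each rating by exp(-decay * age); drop ratings that fall below `threshold`.

    The exponential decay is an assumed stand-in for a forgetting curve.
    """
    kept = []
    for r in records:
        weighted = r["rating"] * math.exp(-decay * (now - r["t"]))
        if weighted >= threshold:
            kept.append({**r, "weighted": weighted})
    return kept

# Hypothetical history: an old travel rating and a recent music rating.
records = [
    {"item": "hotel", "t": 1, "rating": 5.0},
    {"item": "music", "t": 9, "rating": 4.0},
]
recent = sliding_window(records, now=10, window=3)
weighted = forgetting_weighted(records, now=10, threshold=0.5)
```

In both cases the old travel rating disappears, which illustrates the limitation the text describes: an interest that is merely dormant is treated the same as one that has been abandoned.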
These methods generally suit only cases in which interest changes gradually. They are less suitable when user interest changes abruptly, that is, when the user jumps between interest points that are far apart. In a large comprehensive e-commerce platform with a search engine in particular, user interest can change greatly with time, place, intent, and other factors, and it shows a degree of discreteness. A time parameter alone cannot accurately describe the dynamic change of a user interest model.
In summary, when a user accesses a large comprehensive e-commerce platform, the user's interest points are dynamically time-varying, multi-dimensional, and discrete. Dynamic time variation means that the user's interest topics change over time. For example, a user may be interested in work-related information during the day and in lifestyle and entertainment in the evening. Multi-dimensionality means that the user has several different preferences in different areas. For example, a user may like several different subjects when studying and several different activities at leisure. Discreteness means that the user's interest topics are far apart from one another. For example, a user's interests in travel and in work are largely unrelated. Traditional personalized recommendation methods generally use a collaborative filtering algorithm, whose principle is to find users whose behavior is similar to the target user's and recommend the goods those users like. However, interest topics differ widely between users and are complex, which makes it difficult to compute the similarity of interest topics between users. Suppose users a, b, and c each have work interests and life interests. If the work-interest similarity of a to b and to c is 50% and 10% respectively, and the life-interest similarity is 50% and 85% respectively, then one clearly cannot simply average the two and conclude that a is more similar to b than to c. Traditional personalized recommendation methods therefore do not fully account for the multi-dimensionality and discreteness of user interest, and cannot deliver good recommendation accuracy when user interest changes dynamically. In addition, users of an e-commerce platform usually rely on the platform's search engine to find the information they need. If the recommendation system does not cluster the search keywords and the associated browsing data, it is difficult to narrow down the range of interest points in the user behavior data, which hinders improvement of the recommendation system's processing performance and recommendation accuracy.
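The a, b, c example can be worked through numerically. The sketch below uses the similarity figures quoted in the text; the dictionary layout and variable names are illustrative only.

```python
# Similarity of user a to users b and c in each interest dimension (figures from the text).
similarity = {
    "b": {"work": 0.50, "life": 0.50},
    "c": {"work": 0.10, "life": 0.85},
}

# Averaging across dimensions ranks b (0.50) slightly above c (0.475).
averaged = {user: sum(dims.values()) / len(dims) for user, dims in similarity.items()}

# Choosing the most similar user separately in each dimension gives a different picture.
best_per_dimension = {
    dim: max(similarity, key=lambda user: similarity[user][dim])
    for dim in ("work", "life")
}
```

Averaging picks b overall, yet for life interests c is by far the better neighbour. This is the motivation for recommending per interest point rather than from a single averaged similarity.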
Summary of the invention
To address the technical limitations of existing personalized recommendation methods when user interest is dynamic, the present invention proposes a modeling recommendation method based on fragmentation and clustering of user behavior data.
The technical solution of the invention is a modeling recommendation method based on fragmentation and clustering of user behavior data, characterized by comprising:
A. Fragmentation of user behavior data, specifically comprising:
A1. Collecting user behavior data. The user behavior data is the data collected by the e-commerce platform when users access it over the Internet, and includes at least the following types: login, search, browse, purchase, and review. Each behavior record also carries basic attribute information assigned by the e-commerce platform, which includes at least the session ID, user ID, behavior type, behavior content, user IP, login device, and time.
A2. Fragmenting the user behavior data. The behavior data collected in step A1 is organized by user and then, for each user, divided by the user's transaction sessions with the e-commerce platform, so that each resulting behavior data fragment covers a single transaction topic. The user's behavior data fragments that contain similar topic words are then merged. A transaction session is the period that begins when the user logs in to the e-commerce platform and is destroyed when the user finishes accessing it.
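Steps A1 and A2 can be sketched as follows. The record fields, the `topics_of` callback, the Jaccard overlap test, and the 0.5 threshold are all assumptions made for illustration; the text does not say how topic words are extracted or how "similar" is decided at this stage.

```python
from collections import defaultdict

def fragment_by_session(records):
    """Group behavior records into fragments keyed by (user_id, session_id)."""
    fragments = defaultdict(list)
    for r in records:
        fragments[(r["user_id"], r["session_id"])].append(r)
    return dict(fragments)

def merge_similar(fragments, topics_of, threshold=0.5):
    """Merge one user's fragments whose topic-word sets overlap strongly.

    `topics_of` maps a fragment (a list of records) to a set of topic words.
    Overlap is measured with the Jaccard index, an assumption of this sketch.
    """
    merged = []  # list of (topic_set, records)
    for frag in fragments:
        topics = topics_of(frag)
        for i, (seen_topics, seen_records) in enumerate(merged):
            union = topics | seen_topics
            if union and len(topics & seen_topics) / len(union) >= threshold:
                merged[i] = (union, seen_records + frag)
                break
        else:
            merged.append((topics, list(frag)))
    return merged

# Hypothetical records for one user across three sessions.
records = [
    {"user_id": "u1", "session_id": "s1", "type": "search", "content": "hotel"},
    {"user_id": "u1", "session_id": "s1", "type": "browse", "content": "hotel"},
    {"user_id": "u1", "session_id": "s2", "type": "search", "content": "hotel"},
    {"user_id": "u1", "session_id": "s3", "type": "search", "content": "guitar"},
]
fragments = fragment_by_session(records)
user_fragments = [frag for (uid, _), frag in fragments.items() if uid == "u1"]
# Stand-in topic extraction: treat the content field as the topic word.
merged = merge_similar(user_fragments, lambda frag: {r["content"] for r in frag})
```

Here the two hotel sessions collapse into one fragment and the guitar session stays separate, so the user ends up with one fragment per distinct topic.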
B. Building a public user interest model by cluster analysis of user behavior data, specifically comprising:
B1. After the behavior data of the different users has been fragmented in step A, each behavior data fragment is treated as one category and the similarity between all categories is computed. Specifically, given two behavior data fragments u_i and u_j, the similarity s(u_i, u_j) of their topic word sets is computed by Formula 1:
where s(u_i, u_j) is the similarity between behavior data fragments u_i and u_j, and v(u_i) and v(u_j) are the topic word sets of u_i and u_j respectively. When the intersection of the topic word sets is computed, two search topic words are considered identical only if the words are the same and have the same part of speech.
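Formula 1 itself is not reproduced in this text. As a minimal sketch, assume it is the Jaccard index of the two topic word sets, which is one common set similarity built on the intersection the text mentions; the actual formula may use a different denominator. Topic words are modeled as (word, part of speech) pairs so that the same word with a different part of speech does not match.

```python
def topic_similarity(v_i, v_j):
    """Jaccard index of two sets of (topic_word, part_of_speech) pairs.

    Assumed form of Formula 1: |v_i & v_j| / |v_i | v_j|.
    """
    union = v_i | v_j
    return len(v_i & v_j) / len(union) if union else 0.0

# Hypothetical topic word sets for two behavior data fragments.
v_ui = {("apple", "noun"), ("phone", "noun"), ("buy", "verb")}
v_uj = {("apple", "noun"), ("book", "noun"), ("book", "verb")}
s = topic_similarity(v_ui, v_uj)

# Same word, different part of speech: treated as different topic words.
s_pos = topic_similarity({("book", "noun")}, {("book", "verb")})
```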
B2. The two categories with the highest similarity are merged into one category, and the average similarity of the two merged categories is used as the similarity of the new category. Step B2 is repeated until the specified number of categories is obtained.
B3. Topic words are extracted from each category finally obtained in step B2 as interest topics, and the public user interest model is built from them.
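Steps B1 to B3 can be sketched as a small bottom-up clustering loop. The Jaccard similarity, the tie-breaking order, and taking the union of topic words as a cluster's interest topics are assumptions of this sketch; the averaging rule for a merged category follows the description in step B2.

```python
def cluster_fragments(topic_sets, k):
    """Merge the most similar pair of categories until k categories remain."""
    def jaccard(a, b):
        union = a | b
        return len(a & b) / len(union) if union else 0.0

    clusters = {i: [i] for i in range(len(topic_sets))}  # category id -> fragment indices
    sim = {
        (i, j): jaccard(topic_sets[i], topic_sets[j])
        for i in clusters for j in clusters if i < j
    }
    next_id = len(topic_sets)
    while len(clusters) > max(k, 1):
        a, b = max(sim, key=sim.get)  # most similar pair of categories
        members = clusters.pop(a) + clusters.pop(b)
        # The merged category's similarity to every other category is the
        # average of the two merged categories' similarities to it.
        for x in clusters:
            s_ax = sim.pop((min(a, x), max(a, x)))
            s_bx = sim.pop((min(b, x), max(b, x)))
            sim[(x, next_id)] = (s_ax + s_bx) / 2
        del sim[(a, b)]
        clusters[next_id] = members
        next_id += 1
    return sorted(sorted(m) for m in clusters.values())

# Hypothetical topic word sets of four behavior data fragments.
topic_sets = [
    {"hotel", "flight"},
    {"hotel", "train"},
    {"guitar", "piano"},
    {"guitar", "drum"},
]
clusters = cluster_fragments(topic_sets, k=2)
# Step B3, simplified: take all topic words in a category as its interest topics.
interest_topics = [set().union(*(topic_sets[i] for i in c)) for c in clusters]
```

The two travel fragments and the two music fragments end up in separate categories, each of which becomes one interest topic of the public model.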
C. The e-commerce platform makes recommendations to the user, as follows:
The e-commerce platform analyzes the behavior data of the target user and finds each of the target user's interest points in the public user interest model obtained in step B. A collaborative filtering algorithm produces a preliminary recommendation for each interest point, and the final recommendation then combines the weight and memory degree of each interest point with the predicted score of its preliminary recommendation result. Let λ_i be the weight of the i-th interest point in the target user's interest. It is computed as follows: let s_i be the i-th interest point of the target user and len(s_i) the number of the target user's behavior records contained in s_i; then the weight λ_i is given by Formula 2:
In accordance with the way users forget interest points, a forgetting function h(t) is applied to the weight λ_i of each interest point. Let t be the time interval from the last behavior record in a given interest point to the time of recommendation; the user's memory degree for that interest point is then given by Formula 3:
$h(t) = e^{-t}$ (Formula 3)
where t is measured in months. When the last behavior record occurs at the recommendation time, the interval is 0 and h(0) = 1, meaning that the user has not yet begun to forget this interest. Finally, the preliminary recommendation results of the user's interest points are weighted and sorted to obtain the interest point ranking list p. Suppose the target user has n interest points and the predicted score of the recommendation result for the i-th interest point is p_i; the ranking list p is then computed by Formula 4:
$p = \mathrm{sort}\big(p_1 \lambda_1 h(t_1),\ p_2 \lambda_2 h(t_2),\ p_3 \lambda_3 h(t_3),\ \ldots,\ p_i \lambda_i h(t_i),\ \ldots,\ p_n \lambda_n h(t_n)\big)$ (Formula 4)
where the sort() function weights the predicted score of the preliminary recommendation result, the interest weight, and the memory degree of each of the target user's interest points and sorts the resulting values. p_i is the predicted score of the preliminary recommendation result for interest point i, t_i is the interval from the last behavior record of that interest point to the recommendation time, and h(t_i) is the user's memory degree for interest point i. Finally, according to the computed values in the ranking list, the recommendation results of the highest-ranked interest point are provided to the target user, giving a personalized recommendation that accounts for the time variation, multi-dimensionality, and discreteness of the user's dynamic interest.
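The scoring in step C can be sketched as below. Formula 2 is not reproduced in this text; the sketch assumes the natural reading that λ_i is the share of the target user's behavior records that fall in interest point s_i. Descending order for sort() and all function and variable names are likewise assumptions.

```python
import math

def interest_weights(record_counts):
    """Assumed Formula 2: lambda_i = len(s_i) / sum over j of len(s_j)."""
    total = sum(record_counts.values())
    return {k: n / total for k, n in record_counts.items()}

def memory(t_months):
    """Formula 3: h(t) = e^{-t}, with t in months."""
    return math.exp(-t_months)

def rank_interest_points(pred_scores, record_counts, months_since_last):
    """Formula 4: rank interest points by p_i * lambda_i * h(t_i), highest first."""
    lam = interest_weights(record_counts)
    scored = {
        k: pred_scores[k] * lam[k] * memory(months_since_last[k])
        for k in pred_scores
    }
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)

# Hypothetical target user with three interest points.
pred_scores = {"travel": 4.5, "music": 4.0, "work": 3.0}       # p_i
record_counts = {"travel": 10, "music": 30, "work": 60}        # len(s_i)
months_since_last = {"travel": 0.0, "music": 1.0, "work": 3.0}  # t_i
ranking = rank_interest_points(pred_scores, record_counts, months_since_last)
```

In this example the travel interest has the fewest records but was active most recently, so it outranks the work interest, which dominates the history but has not been touched for three months.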
By fragmenting user behavior data, the method organizes disordered behavior data into fragments by transaction session and handles both the extraction of keywords from behavior data and the merging of similar behavior data. It also clusters the behavior data fragments: the fragments of all users are classified by the interest topics they contain, the sets of interest points that share similar user behavior are mined, and a public user interest model is built, which improves the accuracy with which users' dynamic interest is modeled. The target user's behavior data is analyzed to obtain the user's interest point set from the public dynamic interest model, and a collaborative filtering algorithm produces a preliminary recommendation for each of the target user's interest points. The predicted score of the preliminary recommendation result, the interest weight, and the memory degree of each interest point are then weighted and the resulting values sorted, and the recommendation results of the highest-ranked interest point are provided to the target user. This addresses the difficulty of personalized recommendation when user interest is time-varying, multi-dimensional, and discrete.
Like traditional personalized recommendation methods for dynamic user interest, the present invention mines users' dynamic interest points by analyzing user behavior data and builds a dynamic user interest model. It likewise uses a collaborative filtering algorithm to make recommendations for user interests, producing personalized recommendation results based on users' dynamic interest. It differs from existing methods in that they consider only the dynamic time variation of user interest, whereas the present invention also mines multi-dimensional, discrete user interest points from the behavior data and so describes the user's dynamic interest model more accurately. In addition, for the target user's multi-dimensional, discrete interest topics, the invention produces preliminary recommendations for the user's interest points concurrently, then weights each interest point's weight, memory degree, and predicted preliminary recommendation score and recommends from the interest point with the highest computed value, thereby improving the accuracy and processing performance of personalized recommendation.
The beneficial effects of the invention are as follows. By fragmenting and clustering the behavior data generated when users access the e-commerce platform, the method handles the time variation, multi-dimensionality, and discreteness of the interest topics contained in that data and can describe users' dynamic interest model accurately, which provides the basis for precise personalized recommendation. For the multi-dimensional interest points extracted by cluster analysis, the method makes a preliminary recommendation for each and then combines the target user's current interest point weights, memory degrees, and predicted preliminary recommendation scores into a combined recommendation, which makes the results more accurate. Furthermore, compared with existing dynamic-interest personalized recommendation methods, the invention fragments and merges user behavior data, which reduces the overhead of the subsequent cluster analysis. Likewise, personalized recommendation is executed in parallel across the user's multi-dimensional interest points extracted by cluster analysis, which can improve the processing performance of the recommendation system.
Brief description of the drawings
Fig. 1 is a system structure diagram of the method model of the invention;
Fig. 2 is an overall flow chart of the processing performed by the method model of the invention;
Fig. 3 is a flow chart of user behavior data fragmentation;
Fig. 4 is a flow chart of user behavior data cluster analysis;
Fig. 5 is a flow chart of recommendation for the target user;
Fig. 6 is a schematic diagram of the user behavior data collection process;
Fig. 7 is a schematic diagram of the principle of user behavior data fragmentation;
Fig. 8 is a schematic diagram of the cluster analysis process for user behavior data fragments.
Detailed description of the embodiments
The present invention is described in detail below with reference to the accompanying drawings and embodiments.
To improve the recommendation accuracy of a personalized recommendation system, the time variation, multi-dimensionality, and discreteness of users' dynamic interest must all be considered. So that large volumes of user behavior data can be collected effectively and analyzed conveniently, the invention takes each transaction session in which a user accesses the e-commerce platform as a unit and organizes the access operations the user performs within that session into one behavior data fragment; every user's behavior data is fragmented in this way. Because each transaction's behavior data fragment carries some interest or intent, the invention extracts the topic words of each behavior data fragment so that the large number of collected fragments can be analyzed. To classify the behavior data fragments of a user along different dimensions, the invention merges the user's fragments by similar topic words and retains only the fragments with distinct topics. After the behavior data fragments of each individual user are obtained, the fragments of all users are clustered to extract the sets of fragments that share similar interest topics across users. The user behavior data in each set indicates that those users share the same interest topic. In this way the multi-dimensional interest topics held across all users are mined, and a public user interest model is built from these interest topics to support personalized recommendation. Moreover, by continually analyzing user behavior data, new interest topics are added to the user interest model, so that the model is updated dynamically. When a personalized recommendation is made for a target user, the set of topics that interest the target user is found first; then, using the purchase data and rating data of the users whose behavior fragments belong to each interest topic, a collaborative filtering algorithm produces a personalized recommendation for each of the target user's interest topics. Finally, the target user's current interest point weights, memory degrees, and predicted preliminary recommendation scores are weighted and sorted, and the recommendation results of the highest-ranked interest point are provided to the target user. The specific processing steps are as follows:
1. Fragmentation of user behavior data. The invention defines one complete transaction session of a user as one behavior data fragment. Fragments are delimited by the creation and destruction of the session, and the user's operation data within that period forms one behavior data fragment.
2. Merging of similar topics across each user's behavior data fragments. First, the topic word set (one or more words) of each behavior data fragment is extracted, and each topic word is tagged with its part of speech according to the content the user browsed, which resolves the merging problem caused by polysemy. Second, the topic words of the behavior data fragments are compared for similarity, and highly similar fragments are merged, which reduces the overhead of the subsequent cluster analysis. The result is the user's set of behavior data fragments with multi-dimensional topics.
3. Mining users' potential interest points by cluster analysis and building the public dynamic user interest model. Behavior data fragments with similar topic words necessarily carry similar interest topics, so the invention clusters the feature vectors of all users' behavior data fragments to find the sets of fragments with similar topic words, extracts users' multi-dimensional, discrete interest points from them, and builds the user interest model from these interest points. First, the tf-idf (topic word weight) and part-of-speech information of the topic words in each behavior data fragment are extracted, and a feature vector is generated for each fragment. Second, the similarity between behavior data fragments is computed from the feature vectors, and a bottom-up hierarchical clustering algorithm iteratively clusters the fragments to obtain sets of fragments with similar interests. Then, the topic words with the higher tf-idf values among all fragments in each set are extracted to give the interest topics of all users, from which the public user interest model is built.
4. Executing the personalized recommendation algorithm for each interest point of the target user. First, the target user's behavior data fragments are analyzed, and all of the user's interest points are found in the public user interest model. Then the collaborative filtering algorithm is executed in parallel across the interest points, producing a preliminary personalized recommendation result for each of the user's interest points.
5. Weighting the target user's current interest point weights, memory degrees, and scores, sorting, and providing the recommendation results of the interest point with the highest computed value to the target user. First, the weight, memory degree, and predicted recommendation score of each of the target user's interest points are computed and weighted. Then the weighted values of the interest points are sorted, and the recommendation results of the interest point with the highest weighted value are taken as the final recommendation.
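Step 4 above runs collaborative filtering once per interest point, in parallel. The sketch below shows one way this could look, using a toy user-based collaborative filter with cosine similarity and a thread pool from the standard library. The rating layout, the choice of user-based filtering, and every function name are assumptions; the text does not specify which collaborative filtering variant or parallel framework is used.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def cosine(r1, r2):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(r1) & set(r2)
    num = sum(r1[i] * r2[i] for i in common)
    den = math.sqrt(sum(v * v for v in r1.values())) * math.sqrt(sum(v * v for v in r2.values()))
    return num / den if den else 0.0

def recommend_for_topic(target, ratings):
    """Toy user-based collaborative filtering within one interest topic.

    Returns (item, predicted_score) for the best item the target has not
    rated, or None when there is no candidate.
    """
    mine = ratings.get(target, {})
    totals, weights = {}, {}
    for user, theirs in ratings.items():
        if user == target:
            continue
        w = cosine(mine, theirs)
        if w <= 0:
            continue
        for item, score in theirs.items():
            if item not in mine:
                totals[item] = totals.get(item, 0.0) + w * score
                weights[item] = weights.get(item, 0.0) + w
    if not totals:
        return None
    preds = {item: totals[item] / weights[item] for item in totals}
    best = max(preds, key=preds.get)
    return best, preds[best]

def recommend_all(target, ratings_by_topic):
    """Submit one recommendation task per interest point and collect the results."""
    with ThreadPoolExecutor() as pool:
        futures = {
            topic: pool.submit(recommend_for_topic, target, ratings)
            for topic, ratings in ratings_by_topic.items()
        }
        return {topic: f.result() for topic, f in futures.items()}

# Hypothetical ratings, already grouped by interest topic.
ratings_by_topic = {
    "travel": {"a": {"tent": 5}, "b": {"tent": 4, "backpack": 5}},
    "music": {"a": {"guitar": 4}, "c": {"guitar": 5, "amp": 3}},
}
result = recommend_all("a", ratings_by_topic)
```

Each topic is independent of the others, which is what makes the per-interest-point step easy to parallelize. Threads are used here only to keep the sketch short; for CPU-bound work in Python a process pool or a distributed job would be the more realistic choice.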
As shown in Fig. 1, the method model of the invention involves four parts: the e-commerce platform, the behavior data collection module, the user interest mining module, and the system recommendation module. The e-commerce platform is the application on which the recommendation system is built. Besides providing e-commerce services to customers, it records in the platform database and log system the behavior data generated when users search for, browse, purchase, and review goods. The behavior data collection module collects the relevant user behavior data and user rating data from the customer database, the log system, and the product database. The user interest mining module fragments the user behavior data, extracts the feature vector of each behavior data fragment, and clusters the fragments on that basis to mine users' multi-dimensional, discrete interest points and build the public user interest model of the e-commerce platform. The recommendation module analyzes the target user's behavior data, extracts the target user's interest point set from the public user interest model, and uses collaborative filtering to provide the e-commerce platform's personalized recommendations to the target user.
In the method model of the invention, the user interest mining module consists mainly of processing units for user behavior data fragmentation, extraction of behavior data fragment feature vectors, computation of behavior data fragment similarity, and cluster analysis of behavior data fragments. The fragmentation unit fragments behavior data by user transaction session and merges the user's data fragments by similar topic words, yielding a group of behavior data fragments with distinct topics. The feature vector extraction unit extracts the tf-idf value of each topic word in every behavior data fragment and arranges the topic words and their tf-idf values in the order of the Chinese vocabulary table to generate the fragment's feature vector. The feature vector represents the features of the behavior data fragment and is used to compute the similarity between fragments. The similarity computation unit performs two kinds of computation. The first computes the similarity among all behavior data fragments of a single user, so that fragments with similar topic words can be merged. The second computes the similarity among the behavior data fragment feature vectors of all users of the platform, providing the similarity measure that the cluster analysis unit needs. The cluster analysis unit clusters the behavior fragment data of all users and mines a group of data fragment sets with distinct topics. The data fragments in each set share similar interest points, and from them the public user interest model of the e-commerce platform is built. The system recommendation module analyzes the target user's behavior data and extracts the target user's interest point set from the public user interest model. It then executes the collaborative filtering recommendation algorithm for each of the target user's interest points to generate preliminary recommendation results. Finally, the weight and memory degree of each interest point and the predicted score of its preliminary recommendation result are weighted, and the recommendation results of the interest point with the highest computed value are taken as the final personalized recommendation.
As shown in Fig. 2, the personalized recommendation process of the present invention is divided into a subprocess that builds the public user interest model and a subprocess that recommends to the target user. The public user interest model is built in four steps. First, while users access the e-commerce platform, the platform automatically records each user's operation data. Second, the user behavior acquisition module collects each user's behavior data from the operation database and the log database and organizes it into a behavior data list per user. Third, the model splits each user's behavior data by transaction session into a number of behavior data fragments containing topic words, and merges the fragments of a single user that share the same topic words. Finally, the behavior data fragments of all users are clustered to obtain a number of fragment sets with different topic words. Each fragment set embodies a similar interest topic, and the interest model of the public users is built from these sets.
The personalized recommendation process for the target user is completed in three steps. First, the target user's behavior data fragments are analyzed, and all interest points of the target user are found in the public user interest model. Second, a collaborative filtering algorithm generates preliminary recommendation results for each of the user's interest points. Finally, the weight and memory degree of each interest point and the predicted score of its preliminary recommendation results are combined in a weighted calculation, and the recommendation results of the interest point with the highest value are provided to the target user, completing the personalized recommendation process.
The processing methods of the key modules of the present invention are described below.
1. User behavior data collection
User behavior data is the data foundation of personalized recommendation. The inventive method collects not only users' search and browsing data but also their purchase and rating data. Search and browsing data are mainly used to mine users' dynamic, multi-dimensional, discrete interest points, while purchase and rating data are used for commodity recommendation in collaborative filtering. User behavior data is collected mainly from the user database, the commodity database, and the log system, and covers five behavior types: login, search, browsing, purchase, and rating. Each collected behavior record must contain the session ID, user ID, commodity ID, behavior type, and behavior content, as well as attribute information such as the timestamp, browsing terminal, and location. The records are sorted by session ID to generate the user behavior data list, which simplifies the later fragmentation and cluster analysis.
2. User behavior data fragmentation
On an e-commerce platform, each user session has a fairly clear purpose, so the user operations within a session embody a certain interest topic. The present invention therefore splits user behavior data into fragments, one per user transaction session. To make the cluster analysis of behavior data fragments efficient, this unit also extracts topic words from the content of each fragment and then merges the fragments that have similar topic words. The process is shown in Fig. 3.
As Fig. 3 shows, the user behavior data fragmentation process consists of the following main steps:
1) Read the behavior data of a single user from the database of the acquisition module, including search, browsing, purchase, rating, and login behavior data, among others.
2) Define each complete transaction session of the user as one behavior data fragment. Specifically, the series of operations performed between the creation and the destruction of a user session forms one behavior data fragment.
3) Extract the topic words of the user's search and browsing information from the content of each fragment to generate the fragment's topic word set. Each topic word is assigned a part of speech (label) according to the browsed content, which resolves the polysemy problem of natural language.
4) Merge the behavior data fragments of a single user that have similar topic word sets. The similarity of the topic word sets of the fragments is compared, and fragments with high similarity are merged, yielding a group of the user's behavior data fragments with different topic words.
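Step 2 above can be illustrated with a minimal Python sketch. The record layout (`session_id`, `timestamp`, `behavior`, `content`) and the function name `shard_by_session` are assumptions made for illustration; the patent only states that each record carries a session ID and a timestamp among its attributes.

```python
from collections import defaultdict

def shard_by_session(records):
    """Group one user's behavior records into fragments, one per session.

    records: list of dicts with hypothetical keys 'session_id' and 'timestamp'.
    Returns {session_id: [records in chronological order]}.
    """
    fragments = defaultdict(list)
    for rec in records:
        fragments[rec["session_id"]].append(rec)
    # Order the operations inside each fragment by time.
    return {sid: sorted(recs, key=lambda r: r["timestamp"])
            for sid, recs in fragments.items()}

records = [
    {"session_id": "s1", "timestamp": 2, "behavior": "browse", "content": "running shoes"},
    {"session_id": "s2", "timestamp": 5, "behavior": "search", "content": "laptop"},
    {"session_id": "s1", "timestamp": 1, "behavior": "search", "content": "shoes"},
]
fragments = shard_by_session(records)
```

Each value of `fragments` corresponds to one behavior data fragment; topic word extraction and merging (steps 3 and 4) would then operate on these fragments.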
3. User behavior cluster analysis
User behavior cluster analysis processes the behavior data fragments of all users and mines the interest topics of the public users from them. It comprises two processing units: feature vector extraction for user behavior data fragments, and cluster analysis of user behavior data fragments. Because behavior data fragments with similar topic words necessarily embody similar interest topics, this module calculates the similarity of the topic words of the fragments and uses cluster analysis to mine sets of fragments with similar topic words. From these sets it extracts users' multi-dimensional, discrete interest points and builds the user interest model. The process is shown in Fig. 4:
1) Extract the feature vectors of the user behavior data fragments. From the topic word set of each fragment obtained during fragmentation, calculate the tf-idf value of each topic word; this value measures the importance of the topic word. Arranging the topic words and their tf-idf values in the order of a Chinese vocabulary list yields the feature vector of the behavior data fragment.
2) Calculate the similarity between behavior data fragments. A vector space model is built with each topic word of the feature vector as one dimension. When two feature vectors are orthogonal, the similarity of the fragments is 0; when the feature vectors coincide, the similarity is 100%. The similarity between fragments can therefore be calculated with the cosine formula, and the resulting cosine value is the similarity between the behavior data fragments.
3) Run the hierarchical clustering program to cluster the data fragments. The present invention uses a bottom-up hierarchical clustering algorithm that repeatedly merges the two most similar classes of behavior data fragments until all user behavior data fragments are clustered. First, each behavior data fragment is treated as one class. Then the two classes with the highest similarity are merged into one class. This is iterated until the specified number of classes is reached.
4) Select the optimal clustering level and determine the clustering result. The bottom-up hierarchical clustering algorithm ultimately produces a tree-shaped clustering result. By finding the level at which the inter-class similarity changes most, the optimal clustering result can be determined, which gives sets of user behavior data fragments with multi-dimensional, discrete interest topics.
5) Build the public user interest model from the clustering result. From each fragment set in the clustering result, extract the topic word with the highest tf-idf value among the fragments it contains to obtain a user interest topic (interest point). Organizing these multi-dimensional, discrete user interest topics together with their behavior data fragment sets constitutes the public user interest model.
4. Personalized recommendation for the target user
When users access the e-commerce platform, their interests are dynamically time-varying, multi-dimensional, and discrete; that is, the different aspects of a user's interests can lie far apart. Personalized recommendation precision can therefore be improved effectively only by recommending separately for each of the user's interest points. First, personalized recommendation is executed separately for each interest point of the target user to generate preliminary recommendation results. Then the weight and memory degree of each interest point and the predicted score of its preliminary recommendation results are combined in a weighted calculation, and the recommendation results of the interest point with the highest value are provided to the target user. The processing flow is shown in Fig. 5.
Fig. 5 is the personalized recommendation flow chart for the target user. The steps are as follows:
1) Find the interest points of the target user. Based on the public user interest model, analyze the target user's behavior data fragments and find the user's set of interest points.
2) For each interest point of the target user, execute the collaborative filtering recommendation algorithm separately to generate the preliminary recommendation results of that interest point.
3) Rank the target user's interest points. The weight and memory degree of each interest point and the predicted score of its preliminary recommendation results are combined in a weighted calculation, and the interest points are sorted by the result. The weighted values and their order reflect how interested the target user currently is in each interest point.
4) Generate the personalized recommendation result. From the list of weighted values of the interest points, choose the recommendation results of the interest point with the highest value as the final recommendation.
Embodiment:
1. User behavior data collection
Unlike traditional personalized recommender systems, which collect only users' purchase and rating data, this example also collects users' search and browsing data. Each collected behavior record must contain the session ID, user ID, commodity ID, behavior type, and behavior content, as well as attribute information such as the timestamp, browsing terminal, and location. These basic attributes support the user behavior data fragmentation in the next step. The collection process is shown in Fig. 6.
As shown in Fig. 6, the user behavior data acquisition module first collects behavior data of the login, search, browsing, purchase, and rating types from the user database, commodity database, and log system of the e-commerce platform. Every behavior record contains basic attribute information (such as session ID, user ID, behavior type, behavior content, user IP, login device, and time). To keep the collected user behavior data complete, the e-commerce platform creates a session when the user starts accessing the system and destroys the session information after the user exits. After the user logs in to the e-commerce platform, the session ID is associated with the user's behavior data list. The login behavior data contains the user's session information and can be used for fragmenting the user behavior data. Search and browsing data are mainly used to mine users' dynamic, multi-dimensional, discrete interest points in order to build the public user interest model. Purchase and rating data are used for personalized recommendation.
2. User behavior data fragmentation process
Most of a user's transaction visits to the e-commerce platform serve a purpose of interest; that is, all operations within the transaction share the same interest topic. Dividing user behavior data into fragments by transaction session therefore lets each fragment embody one topic. All behavior data fragments of each user are then merged according to the similar topic words they contain, which improves the performance of the subsequent user interest mining. The principle of user behavior data fragmentation is shown in Fig. 7.
1) Read the behavior data of each user from the user behavior database and organize it into a behavior data list by user ID. Besides the basic attribute data, the data read also includes the operation data related to each behavior.
2) Taking a single user transaction session as the unit, divide the group of behavior data in the transaction into one behavior data fragment. Specifically, with the creation and destruction of the user session as boundaries, all of the user's behavior data within that period forms one behavior data fragment. Invalid behavior data (such as exiting immediately after logging in) is filtered out to reduce the noise in the user behavior data.
3) Extract the topic words of each behavior data fragment. For a fragment of search behavior, the topic words are the search keywords. For browsing and purchase behavior data, the present invention extracts the latent topic words of the behavior content as follows. First, a Chinese word segmentation module segments the text of the behavior content, and meaningless function words are filtered out, giving the set of words that the behavior content contains. Then the importance of each word is calculated with the tf-idf algorithm. Let $t_i$ be the number of occurrences of word i and $t$ the number of occurrences of all words; the tf-idf value of word i is then given by Formula 5:
$\text{tf-idf}_i = \frac{t_i}{t} \cdot \log\frac{d}{d_i}$ (Formula 5)
Here, $t_i / t$ is the term frequency (tf) of word i in the detailed description of the browsed content. The inverse document frequency (idf) of word i in the detailed descriptions is $\log(d / d_i)$, where $d$ is the total number of commodities and $d_i$ is the number of commodity descriptions in which word i occurs. The product of tf and idf gives the importance of each word. The several words with the highest importance are selected as the latent topic word set of the browsed or purchased content. In addition, to further improve the precision of the extracted topic words, labels are attached to the topic words according to the attributes (category, purpose, etc.) of the content the user browsed or bought, which resolves the polysemy problem. Let k be a topic word of a user behavior data fragment and s the browsed content information of that topic word; the topic word set of a fragment with n topic words can then be expressed as (k_1<s_1>, k_2<s_2>, …, k_i<s_i>, …, k_n<s_n>).
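The tf-idf calculation described above can be sketched in Python as follows. The function name `tf_idf`, the toy commodity descriptions, and the choice to skip words that appear in no description are assumptions for illustration; word segmentation and function-word filtering are assumed to have been done already.

```python
import math
from collections import Counter

def tf_idf(words, descriptions):
    """Return {word: tf-idf} for the words of one behavior content.

    words: segmented, filtered words of the content.
    descriptions: list of word sets, one per commodity description.
    """
    counts = Counter(words)
    total = sum(counts.values())      # t: occurrences of all words
    d = len(descriptions)             # d: total number of commodities
    scores = {}
    for word, t_i in counts.items():
        # d_i: number of commodity descriptions containing the word
        d_i = sum(1 for desc in descriptions if word in desc)
        if d_i == 0:
            continue                  # assumption: skip words unseen in the corpus
        scores[word] = (t_i / total) * math.log(d / d_i)
    return scores

descriptions = [
    {"red", "running", "shoes"},
    {"leather", "shoes"},
    {"gaming", "laptop"},
    {"laptop", "bag"},
]
scores = tf_idf(["running", "running", "shoes"], descriptions)
top_words = sorted(scores, key=scores.get, reverse=True)
```

In this toy data "running" ranks above "shoes" because it is both more frequent in the content and rarer across the commodity descriptions.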
4) Merge the behavior data fragments that have similar topic words. The similarity of the topic word sets of the fragments is calculated, and when the similarity of two fragments exceeds a certain threshold (such as 80%), the two fragments are merged. The similarity between behavior data fragments can be calculated with the cosine similarity for sets. Given two behavior data fragments $u_i$ and $u_j$, the similarity $s(u_i, u_j)$ of their topic word sets is given by Formula 6:
$s(u_i, u_j) = \frac{|v(u_i) \cap v(u_j)|}{\sqrt{|v(u_i)|\,|v(u_j)|}}$ (Formula 6)
Here, $s(u_i, u_j)$ is the similarity between behavior data fragments $u_i$ and $u_j$, and $v(u_i)$ and $v(u_j)$ are the topic word sets of $u_i$ and $u_j$ respectively. When the intersection of the topic word sets is computed, two topic words are considered identical only if the words are the same and have the same part of speech. With this method, behavior data fragments whose topic word sets are highly similar are merged, which reduces the data volume of the subsequent analysis and improves the execution performance of user interest mining.
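A small Python sketch of this merging step follows. It assumes the standard set form of cosine similarity (intersection size divided by the square root of the product of the set sizes) and represents each topic word as a hypothetical `(word, label)` tuple so that two topic words match only when both the word and its label agree. The greedy single-pass merge strategy is also an assumption; the text does not prescribe a merge order.

```python
import math

def set_cosine(a, b):
    """Set cosine similarity |a ∩ b| / sqrt(|a| * |b|)."""
    if not a or not b:
        return 0.0
    return len(a & b) / math.sqrt(len(a) * len(b))

def merge_similar(fragments, threshold=0.8):
    """Greedily merge topic word sets whose similarity exceeds the threshold."""
    merged = []
    for topics in fragments:
        for i, existing in enumerate(merged):
            if set_cosine(existing, topics) > threshold:
                merged[i] = existing | topics   # union of the two fragments
                break
        else:
            merged.append(set(topics))          # no similar fragment found
    return merged

f1 = {("apple", "fruit"), ("juice", "drink")}
f2 = {("apple", "fruit"), ("juice", "drink")}
f3 = {("apple", "phone"), ("case", "accessory")}
merged = merge_similar([f1, f2, f3])
```

Here `f1` and `f2` merge, while `f3` stays separate: its "apple" carries a different label and therefore does not count as the same topic word.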
3. Implementation of user behavior data cluster analysis
Because behavior data fragments with similar topic word sets embody similar interest points, the purpose of this module is to analyze the topic word information of all user behavior data fragments with clustering techniques. Fragments with similar topic word feature vectors are clustered together into sets of fragments that share a user interest point. From these sets the multi-dimensional, discrete interest point set of the users is extracted, and the public user interest model is built.
Before the hierarchical clustering algorithm can be used, the feature vectors of the user behavior data fragments must be calculated. First, the present invention takes the topic word sets of the fragments and their part-of-speech information obtained during fragmentation. Second, it calculates the frequency (tf) with which each topic word occurs in each behavior data fragment, in the same way as the term frequency in Formula 5. Then it calculates the inverse document frequency (idf) of each topic word as $\log(d / d_i)$, where $d$ is the number of behavior data fragments of all users and $d_i$ is the number of times topic word i occurs across all behavior data fragments. Multiplying the tf value of each topic word by its idf value gives the tf-idf value of the topic word. Finally, the topic words and their tf-idf values are arranged in the order of a common vocabulary list, giving the feature vector of each behavior data fragment. The feature vector reflects the interest characteristics of the user behavior data fragment.
Once the feature vector of each behavior data fragment has been obtained, the bottom-up hierarchical clustering algorithm is executed to complete the cluster analysis. The cluster analysis process is illustrated in Fig. 8.
First, each behavior data fragment is treated as one class; in Fig. 8, for example, there are 30 behavior data fragments, each forming its own class. Then the similarity between the classes is calculated from the feature vectors of the fragments, and the two classes with the highest similarity are merged into one class. When the two classes contain multiple behavior data fragments, the average similarity between the fragments of the two classes is used as the similarity of the two classes. This is iterated until the specified number of classes is reached, which finally produces the tree-shaped clustering result in Fig. 8.
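The bottom-up procedure with average similarity between classes can be sketched as below. This is a minimal, unoptimized illustration; the function names, the precomputed similarity matrix `sim`, and the returned merge history are assumptions. It expects `target` to be at least 1.

```python
from itertools import combinations

def average_linkage(sim, a, b):
    """Mean pairwise similarity between the members of clusters a and b."""
    return sum(sim[i][j] for i in a for j in b) / (len(a) * len(b))

def agglomerate(sim, target):
    """Merge the two most similar clusters until `target` clusters remain.

    sim: symmetric matrix of fragment-to-fragment similarities.
    Returns the final clusters and the similarity recorded at each merge.
    """
    clusters = [[i] for i in range(len(sim))]   # each fragment starts as a class
    history = []
    while len(clusters) > target:
        a, b = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(sim, clusters[p[0]], clusters[p[1]]),
        )
        history.append(average_linkage(sim, clusters[a], clusters[b]))
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]                          # a < b, so index a stays valid
    return clusters, history

sim = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.15, 0.1],
    [0.1, 0.15, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
clusters, history = agglomerate(sim, target=2)
```

With the four toy fragments above, fragments 0 and 1 merge first, then fragments 2 and 3, leaving two classes.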
The present invention measures the similarity between the feature vectors of behavior data fragments with the cosine formula. Let $(x_1, x_2, \ldots, x_n)$ and $(y_1, y_2, \ldots, y_n)$ be the feature vectors of behavior data fragments x and y (note: missing topic words can be padded with zeros to resolve inconsistent feature vector lengths). The similarity $\cos\theta$ of x and y is then given by Formula 7:
$\cos\theta = \frac{\sum_{k=1}^{n} x_k y_k}{\sqrt{\sum_{k=1}^{n} x_k^2}\,\sqrt{\sum_{k=1}^{n} y_k^2}}$ (Formula 7)
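As a hedged illustration of the cosine measure with zero padding, the sketch below stores each feature vector as a dictionary from topic word to tf-idf value, so a topic word missing from one fragment is implicitly treated as 0. The dictionary representation and the function name are assumptions; the text itself describes ordered vectors.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between two sparse feature vectors.

    x, y: {topic_word: tf-idf}; a word missing from one vector counts as 0.
    """
    dot = sum(value * y.get(word, 0.0) for word, value in x.items())
    norm_x = math.sqrt(sum(v * v for v in x.values()))
    norm_y = math.sqrt(sum(v * v for v in y.values()))
    if norm_x == 0 or norm_y == 0:
        return 0.0                      # an empty vector is similar to nothing
    return dot / (norm_x * norm_y)

orthogonal = cosine_similarity({"shoes": 1.0}, {"laptop": 2.0})
identical = cosine_similarity({"shoes": 3.0, "socks": 4.0},
                              {"shoes": 3.0, "socks": 4.0})
```

Fragments with no topic word in common give 0, and identical vectors give 1, matching the orthogonal and coinciding cases described for the vector space model.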
After the bottom-up hierarchical clustering algorithm has produced the tree-shaped clustering result, it remains to decide which level to take as the final clustering result. Research shows that after two classes with the same topic are merged, the similarity of the merged class to the other classes does not change much. After two classes with different topics are merged, however, the similarity to the other classes drops markedly. The level immediately before the merge that causes the largest change in inter-class similarity is therefore taken as the optimal clustering result.
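One possible reading of this selection rule is sketched below. As a simplifying assumption, it uses the similarity recorded at each merge as a proxy for the change in inter-class similarity and cuts the tree just before the merge with the largest drop. The function name `best_cut` and this proxy are illustrative choices, not the patent's stated procedure.

```python
def best_cut(history):
    """Return how many merges to keep, given the similarity at each merge.

    history: merge similarities in the order the merges happened.
    The tree is cut just before the merge whose similarity drops the most
    relative to the previous merge.
    """
    if len(history) < 2:
        return len(history)
    drops = [history[k - 1] - history[k] for k in range(1, len(history))]
    return drops.index(max(drops)) + 1

# Two merges of same-topic classes, then one merge across topics.
kept = best_cut([0.9, 0.8, 0.15])
```

For four fragments, keeping two merges leaves two classes, which is the level before the cross-topic merge.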
The cluster analysis above yields a group of user behavior data fragment sets with similar interests. The topic word with the highest tf-idf value is extracted from each set as its interest topic, which gives the interest topic set of the public users. Organizing these interest topics together with their behavior data fragment sets establishes the interest model of the public users.
4. Implementation of personalized recommendation for the target user
When users access the e-commerce platform, their interests are time-varying, multi-dimensional, and discrete. For this situation, recommending separately for each interest point of the target user gives higher precision than personalized recommendation based on all of the user's behavior data together. However, because the user likes each interest point to a different degree, the personalized recommendation results of the interest points also carry different weights. For example, suppose user interest point $s_a$ contains 1000 user behaviors while another interest point $s_b$ contains only 20. Even if the predicted score of recommendation result a in $s_a$ is slightly lower than that of recommendation result b in $s_b$, the user probably likes a far more than b. Furthermore, if the latest behavior record of interest point $s_a$ is much older than that of interest point $s_b$, the user may instead like commodity b more than commodity a, because the user's interest may have changed. How to calculate the weight of each of the user's interest points, weight the recommendation results of each interest point accordingly, and sort them into the final personalized recommendation list is therefore the key to obtaining accurate personalized recommendations.
The present invention finds the target user's latent interest points in the user interest model and extracts all user rating data related to those interest topics. It then runs a user-based collaborative filtering algorithm to obtain the preliminary recommendation results of each interest point and their predicted scores $p_i$. The more interested a user is in an interest point, the more operations related to that interest point the user performs, so the present invention calculates the weight $\lambda_i$ of each interest point of the target user from the number of user behavior records in the interest point. Let $s_i$ be the i-th interest point of the target user and $len(s_i)$ the number of behavior records contained in $s_i$; the weight $\lambda_i$ of interest point $s_i$ is then given by Formula 8:
$\lambda_i = \frac{len(s_i)}{len(s_1) + len(s_2) + \cdots + len(s_i) + \cdots + len(s_n)}$ (Formula 8)
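The weight calculation is a simple normalization of behavior record counts, which can be sketched as follows. The function name and the dictionary input are assumptions for illustration; the counts reuse the 1000-versus-20 example from the text.

```python
def interest_weights(record_counts):
    """Weight of each interest point: its record count over the total count.

    record_counts: {interest_point: number of behavior records}.
    """
    total = sum(record_counts.values())
    if total == 0:
        return {name: 0.0 for name in record_counts}
    return {name: count / total for name, count in record_counts.items()}

weights = interest_weights({"s_a": 1000, "s_b": 20})
```

The weights sum to 1, and the interest point with 1000 records receives roughly 98% of the weight.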
The method above gives the interest point weights of the target user. User interest is also dynamically time-varying, however: the longer ago an interest point was last active, the lower the user's interest in it. With reference to the forgetting curve drawn by the German psychologist Hermann Ebbinghaus, user interest points are found to follow the same forgetting pattern. The present invention therefore applies a forgetting function $h(t)$ to the weight $\lambda_i$ of each interest point according to this forgetting pattern. Let $t$ be the time interval between the last behavior record in a user interest point and the time of recommendation; the memory degree of the user interest point is then given by Formula 9:
$h(t) = e^{-t}$ (Formula 9)
Here, the unit of $t$ is months. When the last behavior record occurs at the time of recommendation, the interval is 0, so $h(0) = 1$, which means the user has not yet begun to forget this interest. Finally, the preliminary recommendation results of the user's interest points are weighted and sorted to obtain the interest point ranking list $P$. If the target user has n interest points, the ranking list $P$ is computed as in Formula 10:
$P = \mathrm{sort}(p_1 \lambda_1 h(t_1),\ p_2 \lambda_2 h(t_2),\ p_3 \lambda_3 h(t_3),\ \ldots,\ p_i \lambda_i h(t_i),\ \ldots,\ p_n \lambda_n h(t_n))$ (Formula 10)
In Formula 10, the sort() function combines the predicted score of the preliminary recommendation results, the interest weight, and the memory degree of each interest point of the target user in a weighted calculation and sorts the resulting values. $p_i$ is the predicted score of the preliminary recommendation results of the user's i-th interest point, $t_i$ is the interval between that interest point's last behavior record and the time of recommendation, and $h(t_i)$ is the user's memory degree for interest point i. Finally, according to the calculated values in the ranking list, the recommendation results of the interest point with the highest value are provided to the target user. This realizes personalized recommendation that takes into account the time-varying, multi-dimensional, and discrete character of the user's dynamic interests.
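The final ranking can be sketched in Python as below. The tuple layout and the example scores are invented for illustration only; the forgetting function follows the exponential form with the interval measured in months.

```python
import math

def memory_degree(months):
    """Forgetting function: e^(-t), with t in months since the last record."""
    return math.exp(-months)

def rank_interest_points(points):
    """Sort interest points by score * weight * memory degree, highest first.

    points: list of (name, predicted_score, weight, months_since_last_record).
    """
    scored = [(name, p * lam * memory_degree(t)) for name, p, lam, t in points]
    return sorted(scored, key=lambda item: item[1], reverse=True)

ranking = rank_interest_points([
    ("s_a", 4.2, 0.98, 6.0),   # heavily used but stale interest point
    ("s_b", 4.5, 0.02, 0.0),   # lightly used but current interest point
])
best_interest_point = ranking[0][0]
```

In this toy case the stale interest point loses despite its much larger weight, because six months of forgetting shrinks its value far below that of the current interest point.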

Claims (1)

1. A modeling recommendation method based on user behavior data fragmentation and clustering, characterized by comprising:
a. user behavior data processing, specifically including:
a1. user behavior data collection; the user behavior data refers to the data collected by the e-commerce platform when a user accesses the e-commerce platform over the Internet, and includes at least data of the login, search, browsing, purchase, and rating types; every item of user behavior data also includes basic attribute information assigned by the e-commerce platform, the basic attribute information including at least the session ID, user ID, behavior type, behavior content, user IP, login device, and time;
a2. user behavior data fragmentation; specifically, the behavior data collected in step a1 is organized by user, and then, taking each of the user's transaction sessions with the e-commerce platform as the unit, the user behavior data is divided by transaction session so that each resulting behavior data fragment contains only one transaction topic, and the user's behavior data fragments that contain similar topic words are merged; the transaction session refers to the time segment that is created when the user logs in to the e-commerce platform and destroyed after the user ends the visit;
b. building the public user interest model through cluster analysis of user behavior data, specifically including:
b1. after the behavior data of the different users has been fragmented in step a, treating each behavior data fragment as one class and calculating the similarity between all classes; specifically: given two behavior data fragments $u_i$ and $u_j$, the similarity $s(u_i, u_j)$ of their topic word sets is calculated as:
$s(u_i, u_j) = \frac{|v(u_i) \cap v(u_j)|}{\sqrt{|v(u_i)|\,|v(u_j)|}}$
where $s(u_i, u_j)$ is the similarity between behavior data fragments $u_i$ and $u_j$, and $v(u_i)$ and $v(u_j)$ are the topic word sets of $u_i$ and $u_j$ respectively; when the intersection of the topic word sets is computed, two topic words are considered identical only if the words are the same and have the same part of speech;
b2. merging the two classes with the highest similarity into one class, and using the average similarity of the two classes as the similarity of the new class; step b2 is repeated until the specified number of classes is obtained;
b3. extracting a topic word from each class finally obtained in step b2 as an interest topic, and building the public user interest model;
c. the e-commerce platform recommending to the user, specifically:
the e-commerce platform analyzes the behavior data of the target user, finds each interest point of the target user in the public user interest model obtained in step b, generates preliminary recommendations for each interest point with a collaborative filtering algorithm, and then makes the final recommendation by combining the weight and memory degree of each interest point of the target user with the predicted score of the preliminary recommendation results; the weight of the i-th interest point in the target user's interests is $\lambda_i$, calculated as follows: let $s_i$ be the i-th interest point of the target user and $len(s_i)$ the number of the target user's behavior records contained in interest point $s_i$; the weight $\lambda_i$ of interest point $s_i$ in the target user's interests is then:
$\lambda_i = \frac{len(s_i)}{len(s_1) + len(s_2) + \cdots + len(s_i) + \cdots + len(s_n)}$
according to the forgetting pattern of user interest points, a forgetting function $h(t)$ is applied to the weight $\lambda_i$ of each interest point; let $t$ be the time interval between the last behavior record in a user interest point and the time of recommendation; the memory degree of the user interest point is then:
$h(t) = e^{-t}$
where the unit of $t$ is months; when the last behavior record occurs at the time of recommendation, the interval is 0, so $h(0) = 1$, which means the user has not yet begun to forget this interest; finally, the preliminary recommendation results of the user's interest points are weighted and sorted to obtain the interest point ranking list $P$; assuming the target user has n interest points and the predicted score of the recommendation results of the i-th interest point is $p_i$, the ranking list $P$ is computed as:
$P = \mathrm{sort}(p_1 \lambda_1 h(t_1),\ p_2 \lambda_2 h(t_2),\ p_3 \lambda_3 h(t_3),\ \ldots,\ p_i \lambda_i h(t_i),\ \ldots,\ p_n \lambda_n h(t_n))$
where the sort() function combines the predicted score of the preliminary recommendation results, the interest weight, and the memory degree of each interest point of the target user in a weighted calculation and sorts the resulting values; $p_i$ is the predicted score of the preliminary recommendation results of the user's interest point i, $t_i$ is the interval between that interest point's last behavior record and the time of recommendation, and $h(t_i)$ is the user's memory degree for interest point i; finally, according to the calculated values in the ranking list, the recommendation results of the interest point with the highest value are provided to the target user, thereby realizing personalized recommendation that takes into account the time-varying, multi-dimensional, and discrete character of the user's dynamic interests.
CN201610828355.9A 2016-09-18 2016-09-18 Modeling recommendation method based on user behavior data fragmentation cluster Pending CN106339502A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610828355.9A CN106339502A (en) 2016-09-18 2016-09-18 Modeling recommendation method based on user behavior data fragmentation cluster

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610828355.9A CN106339502A (en) 2016-09-18 2016-09-18 Modeling recommendation method based on user behavior data fragmentation cluster

Publications (1)

Publication Number Publication Date
CN106339502A true CN106339502A (en) 2017-01-18

Family

ID=57840108

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610828355.9A Pending CN106339502A (en) 2016-09-18 2016-09-18 Modeling recommendation method based on user behavior data fragmentation cluster

Country Status (1)

Country Link
CN (1) CN106339502A (en)

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103106208A (en) * 2011-11-11 2013-05-15 中国移动通信集团公司 Streaming media content recommendation method and system in mobile internet
CN102402766A (en) * 2011-12-27 2012-04-04 纽海信息技术(上海)有限公司 User interest modeling method based on webpage browsing
CN102542489A (en) * 2011-12-27 2012-07-04 纽海信息技术(上海)有限公司 Recommendation method based on user interest association
CN103399883A (en) * 2013-07-19 2013-11-20 百度在线网络技术(北京)有限公司 Method and system for performing personalized recommendation according to user interest points/concerns
CN103678710A (en) * 2013-12-31 2014-03-26 同济大学 Information recommendation method based on user behaviors
CN103927347A (en) * 2014-04-01 2014-07-16 复旦大学 Collaborative filtering recommendation algorithm based on user behavior models and ant colony clustering
CN104809243A (en) * 2015-05-15 2015-07-29 东南大学 Mixed recommendation method based on excavation of user behavior compositing factor
CN105512326A (en) * 2015-12-23 2016-04-20 成都品果科技有限公司 Picture recommending method and system
CN105426548A (en) * 2015-12-29 2016-03-23 海信集团有限公司 Video recommendation method and device based on multiple users

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
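HYUNG JUN AHN: "A new similarity measure for collaborative filtering to alleviate the new user cold-starting problem", ELSEVIER *
Hu Xu et al.: "K-means item clustering recommendation algorithm with optimized initial cluster centers" (初始聚类中心优化的K-均值项目聚类推荐算法), Journal of Air Force Early Warning Academy (空军预警学院学报) *
Hu Pan: "Cross-platform personalized recommendation algorithm based on association rules and its implementation" (基于关联规则的跨平台个性化推荐算法及实现), China Master's Theses Full-text Database (中国优秀硕士学位论文全文数据库) *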

Cited By (68)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106919653A (en) * 2017-01-24 2017-07-04 广西师范学院 Daily record filter method based on user behavior
CN106919653B (en) * 2017-01-24 2020-12-15 南宁师范大学 Log filtering method based on user behavior
CN106952130B (en) * 2017-02-27 2020-10-27 华南理工大学 General article recommendation method based on collaborative filtering
CN106952130A (en) * 2017-02-27 2017-07-14 华南理工大学 Common user item based on collaborative filtering recommends method
CN106980662A (en) * 2017-03-21 2017-07-25 上海星红桉数据科技有限公司 Based on magnanimity across the user tag sorting technique for shielding viewing behavior data
CN106851349A (en) * 2017-03-21 2017-06-13 上海星红桉数据科技有限公司 Based on magnanimity across the live recommendation method for shielding viewing behavior data
CN108733684A (en) * 2017-04-17 2018-11-02 合信息技术(北京)有限公司 The recommendation method and device of multimedia resource
CN107169821A (en) * 2017-05-02 2017-09-15 杭州泰指尚科技有限公司 Big data inquires about recommendation method and its system
CN107169821B (en) * 2017-05-02 2020-12-15 杭州泰一指尚科技有限公司 Big data query recommendation method and system
CN107194769A (en) * 2017-05-17 2017-09-22 东莞市华睿电子科技有限公司 A kind of Method of Commodity Recommendation that content is searched for based on user
WO2018218403A1 (en) * 2017-05-27 2018-12-06 深圳大学 Content pushing method and device
CN107357835A (en) * 2017-06-22 2017-11-17 电子科技大学 It is a kind of that method for digging and system are predicted based on the interest of topic model and forgetting law
CN107357835B (en) * 2017-06-22 2020-11-03 电子科技大学 Interest prediction mining method and system based on topic model and forgetting rule
CN107391687A (en) * 2017-07-24 2017-11-24 华中师范大学 A kind of mixing commending system towards local chronicle website
CN107886357A (en) * 2017-11-06 2018-04-06 北京希格斯科技发展有限公司 The method and system of content value is judged based on user behavior data
CN107944485A (en) * 2017-11-17 2018-04-20 西安电子科技大学 The commending system and method, personalized recommendation system found based on cluster group
CN107944485B (en) * 2017-11-17 2020-03-06 西安电子科技大学 Recommendation system and method based on cluster group discovery and personalized recommendation system
CN108717654B (en) * 2018-05-17 2022-03-25 南京大学 Multi-provider cross recommendation method based on clustering feature migration
CN108717654A (en) * 2018-05-17 2018-10-30 南京大学 A kind of more electric business intersection recommendation method based on cluster feature migration
CN108874959A (en) * 2018-06-06 2018-11-23 电子科技大学 A kind of user's dynamic interest model method for building up based on big data technology
CN108874959B (en) * 2018-06-06 2022-03-29 电子科技大学 User dynamic interest model building method based on big data technology
CN108846698A (en) * 2018-06-14 2018-11-20 安徽鼎龙网络传媒有限公司 A kind of micro- scene management backstage wechat store cloud processing compressibility
CN108921670B (en) * 2018-07-04 2022-06-14 重庆大学 Drug transaction recommendation method fusing potential interest, spatio-temporal data and category popularity of user
CN108921670A (en) * 2018-07-04 2018-11-30 重庆大学 A kind of potential interest of fusion user, the Drug trading recommended method of space-time data and classification popularity
CN109727056A (en) * 2018-07-06 2019-05-07 平安科技(深圳)有限公司 Financial institution's recommended method, equipment, storage medium and device
CN109727056B (en) * 2018-07-06 2023-04-18 平安科技(深圳)有限公司 Financial institution recommendation method, device, storage medium and device
CN109034248B (en) * 2018-07-27 2022-04-05 电子科技大学 Deep learning-based classification method for noise-containing label images
CN109034248A (en) * 2018-07-27 2018-12-18 电子科技大学 A kind of classification method of the Noise label image based on deep learning
WO2020088058A1 (en) * 2018-10-31 2020-05-07 北京字节跳动网络技术有限公司 Information generating method and device
CN109543109A (en) * 2018-11-27 2019-03-29 山东建筑大学 A kind of proposed algorithm of time of fusion window setting technique and score in predicting model
CN109543109B (en) * 2018-11-27 2021-06-22 山东建筑大学 Recommendation algorithm integrating time window technology and scoring prediction model
CN109684552A (en) * 2018-12-26 2019-04-26 云南宾飞科技有限公司 A kind of intelligent information recommendation system
CN110135463A (en) * 2019-04-18 2019-08-16 微梦创科网络科技(中国)有限公司 A kind of commodity method for pushing and device
CN110060129A (en) * 2019-04-22 2019-07-26 深圳市活力天汇科技股份有限公司 A kind of air ticket intelligent recommendation method
CN111861526A (en) * 2019-04-30 2020-10-30 京东城市(南京)科技有限公司 Method and device for analyzing object source
CN111861526B (en) * 2019-04-30 2024-05-21 京东城市(南京)科技有限公司 Method and device for analyzing object source
CN110807052B (en) * 2019-11-05 2022-08-02 佳都科技集团股份有限公司 User group classification method, device, equipment and storage medium
CN110807052A (en) * 2019-11-05 2020-02-18 佳都新太科技股份有限公司 User group classification method, device, equipment and storage medium
CN110852846A (en) * 2019-11-11 2020-02-28 京东数字科技控股有限公司 Processing method and device for recommended object, electronic equipment and storage medium
CN111209486B (en) * 2019-12-19 2023-04-11 杭州安恒信息技术股份有限公司 Management platform data recommendation method based on mixed recommendation rule
CN111209486A (en) * 2019-12-19 2020-05-29 杭州安恒信息技术股份有限公司 Management platform data recommendation method based on mixed recommendation rule
CN111209474A (en) * 2019-12-27 2020-05-29 广东德诚科教有限公司 Online course recommendation method and device, computer equipment and storage medium
CN111400591B (en) * 2020-03-11 2023-04-07 深圳市雅阅科技有限公司 Information recommendation method and device, electronic equipment and storage medium
CN111400591A (en) * 2020-03-11 2020-07-10 腾讯科技(北京)有限公司 Information recommendation method and device, electronic equipment and storage medium
CN111506813A (en) * 2020-04-08 2020-08-07 中国电子科技集团公司第五十四研究所 Remote sensing information accurate recommendation method based on user portrait
CN111984874A (en) * 2020-08-26 2020-11-24 河南科技大学 Parallel recommendation method integrating emotion calculation and network crowdsourcing
CN111984874B (en) * 2020-08-26 2022-07-22 河南科技大学 Parallel recommendation method integrating emotion calculation and network crowdsourcing
CN112036951A (en) * 2020-09-03 2020-12-04 猪八戒股份有限公司 Business opportunity recommendation method, system, electronic device and medium based on CNN model
CN112199455A (en) * 2020-09-14 2021-01-08 汉海信息技术(上海)有限公司 Method and device for sorting geographic information points, electronic equipment and computer medium
CN112765400B (en) * 2020-12-31 2024-04-23 上海众源网络有限公司 Weight updating method, content recommending method, device and equipment for interest labels
CN112765400A (en) * 2020-12-31 2021-05-07 上海众源网络有限公司 Weight updating method of interest tag, content recommendation method, device and equipment
CN112988845A (en) * 2021-04-01 2021-06-18 毕延杰 Data information processing method and information service platform in big data service scene
CN112988845B (en) * 2021-04-01 2021-11-16 湖南机械之家信息科技有限公司 Data information processing method and information service platform in big data service scene
CN113378065A (en) * 2021-07-09 2021-09-10 小红书科技有限公司 Method for determining content diversity based on sliding spectrum decomposition and method for selecting content
CN113378065B (en) * 2021-07-09 2023-07-04 小红书科技有限公司 Method for determining content diversity based on sliding spectrum decomposition and method for selecting content
CN114201680A (en) * 2021-12-13 2022-03-18 中数通信息有限公司 Method for recommending marketing product content to user
CN114780606A (en) * 2022-03-30 2022-07-22 欧阳安安 Big data mining method and system
CN114817774A (en) * 2022-05-12 2022-07-29 中国人民解放军国防科技大学 Method for determining social behavior relationship among space-time co-occurrence area, non-public place and user
CN114817774B (en) * 2022-05-12 2023-08-22 中国人民解放军国防科技大学 Method for determining social behavior relationship among space-time co-occurrence area, non-public place and user
CN115659046B (en) * 2022-11-10 2023-03-10 果子(青岛)数字技术有限公司 AI big data based technical transaction recommendation system and method
CN115659046A (en) * 2022-11-10 2023-01-31 果子(青岛)数字技术有限公司 AI big data based technical transaction recommendation system and method
CN116541731A (en) * 2023-05-26 2023-08-04 北京百度网讯科技有限公司 Processing method, device and equipment of network behavior data
CN116541731B (en) * 2023-05-26 2024-07-23 北京百度网讯科技有限公司 Processing method, device and equipment of network behavior data
CN116821228A (en) * 2023-06-01 2023-09-29 成都亚保科技有限公司 Visual configuration method for insurance products based on data analysis
CN117078362A (en) * 2023-10-17 2023-11-17 北京铭洋商务服务有限公司 Personalized travel route recommendation method and system
CN117078362B (en) * 2023-10-17 2023-12-29 北京铭洋商务服务有限公司 Personalized travel route recommendation method and system
CN118396684A (en) * 2024-06-26 2024-07-26 广东省广告集团股份有限公司 User advertisement recommendation method, device and storage medium based on converged neural network
CN118396684B (en) * 2024-06-26 2024-09-20 广东省广告集团股份有限公司 User advertisement recommendation method and device based on fused neural network and model construction method thereof

Similar Documents

Publication Publication Date Title
CN106339502A (en) Modeling recommendation method based on user behavior data fragmentation cluster
CN107577759B (en) Automatic recommendation method for user comments
KR102075833B1 (en) Curation method and system for recommending of art contents
CN104199822B (en) It is a kind of to identify the method and system for searching for corresponding demand classification
CN103914478B (en) Webpage training method and system, webpage Forecasting Methodology and system
CN101246499B (en) Network information search method and system
CN111708740A (en) Mass search query log calculation analysis system based on cloud platform
EP2560111A2 (en) Systems and methods for facilitating the gathering of open source intelligence
US9031944B2 (en) System and method for providing multi-core and multi-level topical organization in social indexes
US20060004753A1 (en) System and method for document analysis, processing and information extraction
CN103455487B (en) The extracting method and device of a kind of search term
CN105426514A (en) Personalized mobile APP recommendation method
CN103425799A (en) Personalized research direction recommending system and method based on themes
WO2001025947A1 (en) Method of dynamically recommending web sites and answering user queries based upon affinity groups
CN101408886A (en) Selecting tags for a document by analyzing paragraphs of the document
CN102609523A (en) Collaborative filtering recommendation algorithm based on article sorting and user sorting
CN103186550A (en) Method and system for generating video-related video list
CN102236646A (en) Personalized item-level vertical pagerank algorithm iRank
JP2011154668A (en) Method for recommending the most appropriate information in real time by properly recognizing main idea of web page and preference of user
CN104484431A (en) Multi-source individualized news webpage recommending method based on field body
Allisio et al. Felicittà: Visualizing and estimating happiness in italian cities from geotagged tweets
CN103678710A (en) Information recommendation method based on user behaviors
CN110717089A (en) User behavior analysis system and method based on weblog
CN108021715A (en) Isomery tag fusion system based on semantic structure signature analysis
CN102567392A (en) Control method for interest subject excavation based on time window

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20170118