WO2017005119A1 - Method and device for implementing individualized guidance
- Publication number
- WO2017005119A1 (application PCT/CN2016/087465)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- user
- access
- frequent
- access sequence
- frequent access
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
Definitions
- the invention belongs to the technical field of computers, and in particular relates to a method and a device for implementing personalized guidance.
- a customized menu is provided for users on the application webpage.
- after the user selects a function button on the webpage, a menu pops up from which the user chooses the next operation; the pop-up menu may also be a translucent mask layer containing options for the next operation, which provide functional guidance to the user.
- the guidance options in existing mask layers are set in advance according to business experience.
- in a feature-rich application system the user can choose from a very large number of options, and not all of them can be included in the mask layer.
- Even the options included in the mask layer are not necessarily the most reasonable ones: some preconfigured options may be rarely used by the user, while commonly used ones are not necessarily in the mask layer.
- the Chinese invention patent with the publication number of CN 103092471A discloses a method and a terminal for realizing a dynamic function menu.
- the method counts how often users use each function menu to obtain usage frequency statistics and, according to those statistics, dynamically adjusts the ordering of the function menu list or layout in an arrangement or layout form preset by the system.
- the menu can be properly arranged or adjusted according to the user's usage habits or form requirements, so as to improve the user experience of the menu and facilitate the operation of the user.
- that invention only counts the usage frequency of function menus that already exist and adjusts the function menu list according to those statistics; that is, it only rearranges the existing function menus.
- for the user's online operations, there are more and more options for the next operation target, which may be another website or another section of the same webpage.
- the prior art cannot guide the user to the next operation based on the user's current operation; the user must find and click through to the page for the next operation on their own, which wastes a great deal of time and energy.
- in an application system with complicated functions, operation is even more inconvenient for the user.
- An object of the present invention is to provide a method and a device for implementing personalized guidance that can provide personalized guidance for the user's next operation according to the user's current operation behavior, historical operation information, and user attribute information, thereby facilitating the user's operation and improving the user experience.
- a method of implementing personalized guidance including:
- the frequent access mode set is obtained by performing frequent access pattern mining according to a user history access sequence set.
- the performing frequent access pattern mining according to the user history access sequence set includes:
- Frequent access pattern mining for user history access sequence sets results in a set of frequent access patterns.
- the pre-processing according to the web server log file includes data purification, user identification, session identification, and path replenishing steps.
- the frequent access pattern mining is performed on the user history access sequence set, and the prefixspan data mining algorithm is adopted.
- each access sequence in the user access sequence set is matched with the items in the frequent access mode set, and the longest matching selection method is adopted.
- in one implementation, the longest matching selection method treats as matched any frequent access pattern that contains more than a first preset threshold of the webpage nodes in the user access sequence.
- in another implementation, the longest matching selection method treats as matched the first N frequent access patterns that match the most webpage nodes in the user access sequence.
- the invention also proposes a device for implementing personalized guidance, which comprises:
- a mining module configured to perform frequent access pattern mining according to a user history access sequence set to obtain a frequent access mode set
- An identification module configured to identify a user access sequence set according to user access information
- a matching module configured to match each access sequence in the user access sequence set with an item in the frequent access mode set to obtain a corresponding rule set
- a guidance module configured to recommend the latter items of all rules in the rule set to the user as a function guidance list.
- the mining module when performing the frequent access mode mining, performs the following operations:
- Frequent access pattern mining for user history access sequence sets results in a set of frequent access patterns.
- the mining module performs data purification, user identification, session identification, and path supplement processing on the web server log file during preprocessing; and uses the prefixspan data mining algorithm when performing frequent access pattern mining.
- the matching module uses the longest matching selection method for matching.
- the method and device for implementing personalized guidance provided by the present invention preprocess the web log data of user visits, mine the users' frequent access pattern set, match against it to obtain a rule set, and output a function guidance list to the user's client browser so that the user can select the next operation. This provides personalized guidance for the user's next operation, facilitates the user's operation, and improves the user experience.
- FIG. 1 is a flow chart of the offline processing process in a method for implementing personalized guidance according to the present invention
- FIG. 2 is a flow chart of a method for implementing personalized guidance according to the present invention
- FIG. 3 is a schematic structural diagram of an apparatus for implementing personalized guidance according to the present invention.
- the general idea of the present invention is to preprocess the web log data, use the prefixspan algorithm to mine a set of frequent access sequence patterns from the users' web access logs, and then, given the user's current access sequence, match it against the frequent access pattern set using the longest matching selection method to obtain the corresponding rule set.
- the web server program then processes the rule set and outputs a function guidance list to the client browser.
- This embodiment is described using the "Yushanfang" (御膳房) Alibaba Cloud public computing platform as an example.
- the "Yushanfang" Alibaba Cloud public computing platform provides precision marketing solutions, merchant research solutions, member marketing solutions, enterprise cloud data solutions, and so on for Taobao merchants, independent software vendors, enterprises, and scientific research institutions. Because users have a very large number of options to choose from, they need function guidance based on their frequent access patterns.
- a method for implementing personalized guidance includes an offline processing process and an online processing process, which are separately described below.
- the offline processing process runs the web server log file through data purification, user identification, session identification, path supplementation, and other steps to obtain the user historical access sequence set, and then applies the optimized prefixspan data mining algorithm to that set to mine the frequent access pattern set.
- An access sequence is a user's access to a website, which contains all the page nodes that have been visited at one time. The accessed page nodes are arranged in chronological order as access sequences.
- data purification extracts the necessary fields from the web server log file, such as user ID, time, and page node ID; user identification fills in the user ID for users who are not logged in, using a correspondence table between cookies and user IDs; session identification splits each user's access path into multiple sessions in units of a specified time, for example 60 minutes, that is, into multiple access sequences, with each session corresponding to one access sequence; finally, because page nodes that a user visited may be missing from the log due to server data synchronization, path supplementation uses the website structure map to fill in the missing paths for subsequent analysis. After this preprocessing, the user historical access sequence set is obtained.
- the specific prefixspan algorithm is implemented as follows:
- each input sequence corresponds to an access sequence, and each frequent sequence pattern finally output is a frequent access pattern; all the frequent access patterns together constitute the frequent access pattern set.
- the prefixspan algorithm is a commonly used data mining algorithm.
- the core idea of the online processing part is to have the web server process the user access information collected by the client to obtain the user's access path, and to perform session identification on that path to obtain the user access sequence set; each access sequence in the set is then matched with the items in the frequent access pattern set using the longest matching selection method to obtain a corresponding rule set; finally, the latter items of all rules in the rule set are recommended to the user as a function guidance list.
- the user logs in to the "Yushanfang" Alibaba Cloud public computing platform on their own computer.
- the "Yushanfang" Alibaba Cloud public computing platform collects user access information through the client browser and sends it to the platform's back-end web server.
- the web server performs session identification on the user access information, and divides the user's access path into multiple sessions in a predetermined time unit (for example, 60 minutes), that is, splits into multiple access sequences to obtain a user access sequence set. Different from the offline processing process, what is obtained here is the user's current access sequence set.
- the longest matching selection method is used to match each access sequence with the items in the frequent access pattern set, where each item in the set is one frequent access pattern.
- under the longest matching selection method, a matched item contains most of the webpage nodes in the user access sequence. For example, an item may be treated as a match if it contains more than 70% of the webpage nodes in the user access sequence; alternatively, the first N frequent access patterns that match the most webpage nodes in the user access sequence may be treated as the matched frequent access patterns.
- for example, suppose the user access sequence A1 contains 10 webpage nodes, and B1, B2, and B3 are the three frequent access patterns in the frequent access pattern set that best match A1.
- B1 contains all 10 webpage nodes of user access sequence A1
- B2 contains 9 of the 10 webpage nodes of user access sequence A1
- B3 contains 8 of the 10 webpage nodes of user access sequence A1
- if N is set to 3, then B1, B2, and B3 are the matched frequent access patterns, and together they form the rule set.
- the latter item of a rule refers to the webpage nodes in the rule that are not included in the user access sequence. These are the webpage nodes that the user may visit next.
- for example, suppose a frequent access pattern B contains 15 webpage nodes, 10 of which are the same as the webpage nodes of the user access sequence; the remaining 5 webpage nodes are then the nodes that the user may visit next.
- the latter item of all rules is recommended to the user as a function guide list, and displayed as a mask on the user's browser for the user to select. Therefore, the user can directly select the webpage node in the mask layer that he wants to access, and realize direct access.
- the frequent pattern of the user is mined and stored in the database.
- the user's current access sequence, i.e., the current operation path
- the set of next operations, such as "Go to the authorization center", is found from the user's frequent access pattern set according to the longest matching selection method and transmitted to the user operation page.
- the page dynamically changes the links in the navigation tab so that options such as "Go to the authorization center" are displayed for the user to select, and the navigation tab is displayed in the user's browser as a mask layer.
- the device for implementing the personalized guidance based on the foregoing method, as shown in FIG. 3, includes:
- a mining module configured to perform frequent access pattern mining according to a user history access sequence set to obtain a frequent access mode set
- An identification module configured to identify a user access sequence set according to user access information
- a matching module configured to match each access sequence in the user access sequence set with an item in the frequent access mode set to obtain a corresponding rule set
- a guidance module configured to recommend the latter items of all rules in the rule set to the user as a function guidance list.
- Frequent access pattern mining for user history access sequence sets results in a set of frequent access patterns.
- the mining module needs to perform data purification, user identification, session identification, and path supplement processing on the web server log file during the pre-processing; when performing frequent access pattern mining, the prefixspan data mining algorithm is adopted.
- the matching module uses the longest matching selection method to perform matching.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Transfer Between Computers (AREA)
Abstract
A method and device for implementing individualized guidance. The method comprises: according to a historical access sequence set of a user, conducting frequent access pattern mining to obtain a frequent access pattern set, and then, according to current access information about the user, recognizing a user access sequence set (S1); matching each access sequence in the user access sequence set with items in the frequent access pattern set, so as to obtain a corresponding rule set (S2); and recommending to the user the latter items of all the rules in the rule set as a function guidance list (S3). Accordingly, the device comprises a mining module, a recognition module, a matching module and a guidance module. The method and device can provide individualized guidance for the next operation of a user, thereby facilitating the operation of the user and improving the user experience.
Description
The present invention belongs to the field of computer technology, and in particular relates to a method and a device for implementing personalized guidance.
In the field of computer technology, and especially in the various applications currently provided over the Internet, application webpages offer users customized menus. When the user selects a function button on a webpage, a menu pops up from which the user chooses the next operation. The pop-up menu may also be a translucent mask layer containing options for the next operation, which provide functional guidance to the user.
However, the guidance options in existing mask layers are set in advance according to business experience. In a feature-rich application system the user can choose from a very large number of options, and not all of them can be included in the mask layer. Even the options that are included are not necessarily the most reasonable ones: some preconfigured options may be rarely used by the user, while commonly used ones are not necessarily in the mask layer.
Chinese invention patent publication No. CN 103092471A discloses a method and a terminal for implementing a dynamic function menu. The method counts how often users use each function menu to obtain usage frequency statistics and, according to those statistics, dynamically adjusts the ordering of the function menu list or layout in an arrangement or layout form preset by the system. Menus can thus be arranged or laid out according to the user's usage habits or form requirements, which improves the user experience of the menu and facilitates the user's operation.
However, that invention only counts the usage frequency of function menus that already exist and adjusts the function menu list according to those statistics; that is, it only rearranges the existing function menus. For a user's online operations, there are more and more options for the next operation target, which may be another website or another section of the same webpage. The prior art cannot guide the user to the next operation based on the user's current operation. The user must find and click through to the page for the next operation on their own, which wastes a great deal of time and energy, and operation is even more inconvenient in an application system with complicated functions.
Summary of the invention
An object of the present invention is to provide a method and a device for implementing personalized guidance that can provide personalized guidance for the user's next operation according to the user's current operation behavior, historical operation information, and user attribute information, thereby facilitating the user's operation and improving the user experience.
To achieve the above object, the technical solution of the present invention is as follows:
A method for implementing personalized guidance, comprising:
identifying a user access sequence set according to user access information;
matching each access sequence in the user access sequence set with the items in a frequent access pattern set to obtain a corresponding rule set;
recommending the latter items of all rules in the rule set to the user as a function guidance list;
wherein the frequent access pattern set is obtained by performing frequent access pattern mining on a user historical access sequence set.
Further, performing frequent access pattern mining on the user historical access sequence set comprises:
preprocessing the web server log file to obtain the user historical access sequence set;
performing frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set.
The preprocessing of the web server log file includes the steps of data purification, user identification, session identification, and path supplementation. The frequent access pattern mining on the user historical access sequence set uses the prefixspan data mining algorithm.
In the present invention, each access sequence in the user access sequence set is matched with the items in the frequent access pattern set using a longest matching selection method.
In one implementation of the present invention, the longest matching selection method treats as matched any frequent access pattern that contains more than a first preset threshold of the webpage nodes in the user access sequence.
In another implementation of the present invention, the longest matching selection method treats as matched the first N frequent access patterns that match the most webpage nodes in the user access sequence.
The present invention also provides a device for implementing personalized guidance, comprising:
a mining module, configured to perform frequent access pattern mining on a user historical access sequence set to obtain a frequent access pattern set;
an identification module, configured to identify a user access sequence set according to user access information;
a matching module, configured to match each access sequence in the user access sequence set with the items in the frequent access pattern set to obtain a corresponding rule set;
a guidance module, configured to recommend the latter items of all rules in the rule set to the user as a function guidance list.
Further, when performing frequent access pattern mining, the mining module performs the following operations:
preprocessing the web server log file to obtain the user historical access sequence set;
performing frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set.
During preprocessing, the mining module performs data purification, user identification, session identification, and path supplementation on the web server log file; when performing frequent access pattern mining, it uses the prefixspan data mining algorithm.
Further, the matching module performs matching using the longest matching selection method.
The method and device for implementing personalized guidance provided by the present invention preprocess the web log data of user visits, mine the users' frequent access pattern set, match against it to obtain a rule set, and output a function guidance list to the user's client browser so that the user can select the next operation. This provides personalized guidance for the user's next operation, facilitates the user's operation, and improves the user experience.
FIG. 1 is a flow chart of the offline processing process in a method for implementing personalized guidance according to the present invention;
FIG. 2 is a flow chart of a method for implementing personalized guidance according to the present invention;
FIG. 3 is a schematic structural diagram of a device for implementing personalized guidance according to the present invention.
The technical solution of the present invention is described in further detail below with reference to the accompanying drawings and embodiments. The following embodiments do not constitute a limitation of the present invention.
The general idea of the present invention is to preprocess the web log data, use the prefixspan algorithm to mine a set of frequent access sequence patterns from the users' web access logs, and then, given the user's current access sequence, match it against the frequent access pattern set using the longest matching selection method to obtain a corresponding rule set. Finally, the web server program processes the rule set and outputs a function guidance list to the client browser. This embodiment is described using the "Yushanfang" (御膳房) Alibaba Cloud public computing platform as an example. The platform provides precision marketing solutions, merchant research solutions, member marketing solutions, enterprise cloud data solutions, and so on for Taobao merchants, independent software vendors, enterprises, and scientific research institutions. Because users have a very large number of options to choose from, they need function guidance based on their frequent access patterns.
The method for implementing personalized guidance in this embodiment includes an offline processing process and an online processing process, which are described separately below.
As shown in FIG. 1, the offline processing process runs the web server log file through data purification, user identification, session identification, path supplementation, and other steps to obtain the user historical access sequence set, and then applies the optimized prefixspan data mining algorithm to that set to mine the frequent access pattern set.
All user visits to the "Yushanfang" Alibaba Cloud public computing platform are recorded in the web server log file. After the log file has been processed by data purification, user identification, session identification, path supplementation, and other steps, the user historical access sequence set is obtained. An access sequence represents one visit by one user to the website and contains all the page nodes accessed during that visit, arranged in chronological order.
Specifically, data purification extracts the necessary fields from the web server log file, such as user ID, time, and page node ID. User identification fills in the user ID for users who are not logged in, using a correspondence table between cookies and user IDs. Session identification splits each user's access path into multiple sessions in units of a specified time, for example 60 minutes; that is, the path is split into multiple access sequences, with each session corresponding to one access sequence. Finally, because page nodes that a user visited may be missing from the log due to server data synchronization, path supplementation uses the website structure map to fill in the missing paths for subsequent analysis. After this preprocessing, the user historical access sequence set is obtained.
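The session identification step can be illustrated with a minimal Python sketch. The record layout `(user_id, timestamp_seconds, page_id)`, the function name `split_sessions`, and the rule that a new session starts once a hit falls 60 minutes or more after the first hit of the current session are all assumptions made for illustration; the text only states that the access path is split in units of a specified time such as 60 minutes.

```python
from collections import defaultdict


def split_sessions(records, unit_seconds=60 * 60):
    """Split cleaned log records into per-session access sequences.

    records: iterable of (user_id, timestamp_seconds, page_id) tuples
             (a hypothetical layout for the cleaned log fields).
    Returns a list of (user_id, [page_id, ...]) in chronological order.
    """
    hits_by_user = defaultdict(list)
    for user_id, ts, page_id in records:
        hits_by_user[user_id].append((ts, page_id))

    sequences = []
    for user_id, hits in hits_by_user.items():
        hits.sort(key=lambda hit: hit[0])  # chronological order
        session_start = None
        current = []
        for ts, page_id in hits:
            # Assumed rule: fixed window measured from the session's first hit.
            if session_start is None or ts - session_start >= unit_seconds:
                if current:
                    sequences.append((user_id, current))
                current = []
                session_start = ts
            current.append(page_id)
        if current:
            sequences.append((user_id, current))
    return sequences


if __name__ == "__main__":
    log = [("u1", 0, "home"), ("u1", 600, "list"),
           ("u1", 4000, "auth"), ("u2", 100, "home")]
    print(split_sessions(log))
```

An inactivity-gap rule (new session after 60 idle minutes) would be an equally plausible reading; only the comparison inside the loop would change.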
The optimized prefixspan algorithm is then used to perform frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set. The specific prefixspan algorithm is implemented as shown in the following table:
Here, each input sequence corresponds to an access sequence, and each frequent sequence pattern finally output is a frequent access pattern; all the frequent access patterns together constitute the frequent access pattern set. prefixspan is a commonly used data mining algorithm, and there are many other commonly used algorithms in data mining, such as the Apriori algorithm, which are not described further here.
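For orientation, the following Python sketch shows the basic textbook form of PrefixSpan on sequences of single page nodes. It is not the optimized variant referred to in this embodiment, whose details are not given here; the function name `prefixspan`, the `min_support` parameter, and the sample data are illustrative assumptions.

```python
from collections import defaultdict


def prefixspan(sequences, min_support):
    """Mine frequent sequential patterns from lists of page nodes.

    Returns a dict mapping each frequent pattern (a tuple of nodes)
    to the number of sequences that contain it as a subsequence.
    """
    results = {}

    def mine(prefix, projected):
        # projected holds (sequence index, offset) pairs; the suffix of each
        # sequence from that offset is its projection for the current prefix.
        counts = defaultdict(int)
        for seq_idx, start in projected:
            for item in set(sequences[seq_idx][start:]):
                counts[item] += 1
        for item, support in sorted(counts.items()):
            if support < min_support:
                continue
            pattern = prefix + (item,)
            results[pattern] = support
            new_projected = []
            for seq_idx, start in projected:
                seq = sequences[seq_idx]
                try:
                    pos = seq.index(item, start)  # first occurrence in suffix
                except ValueError:
                    continue
                new_projected.append((seq_idx, pos + 1))
            mine(pattern, new_projected)

    mine((), [(i, 0) for i in range(len(sequences))])
    return results


if __name__ == "__main__":
    history = [["a", "b", "c"], ["a", "c"], ["a", "b", "c", "d"], ["b", "c"]]
    for pattern, support in prefixspan(history, min_support=3).items():
        print(pattern, support)
```

Each recursive call only scans the projected suffixes, which is the property that distinguishes PrefixSpan from candidate-generation approaches such as Apriori.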
在线处理部分的核心思想是,将客户端收集的用户访问信息通过Web服务器端处理,得到用户的访问路径,对用户的访问路径进行会话识别,得到用户访问序列集;然后对访问序列集中的每一个访问序列,利用最长匹配选择法对访问序列集与频繁访问模式集合中的项进行匹配,获得相应的规则集;最后把规则集中所有规则的后项作为功能引导列表推荐给用户。The core idea of the online processing part is to process the user access information collected by the client through the web server, obtain the user's access path, perform session identification on the user's access path, and obtain the user access sequence set; then each of the access sequence sets An access sequence is used to match the access sequence set with the items in the frequent access mode set by using the longest matching selection method to obtain a corresponding rule set; finally, the latter item of all rules in the rule set is recommended as a function boot list to the user.
具体地,如图2所示,包括如下步骤:Specifically, as shown in FIG. 2, the following steps are included:
S1、根据用户访问信息,识别出用户访问序列集;S1, identifying a user access sequence set according to user access information;
用户在自己的电脑上登录“御膳房”阿里云公共计算平台,“御膳房”阿里云公共计算平台通过客户端浏览器收集用户访问信息,并发往阿里云公共计算平台的后端Web服务器,Web服务器对用户访问信息进行会话识别,以规定的时间为单位(例如60分钟),将用户的访问路径切分为多个会话,即切分为多个访问序列,得到用户访问序列集。与离线处理过程不同的是,这里得到的是用户当前的访问序列集。The user logs in the “Yu Yu Fang” Alibaba Cloud public computing platform on his computer. The “Yu Yu Fang” Alibaba Cloud public computing platform collects user access information through the client browser and sends it to the back-end web server of the Alibaba Cloud public computing platform. The web server performs session identification on the user access information, and divides the user's access path into multiple sessions in a predetermined time unit (for example, 60 minutes), that is, splits into multiple access sequences to obtain a user access sequence set. Different from the offline processing process, what is obtained here is the user's current access sequence set.
S2. Match each access sequence in the user access sequence set against the items in the frequent access pattern set to obtain a corresponding rule set.
The longest-match selection method is used to match each access sequence against the items in the frequent access pattern set, where each item in the set is one frequent access pattern. Under this method, a matched item contains most of the web page nodes in the user access sequence. For example, an item may be treated as a match if it contains more than 70% of the web page nodes in the user access sequence; alternatively, the top N frequent access patterns that contain the most web page nodes of the user access sequence may be taken as the matched frequent access patterns.
For example, suppose the user access sequence A1 contains 10 web page nodes, and the frequent access pattern set contains three frequent access patterns B1, B2, and B3 that are the three best matches for A1: B1 contains all 10 web page nodes of A1, B2 contains 9 of them, and B3 contains 8. If N is set to 3, then B1, B2, and B3 are the matched frequent access patterns, and together they form the rule set.
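Both variants of the longest-match selection method can be sketched in a few lines. The sketch below assumes that matching is judged only by how many distinct web page nodes of the user access sequence a pattern contains, ignoring node order; the source does not say whether order matters. The function names and the page identifiers `p1`..`p10` and `q1`..`q5` are illustrative assumptions.

```python
def overlap(sequence, pattern):
    """Number of distinct nodes of `sequence` that also occur in `pattern`."""
    return len(set(sequence) & set(pattern))


def match_by_threshold(sequence, patterns, ratio=0.7):
    """Keep patterns containing more than `ratio` of the sequence's nodes."""
    total = len(set(sequence))
    return [p for p in patterns if total and overlap(sequence, p) / total > ratio]


def match_top_n(sequence, patterns, n=3):
    """Keep the n patterns sharing the most nodes with the sequence."""
    ranked = sorted(patterns, key=lambda p: overlap(sequence, p), reverse=True)
    return [p for p in ranked[:n] if overlap(sequence, p) > 0]


# The A1 / B1, B2, B3 example from the text, plus a weak candidate B4.
a1 = [f"p{i}" for i in range(1, 11)]           # 10 web page nodes
b1 = a1 + [f"q{i}" for i in range(1, 6)]       # contains all 10 nodes of A1
b2 = a1[:9]                                    # contains 9 of them
b3 = a1[:8] + ["q1"]                           # contains 8 of them
b4 = a1[:3]                                    # contains only 3

rule_set = match_top_n(a1, [b4, b3, b1, b2], n=3)
print(rule_set == [b1, b2, b3])                                # True
print(match_by_threshold(a1, [b1, b2, b3, b4]) == [b1, b2, b3])  # True
```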
S3. Recommend the consequents of all rules in the rule set to the user as a function guidance list.
The consequent of a rule consists of the web page nodes in the rule that are not contained in the user's access sequence; these are nodes the user may visit next. For example, suppose B1 contains 15 web page nodes, 10 of which are the same as the nodes in the user access sequence; the remaining 5 are the web page nodes the user may go on to visit.
The consequents of all rules are thus recommended to the user as a function guidance list, which is displayed as a mask layer in the user's browser for the user to choose from. The user can directly select the web page node in the mask layer that he or she wants to visit, achieving direct access.
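Step S3 can be sketched as collecting, from every matched rule, the nodes the user has not yet visited. The sketch assumes each rule is a list of web page nodes and that duplicates across rules should appear only once, in first-seen order; the name `guidance_list` and the node identifiers are illustrative assumptions.

```python
def guidance_list(sequence, rule_set):
    """Collect the consequents of all rules: nodes that occur in a rule but
    not in the user's current access sequence, without duplicates."""
    visited = set(sequence)
    suggestions = []
    for rule in rule_set:
        for node in rule:
            if node not in visited and node not in suggestions:
                suggestions.append(node)
    return suggestions


# B1 has 15 nodes, 10 of which the user has already visited.
user_sequence = [f"p{i}" for i in range(1, 11)]
b1 = user_sequence + ["q1", "q2", "q3", "q4", "q5"]
b2 = user_sequence[:9] + ["q2", "q6"]

print(guidance_list(user_sequence, [b1, b2]))
# ['q1', 'q2', 'q3', 'q4', 'q5', 'q6']
```

The resulting list is what would be rendered in the mask layer; ordering the suggestions by pattern support would be a natural refinement that the source does not specify.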
For example, in the data engine area of the "Yushanfang" (御膳房) Alibaba Cloud public computing platform, the user's frequent patterns, mined from the user's historical operations, are stored in a database. When the user hovers the mouse over the data development module, the user's current access sequence (that is, the current operation path) is used as a parameter, and the set of next operations, such as "Go to Authorization Center", is found from the user's frequent access pattern set according to the longest-match method and passed to the user's operation page. The page then dynamically changes the links in the navigation tab so that options such as "Go to Authorization Center" are displayed there for the user to choose from; the navigation tab is displayed as a mask layer in the user's browser.
This embodiment also provides a device for implementing personalized guidance based on the above method. As shown in FIG. 3, the device includes:
a mining module, configured to perform frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set;
an identification module, configured to identify the user access sequence set according to the user access information;
a matching module, configured to match each access sequence in the user access sequence set against the items in the frequent access pattern set to obtain a corresponding rule set;
a guidance module, configured to recommend the consequents of all rules in the rule set to the user as a function guidance list.
In this embodiment, the mining module performs the following operations when performing frequent access pattern mining:
preprocessing the web server log files to obtain the user historical access sequence set;
performing frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set.
During preprocessing, the mining module performs data cleaning, user identification, session identification, and path completion on the web server log files; for frequent access pattern mining, it uses the PrefixSpan data mining algorithm.
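The first two preprocessing steps can be sketched as follows. The source does not describe how they are carried out, so the rules here are assumptions: data cleaning keeps only successful requests for non-static resources, and user identification groups requests by IP address and user agent. The record fields (`ip`, `agent`, `time`, `url`, `status`) and the function names are hypothetical; session identification and path completion are not covered by this sketch.

```python
from collections import defaultdict

# Assumed list of static-resource suffixes to discard during cleaning.
STATIC_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif", ".ico")


def clean(records):
    """Data cleaning: keep successful page requests, drop static resources."""
    return [
        r for r in records
        if r["status"] == 200 and not r["url"].lower().endswith(STATIC_SUFFIXES)
    ]


def identify_users(records):
    """User identification: group time-ordered requests by (ip, agent)."""
    users = defaultdict(list)
    for r in sorted(records, key=lambda r: r["time"]):
        users[(r["ip"], r["agent"])].append((r["time"], r["url"]))
    return dict(users)


# Hypothetical parsed log records (time given as a simple counter).
log = [
    {"ip": "192.0.2.1", "agent": "A", "time": 2, "url": "/data", "status": 200},
    {"ip": "192.0.2.1", "agent": "A", "time": 1, "url": "/home", "status": 200},
    {"ip": "192.0.2.1", "agent": "A", "time": 3, "url": "/logo.png", "status": 200},
    {"ip": "192.0.2.2", "agent": "B", "time": 1, "url": "/home", "status": 404},
    {"ip": "192.0.2.2", "agent": "B", "time": 4, "url": "/auth", "status": 200},
]
print(identify_users(clean(log)))
# {('192.0.2.1', 'A'): [(1, '/home'), (2, '/data')], ('192.0.2.2', 'B'): [(4, '/auth')]}
```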
In this embodiment, the matching module uses the longest-match selection method to perform matching.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Those skilled in the art may make various corresponding changes and modifications according to the present invention without departing from its spirit and essence, and all such changes and modifications shall fall within the protection scope of the claims appended to the present invention.
Claims (14)
- A method for implementing personalized guidance, characterized by comprising: identifying a user access sequence set according to user access information; matching each access sequence in the user access sequence set against the items in a frequent access pattern set to obtain a corresponding rule set; and recommending the consequents of all rules in the rule set to the user as a function guidance list; wherein the frequent access pattern set is obtained by performing frequent access pattern mining on a user historical access sequence set.
- The method according to claim 1, characterized in that performing frequent access pattern mining on the user historical access sequence set comprises: preprocessing web server log files to obtain the user historical access sequence set; and performing frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set.
- The method according to claim 2, characterized in that the preprocessing of the web server log files comprises data cleaning, user identification, session identification, and path completion steps.
- The method according to claim 2, characterized in that the frequent access pattern mining on the user historical access sequence set uses the PrefixSpan data mining algorithm.
- The method according to claim 1, characterized in that the matching of each access sequence in the user access sequence set against the items in the frequent access pattern set uses a longest-match selection method.
- The method according to claim 5, characterized in that, in the longest-match selection method, frequent access patterns that contain more than a first preset value of the web page nodes in the user access sequence are taken as the matched frequent access patterns.
- The method according to claim 5, characterized in that, in the longest-match selection method, the top N frequent access patterns that match the most web page nodes in the user access sequence are taken as the matched frequent access patterns.
- A device for implementing personalized guidance, characterized by comprising: a mining module, configured to perform frequent access pattern mining on a user historical access sequence set to obtain a frequent access pattern set; an identification module, configured to identify a user access sequence set according to user access information; a matching module, configured to match each access sequence in the user access sequence set against the items in the frequent access pattern set to obtain a corresponding rule set; and a guidance module, configured to recommend the consequents of all rules in the rule set to the user as a function guidance list.
- The device according to claim 8, characterized in that the mining module, when performing frequent access pattern mining, performs the following operations: preprocessing web server log files to obtain the user historical access sequence set; and performing frequent access pattern mining on the user historical access sequence set to obtain the frequent access pattern set.
- The device according to claim 9, characterized in that the mining module, when performing preprocessing, performs the following operation: performing data cleaning, user identification, session identification, and path completion on the web server log files.
- The device according to claim 9, characterized in that the mining module uses the PrefixSpan data mining algorithm when performing frequent access pattern mining.
- The device according to claim 8, characterized in that the matching module uses a longest-match selection method to perform matching.
- The device according to claim 12, characterized in that, in the longest-match selection method, frequent access patterns that contain more than a first preset value of the web page nodes in the user access sequence are taken as the matched frequent access patterns.
- The device according to claim 12, characterized in that, in the longest-match selection method, the top N frequent access patterns that match the most web page nodes in the user access sequence are taken as the matched frequent access patterns.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510404313.8A CN106326320A (en) | 2015-07-09 | 2015-07-09 | Method and device for realizing personal guidance |
CN201510404313.8 | 2015-07-09 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017005119A1 (en) | 2017-01-12 |
Family
ID=57684871
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2016/087465 WO2017005119A1 (en) | 2015-07-09 | 2016-06-28 | Method and device for implementing individualized guidance |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN106326320A (en) |
WO (1) | WO2017005119A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112231598A (en) * | 2020-08-31 | 2021-01-15 | 咪咕文化科技有限公司 | Webpage path navigation method and device, electronic equipment and storage medium |
Families Citing this family (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108334646A (en) * | 2018-04-11 | 2018-07-27 | 焦点科技股份有限公司 | A kind of link structure optimization method based on frequent browsing sequence |
CN109301860A (en) * | 2018-09-14 | 2019-02-01 | 淮南矿业(集团)有限责任公司 | Photovoltaic plant concentrates operation system and its hardware structure |
CN109491498A (en) * | 2018-10-31 | 2019-03-19 | 广州致远电子有限公司 | Man-machine interaction method, system, terminal device and computer readable storage medium |
CN110018871A (en) * | 2019-03-12 | 2019-07-16 | 中国平安财产保险股份有限公司 | The operation indicating method, apparatus and computer readable storage medium of system |
CN111723134A (en) * | 2019-03-19 | 2020-09-29 | 北京京东尚科信息技术有限公司 | Information processing method, information processing device, electronic equipment and storage medium |
CN112559107A (en) * | 2020-12-24 | 2021-03-26 | 平安普惠企业管理有限公司 | Application program guide method and device, computer equipment and storage medium |
CN112632384B (en) * | 2020-12-25 | 2024-07-05 | 北京百度网讯科技有限公司 | Data processing method and device for application program, electronic equipment and medium |
CN114065094A (en) * | 2021-11-24 | 2022-02-18 | 中国银联股份有限公司 | Webpage operation intelligent feedback method, system and device and readable storage medium |
CN114882974B (en) * | 2022-05-27 | 2023-04-18 | 江苏智慧智能软件科技有限公司 | Psychological diagnosis database access artificial intelligence verification system and method |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102929939A (en) * | 2012-09-28 | 2013-02-13 | 北京奇虎科技有限公司 | Personalized information supply method and device |
CN103092471A (en) * | 2013-01-04 | 2013-05-08 | 深圳市中兴移动通信有限公司 | Implement method and terminal for dynamic function menus |
US20140108200A1 (en) * | 2012-10-12 | 2014-04-17 | Alibaba Group Holding Limited | Method and system for recommending search phrases |
CN103885968A (en) * | 2012-12-20 | 2014-06-25 | 北京百度网讯科技有限公司 | Method and device for providing recommended information |
CN104199874A (en) * | 2014-08-20 | 2014-12-10 | 哈尔滨工程大学 | Webpage recommendation method based on user browsing behaviors |
- 2015-07-09: CN application CN201510404313.8A, published as CN106326320A (en), status: Pending
- 2016-06-28: WO application PCT/CN2016/087465, published as WO2017005119A1 (en), status: Application Filing
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112231598A (en) * | 2020-08-31 | 2021-01-15 | 咪咕文化科技有限公司 | Webpage path navigation method and device, electronic equipment and storage medium |
CN112231598B (en) * | 2020-08-31 | 2024-06-04 | 咪咕文化科技有限公司 | Webpage path navigation method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN106326320A (en) | 2017-01-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 16820770; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 16820770; Country of ref document: EP; Kind code of ref document: A1 |