CN103885935A - Book section abstract generating method based on book reading behaviors - Google Patents
- Publication number: CN103885935A
- Application number: CN201410090143.6
- Authority: CN (China)
- Prior art keywords: sentence, page, book, books, reading
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for generating book chapter summaries based on book reading behavior. The technique is essentially document summarization that incorporates user reading behavior, applied to engineering, science and education book resources. The method first uses a quantitative page-level reading-behavior scoring mechanism to compute the weight of each page in a book chapter. It then splits the chapter into sentences, computes the similarity between sentences from their pairwise distances, and propagates the existing sentence weights along the manifold structure of the data. Finally, following the idea of data reconstruction, it selects the sentences that best represent the chapter content as the chapter summary. By collecting user reading behavior, using it to evaluate the importance of book pages, and generating the summary through data-reconstruction-based document summarization, the invention helps users grasp chapter content quickly and improves reading efficiency.
Description
Technical Field
The invention relates to methods for generating document summaries, and in particular to a method for generating book chapter summaries based on book reading behavior.
Background
As digital libraries develop, users want to learn the content of a book chapter quickly and accurately before reading, and there is strong demand for digital libraries to provide a chapter summary service.
Generating book chapter summaries is essentially document summarization based on reading behavior: user reading behavior is modeled, and the reading factors derived from the behavior model are incorporated into the summarization algorithm, yielding summaries shaped by how users read. If a traditional document summarization method is applied directly, the chapter summary may fail to convey the chapter content from the reader's perspective and therefore fail to meet user needs.
In traditional reading, the object of reading is a fixed set of linguistic symbols. From the start of reading to its end, the reader acquires knowledge solely from the textual content and remains an isolated individual detached from society. With the emergence of network-based social reading, part or all of the process, from choosing what to read to finishing it, is connected to a social network. In such a network of interrelated people, readers' reading behavior becomes an object of attention and study.
Social reading is a new mode of reading centered on content, linked by social relationships, and focused on sharing, communication and interaction. While reading, users can interact with others who share their interests. After reading, they can connect with the wider public who read the same content, and communities may even form around converging topics. Sharing, communication and interaction run through the whole process of social reading, and these exchanges produce a large amount of new, valuable content such as comments, summaries, notes, and associated or cross-referenced information.
The base algorithm used to generate book chapter summaries is the data-reconstruction-based document summarization algorithm (DSDR). DSDR is an extractive method built on the premise that a good summary should allow the original document to be reconstructed from it as fully as possible, that is, the summary should cover as much as possible of the content expressed by the whole document.
On top of DSDR, the various behaviors of users during social reading are taken into account, for example marking important sentences while reading. Marked sentences are generally considered more representative and are given higher influence weights than unmarked sentences.
Summary of the Invention
The purpose of the invention is to provide chapter summaries that let users quickly understand the content of book chapters. To this end, it presents a method for generating book chapter summaries based on book reading behavior.
The technical solution adopted by the invention is as follows.
The method for generating book chapter summaries based on book reading behavior comprises the following steps:
1) Construct a quantitative reading-behavior scoring mechanism for book pages: user reading behavior is divided by reading depth, from shallow to deep, into four levels (browsing, bookmarking, shallow reading and deep reading), and a page scoring mechanism based on user reading behavior is derived from these four levels;
2) Propagate sentence weights: the scoring mechanism of step 1) yields a quantitative score for each page; the chapter is split into sentences, the page scores give each sentence its initial weight, and, based on the distances between sentences, a ranking algorithm on the data manifold structure propagates the sentence weights;
3) Generate the chapter summary: after propagation, the sentence weights are incorporated into the data-reconstruction-based document summarization algorithm, and important sentences are selected from the chapter as the chapter summary.
Step 1) is as follows:
2.1 The behavior of a user reading a page is divided into four levels: browsing, bookmarking, shallow reading and deep reading. Different levels contribute different scores to the page;
2.2 The retention rate, the churn rate and an exponential decay of the score are used to measure how hard it is for reading to reach a given level, and the score is assigned accordingly. For a given book page, the user retention rate is the ratio of the number of users who proceed to bookmarking, shallow reading or deep reading to the number of users who browsed the page. The user churn rate is the proportion of users lost at the current step relative to the number of users retained at the previous step.
The scoring formula based on user reading behavior is:
$$V_i = \frac{p_i + q_i}{p_i}\,\exp(1 - p_i), \quad i = 1, 2, 3, 4$$
The book page user retention rate is:
$$p_i = U_i / U_1, \quad i = 1, 2, 3, 4$$
The book page user churn rate, following the definition above, is:
$$q_i = \frac{U_{i-1} - U_i}{U_{i-1}}$$
where $V_i$ is the score contribution of step $i$ of the whole user population's reading behavior to a given page; $p_i$ is the retention rate of step $i$ relative to browsing; $q_i$ is the churn rate of step $i$ relative to step $i-1$; and $U_i$ is the number of users who reached step $i$;
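The level scores can be sketched in a few lines of Python. This is a minimal illustration, not the patent's implementation: the function name `level_scores` and the sample user counts are hypothetical, and because the churn rate has no previous step at the browsing level, the sketch assumes $q_1 = 0$.

```python
import math

def level_scores(users_per_level):
    """Return (p, q, V) for the four reading levels of one page.

    users_per_level: [U1, U2, U3, U4], the number of users who reached
    browsing, bookmarking, shallow reading and deep reading (all > 0).
    """
    u1 = users_per_level[0]
    p, q, v = [], [], []
    for i, u in enumerate(users_per_level):
        p_i = u / u1                      # retention relative to browsing
        if i == 0:
            q_i = 0.0                     # assumption: no churn before the first step
        else:
            prev = users_per_level[i - 1]
            q_i = (prev - u) / prev       # users lost relative to the previous step
        p.append(p_i)
        q.append(q_i)
        v.append((p_i + q_i) / p_i * math.exp(1.0 - p_i))
    return p, q, v

# Hypothetical page: 100 users browsed, 40 bookmarked, 20 read shallowly, 10 read deeply.
p, q, v = level_scores([100, 40, 20, 10])
```

With these counts the score contribution grows from browsing to deep reading, which matches the intent that levels fewer users reach are worth more.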
2.3 Book pages are visited at different times, and the earlier a user visits and annotates a page, the greater that user's contribution to the page. The scoring mechanism based on the key behavior nodes of a page computes the importance of the page as a composite score defined over the following quantities:
$s_j$ is the score of page $j$ of the book; $W_{uj}$ is the contribution weight of user $u$ to page $j$; $T_j$ is the sum of the visit times of page $j$; $t_{uj}$ is the time of user $u$'s first visit to page $j$; $t_j$ is the time at which page $j$ was first visited; $S_{uj}$ is the sum of the scores of the key behavior steps that user $u$ reached on page $j$; $V_{ij}$ is the score of the $i$-th key behavior step that user $u$ reached on page $j$; $L$ is the depth that user $u$ reached when reading page $j$, i.e. the number of key steps;
2.4 The scoring mechanism above gives a quantitative score for the importance of every page in the book. Because reading populations differ, and to prevent a page from scoring high merely because few users visited it, the number of visiting users and the score are normalized in the actual page evaluation to obtain the final composite page score, defined over the following quantities:
$u_j$ is the number of users who browsed page $j$, $s_j$ is the score of page $j$, and $PageScore_j$ is the final score of the page. Because both quantities are compared with their averages, the composite score is high only when both the number of browsing users and the readers' score of the page are high. In line with the characteristics of user behavior in book reading, this establishes a page-importance evaluation system based on user reading behavior: the four levels of page reading quantify user behavior, the contribution values of the four levels define how hard it is for a user to go from browsing to deep reading, and the reading behavior of the user population on a page quantifies the importance of that page.
Step 2) is as follows:
3.1 Step 1) gives the score $PageScore_j$ of page $j$, which reflects the importance of page $j$ in the book. The relative importance of a marked sentence within its page must also be considered, so the initial weight of a sentence is derived from the page score.
Here $w_i$ denotes the current weight of sentence $v_i$. Let the given set of document sentences be $V = \{v_1, \ldots, v_n\}$, where $v_i$ is the $i$-th sentence in $V$. Sentences that users have underlined are placed at the front of the set. Assuming the first $k$ sentences are underlined, the weights of the remaining sentences are obtained from their relationship to the first $k$ sentences;
3.2 Let $dis: V \times V \to \mathbb{R}$ be a distance metric on $V$, which gives the distance $dis(v_i, v_j)$ between every pair of sentences $v_i$ and $v_j$. Let the mapping $f: V \to \mathbb{R}$ be the ranking function that assigns a weight $f_i$ to each sentence $v_i$, with vectors $f = [f_1, \ldots, f_n]^T$ and $w = [w_1, \ldots, w_n]^T$, where $w_i \neq 0$ if sentence $v_i$ is underlined and $w_i = 0$ otherwise; $w_i$ is the initial weight of each sentence;
3.3 The weight propagation algorithm on the data manifold structure is:
Step 1: Compute the pairwise distances $dis(v_i, v_j)$ between sentence vectors and sort them in ascending order. Following the sorted list, connect an edge between the nodes of each pair of sentence vectors until a connected graph is obtained;
Step 2: Define the affinity matrix $W$: $W_{ij} = \exp[-dis^2(v_i, v_j)/(2\sigma^2)]$ if there is an edge between the points of $v_i$ and $v_j$, $W_{ij} = 0$ if there is no edge, and $W_{ii} = 0$;
Step 3: Symmetrically normalize $W$ to obtain $S = D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal matrix with entries $D_{ii} = \sum_j W_{ij}$;
Step 4: Iterate $f(t+1) = \alpha S f(t) + (1-\alpha) w$ until convergence, where $\alpha$ is a parameter in $[0, 1)$;
Step 5: Let $f_i^*$ denote the limit of the sequence $\{f_i(t)\}$; the limiting sentence weights are $f^* = [f_1^*, \ldots, f_n^*]^T$;
3.4 In Step 4, the parameter $\alpha$ sets the relative contribution of the neighbors' weights and of the node's own initial weight. Because the matrix $S$ is symmetric, the propagation of weights is symmetric. The limit of the sequence $\{f(t)\}$ can be computed in closed form as $f^* = (I - \alpha S)^{-1} w$, up to the constant factor $1 - \alpha$, which does not change the ranking. After propagation, every sentence in the chapter has a reasonable weight.
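The propagation steps can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: sentences are already represented as numeric vectors, `dis` is taken to be the Euclidean distance (the text does not fix the metric), and the function name `manifold_rank`, the default parameters and the sample data are hypothetical.

```python
import math

def manifold_rank(vectors, w, sigma=1.0, alpha=0.9, tol=1e-9, max_iter=10000):
    """Propagate initial weights w over a graph built from sentence vectors."""
    n = len(vectors)

    def dist(a, b):
        # Assumption: Euclidean distance between sentence vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Step 1: sort all pairs by distance and add edges until the graph is connected.
    pairs = sorted((dist(vectors[i], vectors[j]), i, j)
                   for i in range(n) for j in range(i + 1, n))
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    W = [[0.0] * n for _ in range(n)]
    components = n
    for d, i, j in pairs:
        if components == 1:
            break
        # Step 2: Gaussian affinity on every edge; the diagonal stays 0.
        W[i][j] = W[j][i] = math.exp(-d * d / (2.0 * sigma ** 2))
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            components -= 1

    # Step 3: symmetric normalisation S = D^{-1/2} W D^{-1/2}.
    deg = [sum(row) for row in W]
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) if W[i][j] else 0.0
          for j in range(n)] for i in range(n)]

    # Step 4: iterate f(t+1) = alpha * S f(t) + (1 - alpha) * w until the change is small.
    f = list(w)
    for _ in range(max_iter):
        nxt = [alpha * sum(S[i][j] * f[j] for j in range(n)) + (1 - alpha) * w[i]
               for i in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, f)) < tol:
            return nxt
        f = nxt
    return f

# Hypothetical example: only the first sentence was underlined.
vectors = [[0, 0], [0, 1], [5, 5], [5, 6]]
f_star = manifold_rank(vectors, [1.0, 0.0, 0.0, 0.0], sigma=3.0)
```

In this example the underlined sentence keeps the largest weight, its close neighbor receives most of the propagated weight, and the distant pair receives the least.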
Step 3) is as follows:
4.1 The weight $f_i^*$ obtained for each chapter sentence $v_i$ reflects the importance of $v_i$ in the chapter. The $n$ weights are placed on the diagonal of a matrix $F$, i.e. $F_{ii} = f_i^*$, and the resulting diagonal matrix $F$ is incorporated into the data-reconstruction-based document summarization algorithm;
4.2 During summary generation, the objective function of the linear non-negative data reconstruction algorithm is redefined as follows, where $V$ denotes the matrix whose rows are the sentence vectors $v_i$ and $A = [a_{ij}]$:
$$\min_{A,\beta} J = \mathrm{Tr}\left[F(V - AV)(V - AV)^T + \mathrm{diag}(\beta)^{-1} A^T A\right] + \gamma \|\beta\|_1$$
$$\text{s.t. } \beta_j \geq 0,\ a_{ij} \geq 0,\ a_i \in \mathbb{R}^n$$
In this objective, the weight $f_i^*$ of chapter sentence $v_i$ enters the selection of each sentence. The constraint $a_{ij} \geq 0$ means that the method allows only additive combinations of sentences in the sentence space, not subtractive ones. $\beta = [\beta_1, \beta_2, \ldots, \beta_n]^T$ is an auxiliary variable: if $\beta_j = 0$, then all of $a_{1j}, \ldots, a_{nj}$ are 0, which means the $j$-th candidate sentence is not selected. $\gamma$ is the regularization parameter;
4.3 The objective function of the data-reconstruction-based document summarization algorithm is a convex optimization problem, so a global optimum is guaranteed. Fixing $a_i$ and setting the derivative of $J$ with respect to $\beta$ to zero gives the minimizer:
$$\beta_j = \sqrt{\frac{\sum_{i=1}^{n} a_{ij}^2}{\gamma}}$$
Once the minimizer $\beta$ is obtained, the minimization problem under the non-negativity constraints can be solved with the Lagrangian method;
4.4 Let $\alpha_{ij}$ be the Lagrange multiplier for the constraint $a_{ij} \geq 0$, and let $A = [a_{ij}]$. The Lagrangian $L$ is:
$$L = J + \mathrm{Tr}[\alpha A^T] = \mathrm{Tr}\left[F(V - AV)(V - AV)^T + \mathrm{diag}(\beta)^{-1} A^T A\right] + \gamma \|\beta\|_1 + \mathrm{Tr}[\alpha A^T], \quad \alpha = [\alpha_{ij}]$$
$F$ is the diagonal matrix from step 4.1, with diagonal entries $f_1^*, \ldots, f_n^*$; $\mathrm{diag}(\beta)$ is also a diagonal matrix, with diagonal entries $\beta_1, \ldots, \beta_n$;
4.5 Differentiating the Lagrangian $L$ with respect to $A$ gives:
$$\frac{\partial L}{\partial A} = -2FVV^T + 2FAVV^T + 2A\,\mathrm{diag}(\beta)^{-1} + \alpha$$
Setting this derivative to zero gives the following expression for $\alpha$:
$$\alpha = 2FVV^T - 2FAVV^T - 2A\,\mathrm{diag}(\beta)^{-1}$$
By the Karush-Kuhn-Tucker condition $\alpha_{ij} a_{ij} = 0$, multiplying each term of the expression above by $a_{ij}$ gives:
$$(FVV^T)_{ij}\, a_{ij} - (FAVV^T)_{ij}\, a_{ij} - (A\,\mathrm{diag}(\beta)^{-1})_{ij}\, a_{ij} = 0$$
which leads to the update rule:
$$a_{ij} \leftarrow \frac{a_{ij}\,(FVV^T)_{ij}}{\left[FAVV^T + A\,\mathrm{diag}(\beta)^{-1}\right]_{ij}}$$
The update rule is iterated until convergence, which finally yields the summary sentences of the book chapter.
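The alternating updates can be sketched in pure Python. This is a minimal illustration under stated assumptions, not the patent's implementation: sentence vectors are non-negative (for example term counts), a fixed iteration count stands in for a convergence test, `eps` is a numerical safeguard against division by zero, and picking the sentences with the largest $\beta_j$ is an assumed selection rule that the text does not spell out. The names `weighted_dsdr` and `select_summary` and the sample data are hypothetical.

```python
import math

def weighted_dsdr(V, f, gamma=0.1, iters=200, eps=1e-12):
    """Alternate the beta and A updates for a fixed number of iterations.

    V: n sentence vectors (rows) with non-negative entries.
    f: n propagated sentence weights (the diagonal of F).
    """
    n = len(V)
    # K = V V^T, the Gram matrix of the sentence vectors.
    K = [[sum(x * y for x, y in zip(V[i], V[j])) for j in range(n)]
         for i in range(n)]
    A = [[1.0] * n for _ in range(n)]  # assumption: positive starting point

    def solve_beta(A):
        # beta_j = sqrt(sum_i a_ij^2 / gamma)
        return [math.sqrt(sum(A[i][j] ** 2 for i in range(n)) / gamma)
                for j in range(n)]

    beta = solve_beta(A)
    for _ in range(iters):
        AK = [[sum(A[i][k] * K[k][j] for k in range(n)) for j in range(n)]
              for i in range(n)]
        # a_ij <- a_ij * (F V V^T)_ij / (F A V V^T + A diag(beta)^-1)_ij
        A = [[A[i][j] * f[i] * K[i][j]
              / (f[i] * AK[i][j] + A[i][j] / (beta[j] + eps) + eps)
              for j in range(n)] for i in range(n)]
        beta = solve_beta(A)
    return A, beta

def select_summary(beta, m):
    """Indices of the m sentences with the largest beta (assumed selection rule)."""
    return sorted(range(len(beta)), key=lambda j: beta[j], reverse=True)[:m]

# Hypothetical chapter: four sentences over a three-term vocabulary.
V = [[1, 0, 1], [1, 0, 0], [0, 1, 1], [0, 1, 0]]
f_star = [0.4, 0.3, 0.2, 0.1]
A, beta = weighted_dsdr(V, f_star)
summary = select_summary(beta, 2)
```

Because the update is multiplicative and every factor is non-negative, the entries of `A` stay non-negative without any explicit projection step.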
Compared with the prior art, the method of the invention has the following beneficial effects:
1. It combines user reading behavior modeling with document summarization, applying the data-reconstruction-based document summarization algorithm to chapter summary generation to obtain summary information for book chapters;
2. It analyzes and models user reading behavior. The modeling is based on reading depth and divides reading behavior into levels, yielding a composite scoring system for book pages in which the score expresses the importance of a page;
3. Working at the level of chapter sentences, it propagates the existing sentence weights over the data manifold to obtain a reasonable weight for every sentence, so that user behavior is reflected more accurately.
Brief Description of the Drawings
Figure 1 is the system architecture diagram of the method for generating book chapter summaries based on book reading behavior;
Figure 2 is a step diagram of the sentence weight propagation method of the invention;
Figure 3 shows the table of contents of the book in an embodiment of the invention;
Figure 4 is a schematic diagram of the first chapter in an embodiment of the invention;
Figure 5 shows the chapter summary generated in an embodiment of the invention.
Detailed Description
As shown in Figures 1 and 2, the method for generating book chapter summaries based on book reading behavior comprises the following steps:
1) Construct a quantitative reading-behavior scoring mechanism for book pages: user reading behavior is divided by reading depth, from shallow to deep, into four levels (browsing, bookmarking, shallow reading and deep reading), and a page scoring mechanism based on user reading behavior is derived from these four levels;
2) Propagate sentence weights: the scoring mechanism of step 1) yields a quantitative score for each page; the chapter is split into sentences, the page scores give each sentence its initial weight, and, based on the distances between sentences, a ranking algorithm on the data manifold structure propagates the sentence weights;
3) Generate the chapter summary: after propagation, the sentence weights are incorporated into the data-reconstruction-based document summarization algorithm, and important sentences are selected from the chapter as the chapter summary.
Step 1) is as follows:
2.1 The behavior of a user reading a page is divided into four levels: browsing, bookmarking, shallow reading and deep reading. Different levels contribute different scores to the page;
2.2 The retention rate, the churn rate and an exponential decay of the score are used to measure how hard it is for reading to reach a given level, and the score is assigned accordingly. The score decays exponentially with the retention rate, and the score at a given step depends both on the churn rate relative to the previous step and on the retention rate relative to the initial stage. The two rates are defined first. For a given book page, the user retention rate is the ratio of the number of users who proceed to bookmarking, shallow reading or deep reading to the number of users who browsed the page. The user churn rate is the proportion of users lost at the current step relative to the number of users retained at the previous step.
The scoring formula based on user reading behavior is:
$$V_i = \frac{p_i + q_i}{p_i}\,\exp(1 - p_i), \quad i = 1, 2, 3, 4$$
The book page user retention rate is:
$$p_i = U_i / U_1, \quad i = 1, 2, 3, 4$$
The book page user churn rate, following the definition above, is:
$$q_i = \frac{U_{i-1} - U_i}{U_{i-1}}$$
where $V_i$ is the score contribution of step $i$ of the whole user population's reading behavior to a given page; $p_i$ is the retention rate of step $i$ relative to browsing; $q_i$ is the churn rate of step $i$ relative to step $i-1$; and $U_i$ is the number of users who reached step $i$;
2.3 Book pages are visited at different times, and the earlier a user visits and annotates a page, the greater that user's contribution to the page. If the very first visiting user reads a page in depth, the importance of that page is relatively higher. The scoring mechanism based on the key behavior nodes of a page computes the importance of the page as a composite score defined over the following quantities:
$s_j$ is the score of page $j$ of the book; $W_{uj}$ is the contribution weight of user $u$ to page $j$; $T_j$ is the sum of the visit times of page $j$; $t_{uj}$ is the time of user $u$'s first visit to page $j$; $t_j$ is the time at which page $j$ was first visited; $S_{uj}$ is the sum of the scores of the key behavior steps that user $u$ reached on page $j$; $V_{ij}$ is the score of the $i$-th key behavior step that user $u$ reached on page $j$; $L$ is the depth that user $u$ reached when reading page $j$, i.e. the number of key steps;
2.4 The scoring mechanism above gives a quantitative score for the importance of every page in the book. Because reading populations differ, and to prevent a page from scoring high merely because few users visited it, the number of visiting users and the score are normalized in the actual page evaluation to obtain the final composite page score, defined over the following quantities:
$u_j$ is the number of users who browsed page $j$, $s_j$ is the score of page $j$, and $PageScore_j$ is the final score of the page. Because both quantities are compared with their averages, the composite score is high only when both the number of browsing users and the readers' score of the page are high. In line with the characteristics of user behavior in book reading, this establishes a page-importance evaluation system based on user reading behavior: the four levels of page reading quantify user behavior, the contribution values of the four levels define how hard it is for a user to go from browsing to deep reading, and the reading behavior of the user population on a page quantifies the importance of that page.
Step 2) is as follows:
3.1 Step 1) gives the score $PageScore_j$ of page $j$, which reflects the importance of page $j$ in the book. The relative importance of a marked sentence within its page must also be considered, so the initial weight of a sentence is derived from the page score.
Here $w_i$ denotes the current weight of sentence $v_i$. Let the given set of document sentences be $V = \{v_1, \ldots, v_n\}$, where $v_i$ is the $i$-th sentence in $V$. Sentences that users have underlined are placed at the front of the set. Assuming the first $k$ sentences are underlined, the weights of the remaining sentences are obtained from their relationship to the first $k$ sentences;
3.2 Let $dis: V \times V \to \mathbb{R}$ be a distance metric on $V$, which gives the distance $dis(v_i, v_j)$ between every pair of sentences $v_i$ and $v_j$. Let the mapping $f: V \to \mathbb{R}$ be the ranking function that assigns a weight $f_i$ to each sentence $v_i$, with vectors $f = [f_1, \ldots, f_n]^T$ and $w = [w_1, \ldots, w_n]^T$, where $w_i \neq 0$ if sentence $v_i$ is underlined and $w_i = 0$ otherwise; $w_i$ is the initial weight of each sentence;
3.3 The weight propagation algorithm on the data manifold structure is:
Step 1: Compute the pairwise distances $dis(v_i, v_j)$ between sentence vectors and sort them in ascending order. Following the sorted list, connect an edge between the nodes of each pair of sentence vectors until a connected graph is obtained;
Step 2: Define the affinity matrix $W$: $W_{ij} = \exp[-dis^2(v_i, v_j)/(2\sigma^2)]$ if there is an edge between the points of $v_i$ and $v_j$, $W_{ij} = 0$ if there is no edge, and $W_{ii} = 0$;
Step 3: Symmetrically normalize $W$ to obtain $S = D^{-1/2} W D^{-1/2}$, where $D$ is the diagonal matrix with entries $D_{ii} = \sum_j W_{ij}$;
Step 4: Iterate $f(t+1) = \alpha S f(t) + (1-\alpha) w$ until convergence, where $\alpha$ is a parameter in $[0, 1)$;
Step 5: Let $f_i^*$ denote the limit of the sequence $\{f_i(t)\}$; the limiting sentence weights are $f^* = [f_1^*, \ldots, f_n^*]^T$;
3.4 In Step 4, the parameter $\alpha$ sets the relative contribution of the neighbors' weights and of the node's own initial weight. Because the matrix $S$ is symmetric, the propagation of weights is symmetric. The limit of the sequence $\{f(t)\}$ can be computed in closed form as $f^* = (I - \alpha S)^{-1} w$, up to the constant factor $1 - \alpha$, which does not change the ranking. After propagation, every sentence in the chapter has a reasonable weight.
所述步骤3)为:Described step 3) is:
4.1得到图书章节句子vi的权重值fi *,权重值fi *反映了句子vi在图书章节中的重要性,将n个权重值fi *作为矩阵F的对角元素,对n个权重值进行对角矩阵化,即Fii=fi *,得到对角矩阵F,将对角矩阵F加入基于数据重构的文档摘要生成算法;4.1 Get the weight value f i * of the sentence v i of the book chapter, the weight value f i * reflects the importance of the sentence v i in the book chapter, and take n weight values f i * as the diagonal elements of the matrix F, for n Weight values are diagonally matrixed, that is, F ii =f i * , to obtain a diagonal matrix F, and the diagonal matrix F is added to the document summary generation algorithm based on data reconstruction;
4.2 In the document summarization process, redefine the objective function of the linear nonnegative data reconstruction algorithm as follows:
$$\min_{a_i,\,\beta}\; J = \sum_{i=1}^{n} f_i^* \,\| v_i - V^T a_i \|^2 + \sum_{i=1}^{n}\sum_{j=1}^{n} \frac{a_{ij}^2}{\beta_j} + \gamma \|\beta\|_1$$
s.t. $\beta_j \ge 0$, $a_{ij} \ge 0$, and $a_i \in R^n$
In the formula above, $V = [v_1, \ldots, v_n]^T$ is the matrix of sentence vectors, and the weight value $f_i^*$ of sentence $v_i$ enters the selection of each sentence. The constraint $a_{ij} \ge 0$ means that the method allows only additive, not subtractive, combinations of the sentences in the candidate set.
$\beta = [\beta_1, \beta_2, \ldots, \beta_n]^T$ is an auxiliary variable: if $\beta_j = 0$, then all of $a_{1j}, \ldots, a_{nj}$ are 0, which means the $j$-th candidate sentence is not selected. $\gamma$ is the regularization parameter;
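To make the role of each term concrete, the sketch below evaluates the objective for given `A`, `beta` and weights. The form of `J` is read off the Lagrangian in step 4.4 (weighted reconstruction error, plus the sum of `a_ij**2 / beta_j`, plus `gamma` times the L1 norm of `beta`); the function name and argument layout are hypothetical.

```python
def objective(V, A, beta, f_star, gamma):
    """J = sum_i f_i*||v_i - V^T a_i||^2 + sum_ij a_ij^2/beta_j + gamma*||beta||_1.

    V: n sentence vectors (rows), A: n x n nonnegative coefficients,
    beta: n auxiliary values, f_star: n propagated sentence weights.
    """
    n, d = len(V), len(V[0])
    recon = 0.0
    for i in range(n):
        # Approximate sentence i as a nonnegative combination of all sentences.
        approx = [sum(A[i][j] * V[j][k] for j in range(n)) for k in range(d)]
        recon += f_star[i] * sum((V[i][k] - approx[k]) ** 2 for k in range(d))
    # Zero coefficients are skipped so that a_ij = 0 with beta_j = 0 contributes nothing.
    penalty = sum(A[i][j] ** 2 / beta[j]
                  for i in range(n) for j in range(n) if A[i][j] != 0.0)
    return recon + penalty + gamma * sum(abs(b) for b in beta)

# Usage: two orthogonal sentences, each reconstructed from itself.
V = [[1.0, 0.0], [0.0, 1.0]]
A = [[1.0, 0.0], [0.0, 1.0]]
J = objective(V, A, beta=[1.0, 1.0], f_star=[2.0, 3.0], gamma=0.5)
```

A sentence with a larger `f_star` value makes its own reconstruction error more expensive, so the optimizer is pushed toward candidate sentences that represent highly weighted sentences well.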
4.3 The objective function of the data-reconstruction-based document summarization algorithm is a convex optimization problem, so a globally optimal solution is guaranteed. Fixing $a_i$ and setting the derivative of $J$ with respect to $\beta$ to 0 gives the minimizing $\beta$:
$$\beta_j = \sqrt{\frac{\sum_{i=1}^{n} a_{ij}^2}{\gamma}}$$
Once the minimizing $\beta$ has been obtained, the minimization problem under the nonnegativity constraints can be solved with the Lagrangian method;
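The minimizing $\beta$ follows from a one-line calculation. The sketch below assumes the objective form implied by the Lagrangian in step 4.4, in which $\beta_j$ appears only in the penalty $\sum_i a_{ij}^2/\beta_j$ and in $\gamma\|\beta\|_1$, with $\|\beta\|_1 = \sum_j \beta_j$ because $\beta_j \ge 0$.

```latex
\frac{\partial J}{\partial \beta_j}
  = -\frac{\sum_{i=1}^{n} a_{ij}^{2}}{\beta_j^{2}} + \gamma = 0
\quad\Longrightarrow\quad
\beta_j = \sqrt{\frac{\sum_{i=1}^{n} a_{ij}^{2}}{\gamma}}
```

The sentence weights $f_i^*$ do not appear here because they multiply only the reconstruction term, which does not depend on $\beta$.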
4.4 Let $\alpha_{ij}$ be the Lagrange multiplier for the constraint $a_{ij} \ge 0$, and let $A = [a_{ij}]$. The Lagrangian $L$ is then:
$$L = J + Tr[\alpha A^T] = Tr[F(V - AV)(V - AV)^T + diag(\beta)^{-1} A^T A] + \gamma\|\beta\|_1 + Tr[\alpha A^T], \quad \alpha = [\alpha_{ij}]$$
$F$ is the diagonal matrix from step 4.1, whose diagonal entries are $f_1^*, \ldots, f_n^*$; $diag(\beta)$ is also a diagonal matrix, whose diagonal entries are $\beta_1, \ldots, \beta_n$;
4.5 Differentiating the Lagrangian $L$ with respect to $A$ gives:
$$\frac{\partial L}{\partial A} = -2FVV^T + 2FAVV^T + 2A\,diag(\beta)^{-1} + \alpha$$
Setting $\partial L / \partial A$ to 0 gives the following expression for $\alpha$:
$$\alpha = 2FVV^T - 2FAVV^T - 2A\,diag(\beta)^{-1}$$
By the Karush-Kuhn-Tucker condition $\alpha_{ij} a_{ij} = 0$, multiplying each term of the expression above by $a_{ij}$ gives:
$$(FVV^T)_{ij}\, a_{ij} - (FAVV^T)_{ij}\, a_{ij} - (A\,diag(\beta)^{-1})_{ij}\, a_{ij} = 0$$
This yields the following update rule:
$$a_{ij} \leftarrow \frac{a_{ij}\,(FVV^T)_{ij}}{[FAVV^T + A\,diag(\beta)^{-1}]_{ij}}$$
The update rule is applied iteratively until convergence, which yields the summary sentences of the book chapter.
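The alternating procedure of steps 4.3 to 4.5 can be sketched as below. This is an illustration under stated assumptions rather than the patent's code: the name `summarize`, the uniform initialization of `A`, the fixed iteration count, the small `eps` guard against division by zero, and the choice of the `k` sentences with the largest `beta` values as the summary are all hypothetical. Sentence vectors are assumed nonnegative (for example term counts), so the multiplicative update keeps `A` nonnegative.

```python
import math

def summarize(V, f_star, gamma=0.1, k=1, iters=200, eps=1e-12):
    """Alternate the beta solution and the multiplicative update of A, then pick k sentences."""
    n = len(V)
    # G = V V^T (Gram matrix of sentence vectors); FG = F V V^T scales row i by f_i*.
    G = [[sum(a * b for a, b in zip(V[i], V[j])) for j in range(n)] for i in range(n)]
    FG = [[f_star[i] * G[i][j] for j in range(n)] for i in range(n)]
    A = [[1.0 / n] * n for _ in range(n)]
    beta = [1.0] * n
    for _ in range(iters):
        # Step 4.3: beta_j = sqrt(sum_i a_ij^2 / gamma)
        beta = [math.sqrt(sum(A[i][j] ** 2 for i in range(n)) / gamma) for j in range(n)]
        # Step 4.5: a_ij <- a_ij * (F V V^T)_ij / [F A V V^T + A diag(beta)^-1]_ij
        AG = [[sum(A[i][m] * G[m][j] for m in range(n)) for j in range(n)] for i in range(n)]
        for i in range(n):
            for j in range(n):
                denom = f_star[i] * AG[i][j] + A[i][j] / (beta[j] + eps)
                A[i][j] *= FG[i][j] / (denom + eps)
    # A candidate with beta_j = 0 is never used; rank candidates by beta, largest first.
    ranked = sorted(range(n), key=lambda j: beta[j], reverse=True)
    return sorted(ranked[:k]), beta

# Usage: three identical sentences and one empty sentence vector.
V = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0], [0.0, 0.0]]
selected, beta = summarize(V, f_star=[1.0, 1.0, 1.0, 1.0], k=3)
```

Returning the indices in document order keeps the extracted summary readable; a production version would more likely stop on a change threshold than on a fixed iteration count.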
Example
Figures 3 to 5 show an application example of the book chapter summary generation method. The specific steps of this example are described in detail below, following the method of the invention:
(1) The system has already preprocessed all book chapters and obtained their document content. Suppose a user is reading Section 1, "Definition", of Chapter 1, "Introduction to Distributed Computing", of the book "Principles and Applications of Distributed Computing" and wants to see the summary of this section. The user clicks the "Contents" button and double-clicks the corresponding section; the system first retrieves the section's text and the users' reading-behavior data.
(2) Based on the reading-behavior data, the system analyzes the type and level of reading in this section and computes the quantified importance score of each page with the comprehensive page-scoring formula.
(3) The text of the section is split into sentences; combining the users' underlining behavior with the quantified page scores gives the initial weight values of the underlined sentences.
(4) Each sentence is word-segmented and stripped of stop words, then represented as a vector in a high-dimensional space; the similarity between every pair of sentences is obtained from the distance between their vectors.
(5) The initial sentence weights are propagated with the ranking method on the data manifold, which yields a reasonable weight value for every sentence.
(6) The sentence weight matrix F is added to the data-reconstruction-based document summarization algorithm, which is run until convergence. Several sentences (the number depends on the length of the section) are selected from the section as its summary and returned to the user.
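Step (4) of the example can be illustrated with a small bag-of-words sketch. It is an assumption-laden stand-in: whitespace splitting replaces real Chinese word segmentation, the stop-word list is invented for the demonstration, and raw term counts with Euclidean distance are only one possible choice of vector and distance.

```python
import math

# Illustrative stop-word list; not taken from the source.
STOP_WORDS = {"the", "a", "is", "of", "and"}

def tokenize(sentence):
    """Lowercase, split on whitespace, and drop stop words."""
    return [t for t in sentence.lower().split() if t not in STOP_WORDS]

def vectorize(sentences):
    """Map each sentence to a term-count vector over the shared vocabulary."""
    tokens = [tokenize(s) for s in sentences]
    vocab = sorted({t for toks in tokens for t in toks})
    index = {t: pos for pos, t in enumerate(vocab)}
    vectors = []
    for toks in tokens:
        vec = [0.0] * len(vocab)
        for t in toks:
            vec[index[t]] += 1.0
        vectors.append(vec)
    return vocab, vectors

def distance(u, v):
    """Euclidean distance between two sentence vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Usage on two made-up sentences.
vocab, vectors = vectorize([
    "Distributed computing is the study of distributed systems",
    "A distributed system is a collection of computers",
])
d01 = distance(vectors[0], vectors[1])
```

The resulting vectors are nonnegative, which matches the nonnegativity assumption of the reconstruction step, and the pairwise distances are what the graph construction sorts in ascending order.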
The results of running this example are shown in the accompanying figures, starting from Figure 3. While reading a book, the user can view the summary of the corresponding section through the table of contents, which helps the user grasp the section's content more quickly and in more detail. The book chapter summary generation method therefore has good practical value and application prospects.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410090143.6A CN103885935B (en) | 2014-03-12 | 2014-03-12 | Books chapters and sections abstraction generating method based on books reading behavior |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103885935A true CN103885935A (en) | 2014-06-25 |
CN103885935B CN103885935B (en) | 2016-06-29 |
Family
ID=50954830
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410090143.6A Expired - Fee Related CN103885935B (en) | 2014-03-12 | 2014-03-12 | Books chapters and sections abstraction generating method based on books reading behavior |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103885935B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020138528A1 (en) * | 2000-12-12 | 2002-09-26 | Yihong Gong | Text summarization using relevance measures and latent semantic analysis |
CN1614585A (en) * | 2003-11-07 | 2005-05-11 | 摩托罗拉公司 | Context Generality |
CN102841940A (en) * | 2012-08-17 | 2012-12-26 | 浙江大学 | Document summary extracting method based on data reconstruction |
Non-Patent Citations (3)
Title |
---|
ZHANYING HE et al.: "Document Summarization Based on Data Reconstruction", 《PROCEEDINGS OF THE TWENTY-SIXTH AAAI CONFERENCE ON ARTIFICIAL INTELLIGENCE》 * |
ZHIMING ZHANG et al.: "《Web-Age Information Management》", 16 June 2013, VERLAG BERLIN HEIDELBERG * |
QIAO SHAOJIE et al.: "A comprehensive web page scoring method based on centrality and PageRank", 《Journal of Southwest Jiaotong University》 * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI549003B (en) * | 2014-08-18 | 2016-09-11 | 葆光資訊有限公司 | Method for automatic sections division |
CN106469176A (en) * | 2015-08-20 | 2017-03-01 | 百度在线网络技术(北京)有限公司 | A kind of method and apparatus for extracting text snippet |
CN106469176B (en) * | 2015-08-20 | 2019-08-16 | 百度在线网络技术(北京)有限公司 | It is a kind of for extracting the method and apparatus of text snippet |
US10929452B2 (en) | 2017-05-23 | 2021-02-23 | Huawei Technologies Co., Ltd. | Multi-document summary generation method and apparatus, and terminal |
CN107608972A (en) * | 2017-10-24 | 2018-01-19 | 河海大学 | A kind of more text quick abstract methods |
CN108231064A (en) * | 2018-01-02 | 2018-06-29 | 联想(北京)有限公司 | A kind of data processing method and system |
CN109241863A (en) * | 2018-08-14 | 2019-01-18 | 北京万维之道信息技术有限公司 | For splitting the data processing method and device of reading content |
CN111199151A (en) * | 2019-12-31 | 2020-05-26 | 联想(北京)有限公司 | Data processing method and data processing device |
CN115048507A (en) * | 2022-05-24 | 2022-09-13 | 维沃移动通信有限公司 | Abstract generation method and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN103885935B (en) | 2016-06-29 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20160629 |