
CN117112794A - Knowledge enhancement-based multi-granularity government service item recommendation method - Google Patents

Knowledge enhancement-based multi-granularity government service item recommendation method

Info

Publication number
CN117112794A
Authority
CN
China
Prior art keywords
vector
word
government
query
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310582574.3A
Other languages
Chinese (zh)
Inventor
付春雷
李成高
程子游
洪伟
赵义伟
吴冕
鄢萌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN202310582574.3A
Publication of CN117112794A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • G06N3/0455Auto-encoder networks; Encoder-decoder networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/10Services
    • G06Q50/26Government or public services

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Tourism & Hospitality (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Primary Health Care (AREA)
  • General Business, Economics & Management (AREA)
  • Human Resources & Organizations (AREA)
  • Animal Behavior & Ethology (AREA)
  • Economics (AREA)
  • Educational Administration (AREA)
  • Development Economics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a knowledge-enhancement-based multi-granularity government service item recommendation method, which comprises acquiring data and constructing and training a KEMG model. The KEMG model comprises an embedding layer fusing word- and character-granularity information, a CNN-enhanced coding layer, an interaction layer based on an attention mechanism, a fusion layer, and a plurality of fully connected layers that produce the similarity prediction score. The KEMG model is trained on the acquired data to obtain an optimal KEMG model. For a new user, S1 is used to obtain the user vector, which is input into the optimal KEMG model; the optimal KEMG model calculates the degree of association between the new user and all service items, arranges them in descending order of association value, and outputs the corresponding TOP-k service item sequence. Experiments prove that the method ensures retrieval efficiency while achieving higher information retrieval accuracy.

Description

Knowledge enhancement-based multi-granularity government service item recommendation method
Technical Field
The invention relates to the field of government service recommendation, in particular to a knowledge-enhancement-based multi-granularity government service item recommendation method.
Background
Government services are important public services that the government provides to the people; they bear directly on people's work and daily life and are a key responsibility of the people's government. With the continuous advance of informatization, "Internet + government services" has become an important direction of government work. In recent years, the types and scope of service item information covered by government platforms have grown ever wider, and the data sources are heterogeneous. Meanwhile, platform traffic has grown explosively year by year. Taking Chongqing as an example, according to the 2022 annual working report of the government website of the Chongqing Municipal Bureau of Statistics, by the end of 2022 the website had received 683,934 unique-user visits and 2,094,310 total visits, increases of 59.8% and 320.7% respectively over 2021. Faced with vast government information and massive access requests, traditional data management approaches can no longer be applied effectively, resulting in low service quality on government platforms.
On the other hand, service item information retrieval is one of users' core demands: through functions such as search and question answering, users can quickly obtain the information they need. At the core of the retrieval process is text similarity matching between the input query and the name of the item to be retrieved. Related research shows that most current government platforms still rely on traditional word-frequency statistical algorithms for matching. Such methods compute quickly and occupy little memory, but their mining of textual semantic features is very limited. In the government service item retrieval setting, users' literacy levels vary and queries contain colloquial content that differs considerably from the names of the items to be retrieved, so the retrieval need cannot be described accurately. For example, if a user enters a colloquial query about replacing a lost license, the item name "vehicle license loss re-issuance registration" should be returned; but because the two texts share few common keywords, the system instead returns other items that merely contain the "replacement" keyword.
Therefore, improving the public's experience of querying and transacting government services requires constructing an intelligent and accurate government service item retrieval model based on technologies such as deep learning and knowledge graphs.
With the rapid development of artificial intelligence, deep learning technology has been widely applied across many fields and has achieved breakthroughs in image processing and natural language processing in particular. For text matching, a subtask of retrieval in the field of natural language processing, many deep-learning-based models have been proposed in succession, with markedly improved results. However, deep learning models still have limitations in processing text, such as failing to truly grasp deep textual semantics and generating text that lacks logic. Google proposed the knowledge graph in 2012; knowledge graph technology can convert text data into structured graph data, the better to combine with deep learning models and exploit their strengths. For the government service application scenario, knowledge graph technology can assist in understanding and utilizing large amounts of scattered service item data: by converting the data into a structured knowledge graph, deep analysis and mining become possible, more intelligent and accurate government services can be provided to users, and data support can be provided for government decision-making. In turn, the rich entity and relation information in the knowledge graph can assist the training of deep learning models and improve the models' semantic understanding.
However, there has been little research on these technologies in the government service field. On the one hand, government service data usually exist in unstructured and semi-structured forms, unlike the structured data a knowledge graph requires. Moreover, government service resources are vast, scattered, varied, and hierarchically complex, which greatly complicates the construction and maintenance of the knowledge graph. On the other hand, deep text matching models have complex structures, so operating efficiency and online retrieval efficiency after actual deployment become problems that must be solved when applying deep learning technology in the government service field.
Therefore, studying government service item retrieval methods based on knowledge graph technology and deep learning technology, and improving information retrieval accuracy while ensuring retrieval efficiency, is of great significance for raising the level of government services.
Disclosure of Invention
Aiming at the problems existing in the prior art, the invention aims to solve the following technical problem: how to improve information retrieval accuracy while ensuring retrieval efficiency, based on knowledge graph technology and deep learning technology.
In order to solve the technical problems, the invention adopts the following technical scheme: a multi-granularity government service item recommendation method based on knowledge enhancement comprises the following steps:
S1: and acquiring a plurality of user input query texts and to-be-matched government affair names, and constructing a government affair knowledge graph.
S2: and constructing and training a KEMG model, wherein the KEMG model comprises an embedded layer fusing word granularity information, a coding layer based on CNN enhancement, an interaction layer based on an attention mechanism, and a fusion layer and a plurality of full-connection layers to obtain similarity prediction scores.
S2-1: embedding word vectors into all user input query texts and government matters names to be matched to obtain query word granularity vectors and government matters granularity vectors, splicing and fusing the query word granularity vectors and the query word granularity vectors to form query final text sentence representation vectors, and splicing and fusing the government matters granularity vectors and government matters granularity vectors to form government final text sentence representation vectors;
s2-2: and (3) firstly adopting ESIM to code context sequence information for the final text sentence representation vector of the query and the final text sentence representation vector of the government affair obtained in the step (S2-1), and then adopting CNN network to code the coded vector for the second time to obtain the characteristic code value of the query keyword and the characteristic code value of the government affair keyword.
S2-3: and using an improved graph attention mechanism to allocate different weights to each node in the government affair knowledge graph, and utilizing the weights to enhance the granularity vector of the government affair words. And interacting the characteristic code value of the query keyword and the characteristic code value of the government keyword by adopting a soft attention mechanism to obtain the weighted enhanced representation of each word, inputting the enhanced representation of each word into a pooling layer to obtain a characteristic vector obtained by a pooling layer for splicing, and respectively obtaining a final representation vector of the user query and a final representation vector of the item name.
S2-4: the fusion layer realizes multi-feature cross fusion through vector alignment subtraction and vector alignment multiplication to obtain the matching degree between query-item names;
S2-5: The obtained matching degree between query and item names is directly input into a multi-layer fully connected neural network to predict matching scores; the government item names are arranged in descending order of predicted matching score and output accordingly.
S3: For a new user, the user vector is obtained using S1 and input into the optimal KEMG model; the optimal KEMG model calculates the degree of association between the new user and all service items, arranges them in descending order of association value, and outputs the corresponding TOP-k service item sequence.
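As an illustration of step S3, the following is a minimal sketch of the TOP-k step, assuming an already trained scoring model; the names `kemg_model` and `recommend_top_k` are hypothetical and not from the patent.

```python
# Minimal sketch of step S3, assuming `kemg_model(query_vec, item_vec)` returns
# the association score of one query-item pair (a hypothetical interface).
import numpy as np

def recommend_top_k(kemg_model, query_vec, item_vecs, item_names, k=10):
    # Score the new user's query against every candidate service item.
    scores = np.array([kemg_model(query_vec, v) for v in item_vecs])
    # Sort in descending order of association and return the TOP-k item names.
    order = np.argsort(-scores)[:k]
    return [(item_names[i], float(scores[i])) for i in order]
```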
As an improvement, the process by which S2-1 obtains the word-granularity vectors is as follows:

The Embedding Layer aims to map the word-segmented text from a high-dimensional space to a low-dimensional space, augmenting the text features so that they can participate in the vector operations of the neural network. Word Embedding is a common method used by the embedding layer; in short-text tasks, sentence vectors can to a large extent be characterized by combining word vectors.

Word2vec is adopted to embed word vectors for the user-input query text and the to-be-matched government item text, as shown in equations (1) and (2):

$$Q_w=(q_1^{w},q_2^{w},\dots,q_{l_a}^{w}) \qquad (1)$$

$$S_w=(s_1^{w},s_2^{w},\dots,s_{l_b}^{w}) \qquad (2)$$

where $Q_w$ and $S_w$ respectively denote the vector matrices formed by combining the Word2vec word vectors of the user-input Query and of the to-be-matched government Service Name, $q_i^{w},s_j^{w}\in\mathbb{R}^{d_w}$ denote the $d_w$-dimensional vectors obtained by embedding each word of the two texts, and $d_w$ denotes the dimensionality of the word vectors. $l_a$ and $l_b$ respectively denote the lengths of the user-input query and of the to-be-matched government service item name.
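As a concrete illustration, a minimal sketch of this Word2vec step using gensim; the toy corpus and the dimension `vector_size=100` (i.e., $d_w=100$) are illustrative assumptions.

```python
# Minimal sketch of equations (1)-(2): embed each word of a segmented text with
# Word2vec and stack the vectors into the matrix Q_w; the corpus is a toy example.
from gensim.models import Word2Vec
import numpy as np

corpus = [["办理", "行驶证"], ["机动车", "行驶证", "补领", "登记"]]  # pre-segmented texts
w2v = Word2Vec(sentences=corpus, vector_size=100, window=5, min_count=1, sg=0)

def embed_words(tokens, model):
    # One d_w-dimensional vector per word, stacked into an (l, d_w) matrix.
    return np.stack([model.wv[t] for t in tokens])

Q_w = embed_words(["办理", "行驶证"], w2v)   # shape (l_a, d_w)
```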
As an improvement, the process by which S2-1 obtains the character-granularity vectors is as follows:

Character-level embedding of the text is achieved with the joint-learning word embedding model JWE, as shown in equation (3):

$$L(w_i)=\sum_{k=1}^{3}\log P\left(w_i\mid h_{ik}\right) \qquad (3)$$

where $h_{i1}$, $h_{i2}$, $h_{i3}$ respectively refer to the context words, the context Chinese characters, and the context sub-characters, $w_i$ is the target word, and $L(w_i)$ denotes the sum of the log-likelihoods of the three prediction conditional probabilities of the target word $w_i$.

The prediction probability $P(w_i\mid h_{ik})$ is defined by the Softmax function, as shown in equation (4):

$$P\left(w_i\mid h_{ik}\right)=\frac{\exp\left(h_{ik}^{\top}\hat{v}_{w_i}\right)}{\sum_{j=1}^{N}\exp\left(h_{ik}^{\top}\hat{v}_{w_j}\right)} \qquad (4)$$

where $\hat{v}_{w_i}$ is the output vector of the current target word $w_i$, and the denominator sums over all $N$ target words.

Through JWE, each character in the user-input query and in the to-be-matched government service item name is characterized as a character vector. The character vector representations corresponding to the user-input query and to the to-be-matched government service item name are respectively denoted $Q_j$ and $S_j$, as shown in equations (5) and (6):

$$Q_j=\left(q_1^{c},q_2^{c},\dots,q_{z_q}^{c}\right) \qquad (5)$$

$$S_j=\left(s_1^{c},s_2^{c},\dots,s_{z_s}^{c}\right) \qquad (6)$$

where $q_i^{c},s_j^{c}\in\mathbb{R}^{d_c}$ are the character vector representations of the $i$-th and $j$-th Chinese characters in the user-input query and in the to-be-matched government service item name respectively, $d_c$ denotes the dimension of the character vectors, and $z_q$, $z_s$ denote the character counts of the two texts.
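To make the JWE objective concrete, the following is a minimal NumPy sketch of equations (3)-(4), assuming dot-product scoring between a context average and each candidate's output vector; all arrays are random illustrative stand-ins.

```python
# Minimal sketch of equations (3)-(4): predict a target word from the averages
# of context word / character / sub-character vectors and sum the log-likelihoods.
import numpy as np

rng = np.random.default_rng(0)
N, d = 50, 64                        # vocabulary size, embedding dimension
out_vecs = rng.normal(size=(N, d))   # output vectors v_hat of all target words

def predict_prob(context_vecs, target_idx):
    h = context_vecs.mean(axis=0)    # average of one context type
    logits = out_vecs @ h            # h^T v_hat for every candidate word
    p = np.exp(logits - logits.max())
    return (p / p.sum())[target_idx] # softmax, equation (4)

# L(w_i): sum of log-likelihoods over the three context types, equation (3)
contexts = [rng.normal(size=(4, d)) for _ in range(3)]  # words, chars, sub-chars
L = sum(np.log(predict_prob(c, target_idx=7)) for c in contexts)
```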
As an improvement, the process in S2-1 of splicing and fusing the word-granularity and character-granularity vectors to form the final embedded representation of the text pair is as follows:

The new combined vector $Q_i$ for the $i$-th word in the user-input query is calculated as shown in equation (7):

$$Q_i=\left[q_i^{w};P\left(\{c_{i1},\dots,c_{iw}\}\right)\right] \qquad (7)$$

where $P(\{c_{i1},\dots,c_{iw}\})$ denotes the feature vector obtained after the pooling operation over all character-granularity vectors composing the $i$-th word of the text.

The final query text sentence representation vector $Q$ is calculated as shown in equation (8):

$$Q=(Q_1,\dots,Q_l) \qquad (8)$$

where $Q_i\in\mathbb{R}^{d}$; the word-vector embedding dimension with fused character granularity is defined by the model as $d=d_w+d_c$.

The final government text sentence representation vector $S$ is obtained by the same method used to obtain $Q$.
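A minimal sketch of the fusion of equation (7): pool a word's character-granularity vectors, then concatenate the result with the word-granularity vector; the dimensions $d_w=100$ and $d_c=64$ are illustrative assumptions.

```python
# Minimal sketch of equation (7): pooled character vectors concatenated with
# the word vector give the fused representation of one word.
import numpy as np

def fuse_word(word_vec, char_vecs, mode="max"):
    pooled = char_vecs.max(axis=0) if mode == "max" else char_vecs.mean(axis=0)
    return np.concatenate([word_vec, pooled])   # Q_i, dimension d = d_w + d_c

q_w = np.random.randn(100)      # Word2vec vector of one word
c = np.random.randn(2, 64)      # JWE vectors of its two characters
Q_i = fuse_word(q_w, c)         # shape (164,)
```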
As an improvement, the steps by which S2-2 obtains the keyword feature codes are as follows:

ESIM is adopted to encode the context sequence information of the final text sentence representation vectors; where ESIM originally adopts a BiLSTM to acquire the text sequence features, a BiGRU is used here. The calculation of the encoded representation vectors is shown in equations (9) and (10):

$$\bar{q}_i=\mathrm{BiGRU}(Q,i),\quad i\in[1,l] \qquad (9)$$

$$\bar{s}_j=\mathrm{BiGRU}(S,j),\quad j\in[1,l] \qquad (10)$$

where $\bar{q}_i,\bar{s}_j\in\mathbb{R}^{2d_g}$ respectively denote the new code values obtained at the $i$-th and $j$-th time steps after the user-input query text and the to-be-matched government item name pass through the BiGRU, and $d_g$ denotes the hidden layer dimension set in the BiGRU.

A CNN network then performs the second encoding, carrying out convolution operations with kernels of sizes 2, 3, and 4. The process by which a convolution window extracts text features is as follows. First, by equation (11):

$$\bar{q}_{i:i+w-1}=\bar{q}_i\oplus\bar{q}_{i+1}\oplus\dots\oplus\bar{q}_{i+w-1} \qquad (11)$$

where $\bar{q}_{i:i+w-1}$ denotes the concatenation of the word vectors $\bar{q}_i$ through $\bar{q}_{i+w-1}$ and $\oplus$ denotes the splicing operation. In the convolution operation, the convolution kernel $W_c$ slides over the text in the form of a window of size $w$ to extract new features, as shown in equation (12):

$$C_i=f\left(W_c\cdot\bar{q}_{i:i+w-1}+b\right) \qquad (12)$$

where $C_i$ denotes the $i$-th feature extracted by the convolution kernel during the convolution process. Out-of-range input vectors ($i<1$ or $i>l$) are treated as zero vectors; $\bar{q}_{i:i+w-1}$ denotes the matrix obtained by splicing $\bar{q}_i,\bar{q}_{i+1},\dots,\bar{q}_{i+w-1}$, and $b$ denotes the bias in the convolutional network.

Applying a convolution kernel of size $w$ to the sentence extracts the feature map over every possible hidden state, as shown in equation (13):

$$q=\left[q_1,q_2,\dots,q_{l-w+1}\right]^{\top} \qquad (13)$$

The resulting 6 feature maps are input into a max pooling layer to obtain fixed-length feature vector representations:

$$Q_i=P\left(\{q_{i1},q_{i2},\dots,q_{i(l-w+1)}\}\right) \qquad (14)$$

where $Q_i$ denotes the feature vector of the $i$-th feature map after the max pooling operation, $q_{i1}$ denotes the 1st feature of the $i$-th feature map, and $q_{i(l-w+1)}$ denotes the $(l-w+1)$-th feature of the $i$-th feature map.

Finally, the pooled feature vectors are spliced together by equation (15) to form the final query keyword feature code $\hat{Q}$:

$$\hat{Q}=\left[Q_1;Q_2;\dots;Q_6\right] \qquad (15)$$

The government keyword feature code $\hat{S}$ is obtained by the same method used to obtain $\hat{Q}$.
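The enhanced coding layer can be sketched in PyTorch as follows: a BiGRU produces the sequence code values of equations (9)-(10), then Conv1d kernels of sizes 2/3/4 with two filters each, followed by max pooling over time, realize equations (11)-(15). All dimensions are illustrative assumptions.

```python
# Minimal sketch of the CNN-enhanced coding layer: BiGRU sequence coding plus
# CNN secondary coding (kernel sizes 2, 3, 4; two filters each; max pooling).
import torch
import torch.nn as nn

class EnhancedEncoder(nn.Module):
    def __init__(self, d=164, d_g=128):
        super().__init__()
        self.bigru = nn.GRU(d, d_g, batch_first=True, bidirectional=True)
        # One Conv1d per window size w, each with 2 output channels (filters).
        self.convs = nn.ModuleList(
            [nn.Conv1d(2 * d_g, 2, kernel_size=w) for w in (2, 3, 4)]
        )

    def forward(self, x):                   # x: (batch, l, d)
        seq, _ = self.bigru(x)              # sequence code values, (batch, l, 2*d_g)
        h = seq.transpose(1, 2)             # (batch, 2*d_g, l) for Conv1d
        # 6 feature maps in total; max-pool each over time and splice (eq. 15).
        feats = [conv(h).max(dim=2).values for conv in self.convs]
        return seq, torch.cat(feats, dim=1)  # keyword feature code, (batch, 6)

enc = EnhancedEncoder()
seq_code, kw_code = enc(torch.randn(8, 20, 164))
```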
As an improvement, the process of obtaining the final characterization vector of the user query and the final characterization vector of the item name in S2-3 is as follows:
the SIF sentence vector generation method comprises the following steps:
1) Obtain the preliminary sentence vector: traverse all sentences of the corpus and calculate the preliminary sentence vector $v_s$ of the current sentence $s$ by equation (16):

$$v_s=\frac{1}{|s|}\sum_{w\in s}\frac{a}{a+p(w)}\,v_w \qquad (16)$$

where $|s|$ denotes the number of words of sentence $s$, $v_w$ denotes the word vector of word $w$ in the sentence, $p(w)$ is the word-frequency probability of word $w$ in the whole corpus, and $a$ is an adjustable parameter.

2) Perform principal component analysis on all the preliminary sentence vectors: calculate the first principal component $u$ of the whole set of preliminary sentence vectors, called the common discourse vector; $u$ is the first principal vector of the matrix constituted by the word vectors constituting the sentences.

3) Obtain the target sentence vector: post-process the preliminary sentence vector by equation (17) to obtain the target sentence vector:

$$\tilde{v}_s=v_s-uu^{\top}v_s \qquad (17)$$

where $\tilde{v}_s$ denotes the sentence vector value after SIF embedding, and $s$ and $w$ respectively denote a sentence and each of its words.

By equation (18), the sentence vector value $v_q^{sif}$ of the user-input query text after SIF embedding and the sentence vector value $v_s^{sif}$ of the to-be-matched government item name after SIF embedding are obtained, both computed through equations (16) and (17).
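A minimal NumPy sketch of this SIF procedure, equations (16)-(17); the word vectors and frequencies below are random illustrative stand-ins.

```python
# Minimal sketch of SIF sentence embedding: frequency-weighted averaging,
# then removal of the first principal component (common discourse vector).
import numpy as np

def sif_embed(sentences, word_vecs, p, a=1e-3):
    # Equation (16): weighted average of word vectors per sentence.
    V = np.stack([
        np.mean([a / (a + p[w]) * word_vecs[w] for w in s], axis=0)
        for s in sentences
    ])
    # Common discourse vector u: first principal direction of the sentence vectors.
    u = np.linalg.svd(V, full_matrices=False)[2][0]
    # Equation (17): remove the projection on u from every sentence vector.
    return V - np.outer(V @ u, u)

vecs = {w: np.random.randn(64) for w in ["报销", "手术", "费用", "材料"]}
freq = {w: 0.01 for w in vecs}
sent_vecs = sif_embed([["报销", "手术", "费用"], ["报销", "材料"]], vecs, freq)
```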
The improved graph attention calculation formula is shown in equation (19):

$$\hat{v}_s=\sum_{n\in\mathcal{N}(s)}\alpha_n v_n^{sif},\qquad \alpha_n=\frac{\exp\left(v_q^{sif\top}v_n^{sif}\right)}{\sum_{k\in\mathcal{N}(s)}\exp\left(v_q^{sif\top}v_k^{sif}\right)} \qquad (19)$$

where $\hat{v}_s$ denotes the government item name sentence vector representation enhanced by fusing the government knowledge graph, $\mathcal{N}(s)$ denotes all government knowledge nodes directly associated with the government item, including the government item itself, and $v_n^{sif}$ denotes the sentence vector value obtained after the associated government knowledge is embedded by SIF.
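A minimal sketch of this knowledge fusion, assuming dot-product relevance between the query's SIF vector and each knowledge node's SIF vector (the weight form written above); the arrays are illustrative stand-ins.

```python
# Minimal sketch of equation (19): attention over knowledge nodes, with the
# query (not the item name) as the attention center.
import numpy as np

def fuse_knowledge(v_q, node_vecs):
    # node_vecs: SIF vectors of the item name and its directly linked knowledge.
    scores = node_vecs @ v_q                 # relevance of each node to the query
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                     # attention weights over nodes
    return alpha @ node_vecs                 # knowledge-enhanced item vector

v_q = np.random.randn(64)                    # SIF vector of the user query
nodes = np.random.randn(5, 64)               # item + 4 associated knowledge nodes
v_s_hat = fuse_knowledge(v_q, nodes)
```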
The soft attention mechanism makes the query feature code values and the government feature code values interact:

First, the relevance score is calculated, as shown in equation (20):

$$e_{ij}=\bar{q}_i^{\top}\bar{s}_j \qquad (20)$$

where $e_{ij}$ denotes the relevance score between the $i$-th word $\bar{q}_i$ of the final query text sentence representation and the $j$-th word $\bar{s}_j$ of the final government text sentence representation, obtained by dot product.

The relevance scores are normalized and converted into weight coefficients for the subsequent weighted representation, as shown in equation (21):

$$\tilde{q}_i=\sum_{j=1}^{l}\frac{\exp(e_{ij})}{\sum_{k=1}^{l}\exp(e_{ik})}\,\bar{s}_j \qquad (21)$$

where $\tilde{q}_i$ denotes the new representation vector of the user-input query text: the weighted sum obtained after attention-coefficient assignment between the $i$-th word of the final query sentence representation and every word of the final government sentence representation, and $e_{ik}$ is the relevance score with an arbitrary word of the final government text representation.

For the $j$-th word in the item name, the calculation and meaning of $\tilde{s}_j$ are analogous, as shown in equation (22):

$$\tilde{s}_j=\sum_{i=1}^{l}\frac{\exp(e_{ij})}{\sum_{k=1}^{l}\exp(e_{kj})}\,\bar{q}_i \qquad (22)$$

where $\tilde{s}_j$ denotes the new representation vector of the to-be-matched government item name: the weighted sum obtained after attention-coefficient assignment between the $j$-th word of the final government sentence representation and every word of the final query sentence representation, and $e_{kj}$ is the relevance score with an arbitrary word of the user-input query text representation.

A pooling strategy is adopted to reduce the dimensionality of $\tilde{q}$ and $\tilde{s}$, using average pooling and max pooling, as shown in equations (23) and (24):

$$v_{q,ave}=\frac{1}{l}\sum_{i=1}^{l}\tilde{q}_i,\qquad v_{q,max}=\max_{i\in[1,l]}\tilde{q}_i \qquad (23)$$

$$v_{s,ave}=\frac{1}{l}\sum_{j=1}^{l}\tilde{s}_j,\qquad v_{s,max}=\max_{j\in[1,l]}\tilde{s}_j \qquad (24)$$

where $v_{q,ave}$, $v_{q,max}$, $v_{s,ave}$, $v_{s,max}$ respectively denote the 4 vectors obtained by average pooling and max pooling of the user-input query text and of the to-be-retrieved item name.

Splicing then yields the representation vectors produced by the soft-attention interaction operation between the user-input query text and the to-be-matched government item name:

$$v_q=\left[v_{q,ave};v_{q,max}\right] \qquad (25)$$

$$v_s=\left[v_{s,ave};v_{s,max}\right] \qquad (26)$$

where $v_q,v_s\in\mathbb{R}^{4d_g}$ respectively denote the final user query representation vector corresponding to the user-input query text and the final item name representation vector corresponding to the to-be-matched government item name, obtained after the soft-attention interaction operation, and $d_g$ denotes the hidden layer dimension set in the GRU.
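A minimal PyTorch sketch of this soft-attention interaction, equations (20)-(26); the sequence lengths and the dimension $2d_g=256$ are illustrative assumptions.

```python
# Minimal sketch of the soft-attention interaction: dot-product scores,
# cross-weighted sums, then average/max pooling and splicing.
import torch

def soft_attention(q_bar, s_bar):           # (l_a, 2*d_g), (l_b, 2*d_g)
    e = q_bar @ s_bar.T                     # equation (20): relevance scores e_ij
    q_tilde = torch.softmax(e, dim=1) @ s_bar    # eq. (21): re-represent query words
    s_tilde = torch.softmax(e, dim=0).T @ q_bar  # eq. (22): re-represent item words
    v_q = torch.cat([q_tilde.mean(0), q_tilde.max(0).values])  # eqs. (23), (25)
    v_s = torch.cat([s_tilde.mean(0), s_tilde.max(0).values])  # eqs. (24), (26)
    return v_q, v_s                         # each of dimension 4*d_g

v_q, v_s = soft_attention(torch.randn(12, 256), torch.randn(9, 256))
```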
As an improvement, the process by which S2-4 obtains the matching degree between the query and item names is as follows:

Multi-feature cross fusion is realized through elementwise vector subtraction and elementwise vector multiplication. The three features of the user-input query text and the three features of the to-be-matched government item name are respectively spliced and stored as sequences:

$$m_Q=\left[v_q;\hat{Q};v_q^{sif}\right] \qquad (27)$$

$$m_S=\left[v_s;\hat{S};\hat{v}_s\right] \qquad (28)$$

Elementwise subtraction and elementwise multiplication are applied to the two sequences, and their results are spliced with the original sequence values:

$$m_{out}=\left[m_Q;m_S;m_Q-m_S;m_Q\odot m_S\right] \qquad (29)$$

Through the fusion layer, the matching degree between the query and item name is represented by the vector $m_{out}$.
As an improvement, the process by which S2-5 calculates the matching score is as shown in equation (30):

$$m_{score}=F_s\left(W\cdot m_{out}+b\right) \qquad (30)$$

where $W$ and $b$ denote the parameters of the multi-layer fully connected layers, and $F_s(\cdot)$ is the Sigmoid activation function.
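A minimal PyTorch sketch of the fusion layer and prediction head, equations (27)-(30); the feature dimension and hidden sizes are illustrative assumptions.

```python
# Minimal sketch of the fusion layer (eq. 29) and the fully connected
# prediction head with Sigmoid activation (eq. 30).
import torch
import torch.nn as nn

class FusionPredict(nn.Module):
    def __init__(self, d_m):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(4 * d_m, 128), nn.ReLU(), nn.Dropout(0.3),
            nn.Linear(128, 1), nn.Sigmoid(),   # F_s: Sigmoid activation
        )

    def forward(self, m_q, m_s):
        # Equation (29): splice both features with their elementwise
        # difference and product for multi-feature cross fusion.
        m_out = torch.cat([m_q, m_s, m_q - m_s, m_q * m_s], dim=-1)
        return self.fc(m_out).squeeze(-1)      # predicted matching score

model = FusionPredict(d_m=300)
score = model(torch.randn(8, 300), torch.randn(8, 300))
```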
As a modification, Margin Ranking Loss introduces a minimum relevance gap between two items for each sample pair when calculating the loss. For each sample pair $(x_k,x_j)$ the loss function is as follows:

$$l(x_k,x_j)=\max\left(0,\,m-\left(s(x_k)-s(x_j)\right)\right) \qquad (31)$$

where $m$ denotes the Margin parameter, and $s(x_k)$ and $s(x_j)$ are respectively the relevance scores of samples $x_k$ and $x_j$.
The ServiceRank loss function is proposed:

where $\mathcal{D}$ denotes the data set consisting of user-input texts and to-be-retrieved item name texts, $|\mathcal{D}|$ denotes the number of samples in the data set, $s_i$ denotes the number of government item names corresponding to the $i$-th user-input query text, $\mathcal{S}_i$ denotes the set of all government item names corresponding to the $i$-th user-input query text, and $l(x_j,x_k)$ is the Margin Ranking Loss calculated by equation (31).
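A minimal PyTorch sketch of the pairwise loss of equation (31); how ServiceRank aggregates these pairwise terms over each query's item names follows the symbol definitions above, and the uniform averaging used here is an assumption.

```python
# Minimal sketch: Margin Ranking Loss (eq. 31) aggregated per query; the
# uniform pair/query averaging is an illustrative assumption.
import torch

def margin_ranking_loss(s_pos, s_neg, m=0.3):
    # l(x_k, x_j) = max(0, m - (s(x_k) - s(x_j)))
    return torch.clamp(m - (s_pos - s_neg), min=0.0)

def service_rank_loss(scores_per_query, m=0.3):
    # scores_per_query: one 1-D tensor per query, ordered from most to least
    # relevant item; every mis-ordered pair contributes a margin penalty.
    total = torch.tensor(0.0)
    for s in scores_per_query:
        pairs = [margin_ranking_loss(s[j], s[k], m)
                 for j in range(len(s)) for k in range(j + 1, len(s))]
        total = total + torch.stack(pairs).mean()
    return total / len(scores_per_query)

loss = service_rank_loss([torch.randn(4), torch.randn(3)])
```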
Compared with the prior art, the invention has at least the following advantages:
1. Comparison experiments show that on the government service item retrieval task the retrieval effect of this method is superior to that of other non-BERT models, while its parameter count is far smaller than that of BERT-type models, ensuring the feasibility of online deployment. Loss-function strategy comparison experiments demonstrate the effectiveness of ServiceRank on the constructed data set.
2. Ablation experiments verify the influence of the introduced modules, such as character embedding, the CNN network, and government knowledge, on the model's retrieval capability. Experimental analysis of four hyperparameters, including the number of neurons in the fully connected layers and the Dropout rate of the last layer, determines their influence on the model's overall retrieval effect and hence the model's performance indices in its optimal state.
Drawings
FIG. 1 is a schematic diagram of the method of the present invention.
FIG. 2 is a diagram of an example triplet.
FIG. 3 is a knowledge graph hierarchical diagram.
FIG. 4 is a schematic diagram of government knowledge enhanced query-issue pair matching capability.
FIG. 5 is a JWE structure diagram.
FIG. 6 illustrates the pooling-based fusion of word vectors.
FIG. 7 is a flow chart of word embedding with fused character granularity.
FIG. 8 is a flow chart for convolution extracting text features.
FIG. 9 is a diagram of a government knowledge fusion process.
Fig. 10 is a schematic diagram of a soft attention operation.
FIG. 11 gives the details of a same-problem case.
FIG. 12 shows an item subdirectory.
FIG. 13 compares retrieval results, wherein FIG. 13 (a) compares the results of different models under ServiceRank, FIG. 13 (b) compares the results of the present model under six loss functions, and FIGS. 13 (c) and (d) compare the training/testing time of each model.
FIG. 14 shows the model training convergence process, wherein FIG. 14 (a) shows the change of the training-set and validation-set loss functions over the training iterations, and FIG. 14 (b) shows the change of three indices during the training iterations.
FIG. 15 shows the module ablation experiment results, wherein FIGS. 15 (a) and 15 (b) respectively show the single-module ablation comparison and the pairwise module-combination comparison.
FIG. 16 shows the effect of each parameter on the model's NDCG, wherein FIGS. 16 (a)-16 (d) respectively show the effects of the parameter a, the hyperparameter m, the neuron count, and the Dropout rate on the model's NDCG values.
Detailed Description
The present invention will be described in further detail below.
Aiming at the defects of ESIM, a Knowledge Enhanced Multi-Granularity Service Item Retrieval Model (KEMG for short) is proposed herein; its overall structure is shown in FIG. 1.
In the text embedding layer, the model obtains the text word-vector matrix and the text character-vector matrix through Word2vec and JWE respectively, and splices them to obtain the text embedding vectors. Meanwhile, by applying the SIF sentence vector generation method to the original texts, the model prepares the sentence vectors of the query, of the item names, and of the government knowledge directly linked to each item, for fusing government knowledge in the subsequent interaction layer.
In the coding layer, the model first feeds the text embedding vectors into a BiGRU, which is used to extract the context information of the text. To capture feature information such as text keywords, the invention adds a CNN structure and exploits the characteristics of the CNN convolution kernel to capture word-granularity feature information. Finally, the sequence code values output by the BiGRU and the feature code values produced by the CNN secondary coding are obtained respectively.
In the interaction stage, the to-be-retrieved government items and the associated government knowledge are looked up and their corresponding sentence vectors are read. The invention uses the improved graph attention mechanism to assign a different weight to each node of the government knowledge and uses these weights to enhance the representation of the item name sentence vectors. A soft attention mechanism makes the two sequence-coded text sequences interact, yielding a weighted enhanced representation of each word, which is input into a pooling layer. The model splices the secondary coding features obtained by the convolutional neural network, the text sentence vector enhanced with government knowledge, and the pooled feature vectors from the soft-attention interaction, respectively obtaining the final representation vectors of the user query and of the item name. This interaction strategy effectively preserves the multi-dimensional text features extracted by the several modules.
Finally, the fusion layer fuses the representation vectors of the two texts through operations including elementwise subtraction and elementwise multiplication. The similarity prediction score is then obtained through several fully connected layers.
The construction process of the government knowledge graph is as follows.
A knowledge graph can be logically divided into a schema layer and a data layer. The schema layer is the core of the knowledge graph; it stores refined knowledge and defines and standardizes the data hierarchy and classes of the domain. An ontology library is typically used to manage the schema layer of the graph, normalizing the associations between entities and relations, and the types and attributes of entities, by means of the rules, axioms, and constraints in the ontology library.
The data layer is responsible for the concrete storage of the triples in the knowledge graph; it sits structurally beneath the schema layer and is the actual expression form of the whole knowledge graph. In the data layer, triples are stored in the graph database in the two forms <entity 1, relation, entity 2> and <entity, attribute, value>, as shown in FIG. 2.
The structural definition of the schema layer together with the processing of the data layer forms a huge entity-relation network, i.e., the knowledge graph. The structure of the schema layer and data layer is shown in FIG. 3.
Knowledge graph construction starts from acquiring the original knowledge data; a series of (automatic or semi-automatic) knowledge processing techniques are adopted to extract the required knowledge elements from the original data, which are then stored according to the definitions and constraints of the schema layer and the data layer. Knowledge graph construction methods are divided into bottom-up and top-down.
In the bottom-up construction mode, on the basis of a large number of existing texts and data tables, the most basic concepts, attributes, and relations are first defined from the business; triples are then extracted and added to the data layer; concepts with highly overlapping attributes/relations are clustered on the basis of the data to abstract the schema layer of the knowledge graph, which then guides subsequent knowledge processing.
The top-down construction method defines the schema layer from the domain concepts first, and then adds triples to the data layer under the constraints of the schema layer. The steps are as follows: 1) define the knowledge graph schema layer; 2) under the schema-layer constraints, extract triples from multi-source heterogeneous data sources; 3) eliminate redundant information in the triple data through knowledge fusion techniques; 4) store the cleaned triples in the knowledge graph, as sketched below. This method suits domain knowledge graphs: domain knowledge graphs place higher requirements on entity information, and domain concepts are concentrated, so the graph schema layer can be constructed comprehensively and efficiently.
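As a toy illustration of steps 1)-4), the sketch below stores triples in the two forms described above under a simple schema-layer constraint; all entity, type, and relation names here are hypothetical.

```python
# Minimal sketch of the data layer's triple storage with a toy schema-layer check.
from dataclasses import dataclass

@dataclass(frozen=True)
class Triple:
    head: str
    relation: str
    tail: str                # entity 2, or an attribute value

# Toy schema layer: allowed (head type, relation, tail type) signatures.
SCHEMA = {("ServiceItem", "acceptance_condition", "Text"),
          ("ServiceItem", "office", "Organization")}

def add_triple(store, types, t):
    # Steps 2)-3): keep only triples permitted by the schema layer and not
    # already present (a very simple form of redundancy elimination).
    sig = (types[t.head], t.relation, types[t.tail])
    if sig in SCHEMA and t not in store:
        store.add(t)

store = set()
types = {"基本医疗保险费用结算": "ServiceItem", "提交报销单据等材料": "Text"}
add_triple(store, types,
           Triple("基本医疗保险费用结算", "acceptance_condition", "提交报销单据等材料"))
```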
The invention discloses a government service item knowledge graph, which belongs to the field knowledge graph category.
A multi-granularity government service item recommendation method based on knowledge enhancement comprises the following steps:
S1: Acquiring a plurality of user-input query texts and to-be-matched government item names, and constructing a government knowledge graph;
S2: Constructing and training a KEMG model, wherein the KEMG model comprises an embedding layer fusing word- and character-granularity information, a CNN-enhanced coding layer, an interaction layer based on an attention mechanism, a fusion layer, and a plurality of fully connected layers that produce the similarity prediction score.
S2-1: embedding word vectors into all user input query texts and government matters names to be matched to obtain query word granularity vectors and government matters granularity vectors, splicing and fusing the query word granularity vectors and the query word granularity vectors to form query final text sentence representation vectors, and splicing and fusing the government matters granularity vectors and government matters granularity vectors to form government final text sentence representation vectors;
S2-2: the user input query and the government event name to be matched are mapped into the vector space after passing through the embedded layer of the model. At this time, the sentence features are only composed of word vectors fused with word granularity, and feature information such as context and keywords is absent, so that further encoding is required to extract sequence features and text keyword information.
The invention improves the feature extraction of context information on the basis of ESIM and introduces a CNN network for secondary coding to capture finer-grained text features; thanks to the characteristics of the CNN network, the overall parameter count is greatly reduced while the model effect is improved.
For the final query text sentence representation vector and the final government text sentence representation vector obtained in S2-1, ESIM is first adopted to encode the context sequence information, and a CNN network is then adopted to encode the encoded vectors a second time, obtaining the query keyword feature code values and the government keyword feature code values.
The convolutional neural network (CNN for short) performs secondary coding on top of the BiGRU coding, combining the advantages of the two so that the model can capture and retain finer-grained text features; meanwhile, because CNN network parameters are shared, the model's overall parameter count is further reduced.
S2-3: and using an improved graph attention mechanism to allocate different weights to each node in the government affair knowledge graph, and utilizing the weights to enhance the granularity vector of the government affair words. And interacting the characteristic code value of the query keyword and the characteristic code value of the government keyword by adopting a soft attention mechanism to obtain the weighted enhanced representation of each word, inputting the enhanced representation of each word into a pooling layer to obtain a characteristic vector obtained by a pooling layer for splicing, and respectively obtaining a final representation vector of the user query and a final representation vector of the item name. The user only inputs the query text, and the text of the item to be searched and the related knowledge text vector are stored at the server side.
After the model's embedding and encoding, the text features already carry certain semantic information, but interaction operations between the texts are still missing, as is the comparison of lexical and syntactic information between texts, which means a large amount of semantic information is lost. On the other hand, the semantic information acquired so far concerns only the query and the names of the items to be retrieved; the model alone can hardly achieve deeper semantic mining of the government domain.
Fusing the government knowledge graph can help the model better understand the deep links between user-input queries and government service items. For example, for the user-input query "What materials are needed to reimburse surgical fees?", the corresponding item is "basic medical insurance expense settlement"; the model looks up the related knowledge in the government knowledge graph centered on the service item entity and uses it to strengthen the model's ability to understand the government service item name text, as shown in FIG. 4.
In the government knowledge graph, the knowledge centered on the item includes, via the "acceptance condition" relation, text stating that the applicant submits reimbursement receipts and other materials to the social security agency; this acceptance knowledge contains keywords such as "reimbursement" that are highly similar to some words in the sentence input by the user. Fusing such knowledge into the to-be-matched item text can therefore enhance the representation of the item name and improve the model's overall matching capability.
Existing knowledge fusion methods view every entity node in the knowledge graph as equal and make no distinction of importance. Not every piece of government knowledge associated with the government item as the central node contributes to semantic understanding; on the contrary, excessive introduction of government information brings noise that degrades model performance. In the example above, the clause text linked by the "acceptance condition" relation contains the most relevant information, while the influence of other knowledge such as the "office" and "approver" should be weakened so as not to adversely affect the model.
To solve these problems, the invention proposes fusing government knowledge on the basis of the Graph Attention Mechanism. Unlike methods such as knowledge embedding and knowledge representation, the graph attention mechanism combines the attention mechanism with the graph structure: it can assign weights to the other nodes directly connected to a node, giving more weight to the more relevant nodes so as to strengthen the representation of the central node.
S2-4: the fusion layer realizes multi-feature cross fusion through vector alignment subtraction and vector alignment multiplication to obtain the matching degree between query-item names;
S2-5: The obtained matching degree between query and item names is directly input into a multi-layer fully connected neural network to predict matching scores; the government item names are arranged in descending order of predicted matching score and output accordingly.
S3: For a new user, the user vector is obtained using S1 and input into the optimal KEMG model; the optimal KEMG model calculates the degree of association between the new user and all service items, arranges them in descending order of association value, and outputs the corresponding TOP-k service item sequence.
Specifically, the process by which S2-1 obtains the word-granularity vectors is as follows:
the Embedding Layer (Embedding Layer) aims to map the text subjected to word segmentation from a high-dimensional space to a low-dimensional space, so that feature augmentation is carried out on the text to participate in vector operation of the neural network. Word Embedding (Word Embedding) is a common method used by the Embedding layer, and sentence vectors can be characterized to a large extent in a short text task through Word vector combination.
Word2vec is adopted to embed Word vectors for the user input query text and the government affair item text to be matched, as shown in equations (1) and (2).
Wherein Q is w And S is w A vector matrix formed by Word vector combination after Word2vec is respectively used for representing the Query (Query) input by a user and the Name (Service Name) of the government Service item to be matched,represents a w-dimensional vector obtained by embedding each word in two texts, d w Representing the dimensionality of the word vector. l (L) a And l b Respectively representing queries entered by usersAnd the length of the names of the government service matters to be matched, and the length of the two texts is uniformly defined as the length l for the convenience of operation.
Specifically, the process by which S2-1 obtains the character-granularity vectors is as follows:
in natural language processing tasks, word granularity features are typically better characterizations than word granularity features. However, simply using word embedding as a token vector matrix for sentences has problems under the task of the present invention. For Chinese text, word segmentation is needed to be performed for the next operation. Because of the specificity of Chinese, word segmentation results of different word segmentation tools (such as jieba, a Paddy Ding Jieniu word segmentation package and the like) may be inconsistent, and have differences from the original meaning of a text, so that the model effect is affected. Secondly, due to the fact that the policy changes in real time, new government service words can be generated, and therefore the model effect is poor. Finally, from the perspective of semantic information, chinese characters contain certain semantic information, and only word embedding is used for losing semantic information of word granularity.
The concept of a sub-character is defined in JWE, and the sub-character refers to a component in a single Chinese character, for example, a "photo" has three sub-components of a "day" and a "knife" and a "mouth" besides having a component of a "", and the JWE uses these components as the sub-character of the "photo". JWE predicts the target word using the average of the context word vectors, the average of the context character vectors, and the average of the context sub-character vectors based on the CBOW model to co-learn the embedding of the three. The model structure is shown in fig. 5.
Wherein w is i Is a target word, w i-1 And w i+1 Is a word located on the left and right sides of the target word in the text, c i-1 And c i+1 Representing Chinese characters in context, s i-1 Sum s i+1 Representing the sub-characters in the context, and s i Then the target word w is represented i Is a sub-character of (c).
Word embedding of text is achieved by adopting joint learning word embedding model JWE, and JWE aims at maximizing target word w i The log-likelihood sum of the three predicted conditional probabilities of (2) is shown in equation (3).
Wherein the method comprises the steps ofRespectively refer to the context words, the context Chinese characters, the context sub-characters, w i Is a target word, L (w i ) Representing the target word w i Log-likelihood sum of three predictive conditional probabilities of (c).
Prediction probability Defined by the Softmax function as shown in equation (4).
Wherein the method comprises the steps ofIs the target word w i The output vector of (2) is herein defined by the softmax formula,/-, and>representing the current target word->Any one of the target words in the sum of all the target words representing 1 to N.
Each word in the query input by the user and the name of the government service item to be matched is characterized by a word vector through JWE. By means of a JWE model, the invention realizes word granularity embedding of the names of the government service items to be searched and inquired. The word vector representations corresponding to the user-entered query and the government service item name to be matched are respectively marked as Q j And S is j As shown in equations (5) and (6).
Wherein,word vector representation of i, j-th Chinese characters in inquiry input by user and to-be-matched government service item name respectively, d c Representing the dimension of the word vector. z q ,z s The word numbers of two texts respectively representing the query input by the user and the names of the government service matters to be matched.
Specifically, the process in S2-1 of splicing and fusing the word-granularity and character-granularity vectors to form the final embedded representation of the text pair is as follows:
after the word granularity vector is obtained, the word granularity representation and the word granularity representation are required to be spliced and fused to form a final text pair embedded representation.
One word in Chinese text can be regarded as being composed of a plurality of words, so that direct splicing of vectors with two granularities can lead to the fact that the fused vectors cannot show the word-word containing relationship and semantic information is lost.
And adopting a pooling strategy to fuse feature vectors of the granularity of words, specifically adopting maximum pooling or average pooling to process the word vectors and then splicing. The vector fusion process using a pooling strategy is shown below, taking the word "transacted" in the user query q as an example, as shown in FIG. 6.
The Word "transacted" is characterized by Word2vec as a Word vector. Splitting the handling into handling and processing words, obtaining respective word vectors through a JWE model, and performing maximum pooling or average pooling operation after splicing the word vectors to obtain new combined vectors, such asAs shown.
Wherein c j The j-th word representing the composition word is subjected to JWE to obtain a word granularity vector, and w represents the number of words constituting the composition word.And the feature vectors of all word granularity vectors which form the ith word composition in the text are subjected to pooling operation. P represents an average pooling operation or a maximum pooling operation. Finally, the word granularity vector and the word granularity combination vector of the words are spliced according to the columns to obtain a word handling final new combination vector. / >
The new combined vector $Q_i$ for the $i$-th word in the user-input query is calculated as shown in equation (7):

$$Q_i=\left[q_i^{w};P\left(\{c_{i1},\dots,c_{iw}\}\right)\right] \qquad (7)$$

where $P(\{c_{i1},\dots,c_{iw}\})$ denotes the feature vector obtained after the pooling operation over all character-granularity vectors composing the $i$-th word of the text.

The final query text sentence representation vector $Q$ is calculated as shown in equation (8):

$$Q=(Q_1,\dots,Q_l) \qquad (8)$$

where $Q_i\in\mathbb{R}^{d}$; the word-vector embedding dimension with fused character granularity is defined by the model as $d=d_w+d_c$.
The final government text sentence representation vector $S$ is obtained by the same method used to obtain $Q$.
The embedding layer of the model is thus constructed: Word2vec yields the word-vector matrix of the segmented text, and the JWE character-granularity embedding model yields the character vector of each character in the text. A pooling strategy processes the character vectors and fuses the two vector matrices into the final word embedding matrix, and the new word embeddings of the text yield the representation vector matrix of the sentence. Taking a user-input query about transacting school enrollment for children from abroad as an example, the embedded vector transformation process is shown in FIG. 7.
Specifically, the steps by which S2-2 obtains the keyword feature codes are as follows:
the ESIM is adopted to encode the context sequence information of the final text sentence characterization vector, and BiLSTM is adopted in the ESIM to acquire the text sequence characteristics, and the calculation process of the encoded characterization vector is shown in formulas (9) and (10).
Wherein,respectively representing new code values obtained in the ith time step and the jth time step after the user inputs the query text and the government affair item name to be matched pass through the BiGRU, d g Represents the hidden layer dimension set in biglu.
The CNN network then performs the second encoding, carrying out convolution operations with kernels of sizes 2, 3, and 4 respectively; the detailed process is shown in FIG. 8.

The left side of the figure shows the vector matrix of the query; the model treats this matrix as the raw pixels of an image and extracts multi-level feature information from the text by convolution. The matrix is input into one-dimensional convolution layers with kernel sizes 2, 3, and 4 respectively; each kernel size has two filters, i.e., 2 channels, and a total of 6 feature maps are finally output.
The process by which a convolution window extracts text features is as follows. First, by equation (11):

$$\bar{q}_{i:i+w-1}=\bar{q}_i\oplus\bar{q}_{i+1}\oplus\dots\oplus\bar{q}_{i+w-1} \qquad (11)$$

where $\bar{q}_{i:i+w-1}$ denotes the concatenation of the word vectors $\bar{q}_i$ through $\bar{q}_{i+w-1}$ and $\oplus$ denotes the splicing operation. In the convolution operation, the convolution kernel $W_c$ slides over the text in the form of a window of size $w$ to extract new features, as shown in equation (12):

$$C_i=f\left(W_c\cdot\bar{q}_{i:i+w-1}+b\right) \qquad (12)$$

where $C_i$ denotes the $i$-th feature extracted by the convolution kernel during the convolution process. A special padding mark is also set for the sentence text, because the window may fall outside the sentence boundary when it slides near the boundary; out-of-range input vectors ($i<1$ or $i>l$) are thus treated as zero vectors. $\bar{q}_{i:i+w-1}$ denotes the matrix obtained by splicing $\bar{q}_i,\bar{q}_{i+1},\dots,\bar{q}_{i+w-1}$, and $b$ denotes the bias in the convolutional network.
Applying a convolution kernel of size $w$ to the sentence extracts the feature map over every possible hidden state, as shown in equation (13):

$$q=\left[q_1,q_2,\dots,q_{l-w+1}\right]^{\top} \qquad (13)$$

The resulting 6 feature maps are input into a max pooling layer to obtain fixed-length feature vector representations:

$$Q_i=P\left(\{q_{i1},q_{i2},\dots,q_{i(l-w+1)}\}\right) \qquad (14)$$

where $Q_i$ denotes the feature vector of the $i$-th feature map after the max pooling operation, $q_{i1}$ denotes the 1st feature of the $i$-th feature map, and $q_{i(l-w+1)}$ denotes the $(l-w+1)$-th feature of the $i$-th feature map. A feature map has $(l-w+1)$ features, where $w$ is the window size of the convolution kernel and $l$ is the total number of tokens after text segmentation.

Finally, the pooled feature vectors are spliced together by equation (15) to form the final query keyword feature code $\hat{Q}$:

$$\hat{Q}=\left[Q_1;Q_2;\dots;Q_6\right] \qquad (15)$$

The government keyword feature code $\hat{S}$ is obtained by the same method used to obtain $\hat{Q}$.

Two types of text coding feature values are thus obtained through the enhanced coding layer: the sequence code values $\bar{q}$ and $\bar{s}$ obtained through the BiGRU, and the keyword feature code values $\hat{Q}$ and $\hat{S}$ obtained by the secondary coding with the CNN.
Specifically, the process by which S2-3 obtains the final user query representation vector and the final item name representation vector is as follows:
The SIF (Smooth Inverse Frequency) sentence vector method and an improved graph attention method are presented. To ensure that features are preserved to the greatest extent, the SIF sentence vector generation method is adopted.
The SIF sentence vector generation method comprises the following steps:
1) Obtaining a preliminary sentence vector: traversing all sentences of the corpus, and calculating a preliminary sentence vector v of the current sentence s through a formula (16) s
Where |s| represents the number of words of sentence s, v w The word vector representing the word w in the sentence, p (w) is the word frequency probability of the word w in the corpus, and a is an adjustable parameter. The core idea of the formula keeps the idea of weighted average, and the weight calculation uses brand new word frequency probability plus an adjustable parameter.
2) Perform principal component analysis on all preliminary sentence vectors: compute the first principal component u of the set of preliminary sentence vectors, called the common discourse vector (Common Discourse Vector). It represents the "meaning" shared by the words of a sentence and can be understood as the core "component" of the sentence, which represents the sentence's meaning to some extent. u is the first principal vector (the first principal component in PCA) of the matrix formed by the preliminary sentence vectors.
3) Obtain the target sentence vector: apply equation (17) to the preliminary sentence vector to obtain the target sentence vector. Intuitively, deleting the "common portion" u from each sentence vector retains the characteristics that each sentence vector possesses.
v_s^SIF = v_s − u u^T v_s  (17)
where v_s^SIF denotes the sentence-vector value after SIF embedding, and s and w denote a sentence and each of its words, respectively.
Equation (18) denotes applying this procedure to both texts: the SIF-embedded sentence vector v_Q^SIF of the user's query text and the SIF-embedded sentence vector v_S^SIF of the government item name to be matched are both computed as in (16)-(17).
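As an illustrative aid (not part of the disclosure's notation), a minimal numpy sketch of the SIF computation of equations (16)-(17), assuming word vectors and corpus word-frequency probabilities are supplied as dictionaries:

import numpy as np

def sif_embed(sentences, word_vecs, p, a=1e-3):
    # sentences: list of word lists; word_vecs: {word: vector};
    # p: {word: corpus word-frequency probability}; a: adjustable parameter.
    V = np.stack([
        np.mean([a / (a + p[w]) * word_vecs[w] for w in s], axis=0)  # eq. (16)
        for s in sentences
    ])
    u = np.linalg.svd(V, full_matrices=False)[2][0]  # first principal component
    return V - np.outer(V @ u, u)                    # eq. (17), applied row-wise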
On the other hand, when fusing government knowledge, the role of the knowledge is to bring item names closer to queries in similarity. Under the original graph-attention formula, however, the model weights each knowledge node by the relevance between the item-name text and its associated knowledge, so the item name is used to further characterize itself, which is contradictory. To address this unreasonable choice of the graph-attention center node, the invention proposes an improved graph-attention formula that uses the query in place of the item name when computing attention relevance, capturing more accurately the influence of government knowledge on the item.
The improved graph-attention formula is shown in equation (19):

v_S^enh = Σ_{k∈N_S} [exp(v_Q^SIF · v_k) / Σ_{k'∈N_S} exp(v_Q^SIF · v_k')] · v_k  (19)

where v_S^enh denotes the government item-name sentence-vector representation enhanced by fusing the government knowledge graph, N_S denotes all government knowledge nodes directly associated with a government item (including the item itself), and v_k denotes the sentence-vector value obtained after SIF embedding of the related government knowledge. Taking the query "which materials are needed to reimburse the surgical fee" as an example, FIG. 9 illustrates how the model fuses service-item knowledge with item names.
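A minimal sketch of the improved graph attention of equation (19) follows; the dot-product scoring and softmax normalization are one plausible reading of the formula, and the names are illustrative:

import numpy as np

def knowledge_enhance(v_query, knowledge_vecs):
    # v_query: (d,) SIF vector of the user query;
    # knowledge_vecs: (n, d) SIF vectors of the item name and its knowledge nodes.
    scores = knowledge_vecs @ v_query        # query-centered relevance (eq. (19))
    alpha = np.exp(scores - scores.max())
    alpha /= alpha.sum()                     # attention weights over nodes
    return alpha @ knowledge_vecs            # enhanced item-name vector v_S^enh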
The item-name characterization is thus enhanced by fusing government knowledge through graph attention, and soft attention is adopted for ESIM-style interaction, further strengthening the model's item-retrieval capability.
The soft-attention mechanism lets the query keyword feature encoding values interact with the government keyword feature encoding values:
the query/item-name interaction operations are implemented using soft attention. Soft attention emphasizes that, although different words carry different weights, every word is taken into account, with highly relevant words assigned greater weight; the output is a weighted average of all the input information. In text-matching tasks a word is usually associated with multiple words in the opposing sentence, and different associations affect the word itself to different extents, which suits a soft-attention implementation very well.
In the soft-attention interaction step, the model performs the attention operation on the encoding values acquired in the coding layer. First, a relevance score is computed, as shown in equation (20):

e_ij = q̄_i · s̄_j  (20)
where e_ij denotes the relevance score between the i-th token vector q̄_i in the final query sentence representation and the j-th token vector s̄_j in the final item-name sentence representation, obtained by dot product. The greater the correlation between two terms, the larger the value produced by equation (20); FIG. 10 shows an example of a soft-attention operation based on equation (20).
The text-interaction matrix in the figure shows the details of the interaction between the query "the constructor certificate is lost, what should I do" and the item name "reissue of a lost second-level constructor registration certificate". The color shade in each cell intuitively reflects the correlation between two words, and the number gives the relevance score. As the example shows, the pairs "lost"-"lost" and "what to do"-"reissue" have high relevance scores, indicating high similarity; in contrast, pairs such as "lost"-"registration" and "how"-"certificate" have low relevance scores, meaning correspondingly low similarity.
The relevance scores are normalized and converted into weight coefficients for the subsequent weighted characterization, as shown in equation (21):

q̃_i = Σ_{j=1}^{l_b} [exp(e_ij) / Σ_{k=1}^{l_b} exp(e_ik)] · s̄_j  (21)

where q̃_i denotes the new characterization of the i-th word of the user's query text: the weighted sum, after attention-coefficient assignment, of the i-th word in the final query sentence representation against every word in the final government sentence representation. e_ik denotes the relevance score between the i-th query word and any (k-th) word of the item name, so q̃_i is a new representation of the i-th word weighted by all words in the item name.
For the j-th word in the item name, s̃_j is computed analogously, as shown in equation (22):

s̃_j = Σ_{i=1}^{l_a} [exp(e_ij) / Σ_{k=1}^{l_a} exp(e_kj)] · q̄_i  (22)

where s̃_j denotes the new characterization of the j-th word of the government item name to be matched: the weighted sum, after attention-coefficient assignment, of the j-th word in the final government sentence representation against every word in the final query sentence representation, and e_kj denotes the relevance score between any (k-th) query word and the j-th item-name word.
After alignment through the soft-attention mechanism, the interacted new characterization vectors q̃ and s̃ of the user's query text and of the government item name to be matched are obtained. A pooling strategy then reduces the dimensionality of q̃ and s̃ for subsequent fusion with other features, using average pooling combined with max pooling, as shown in equations (23) and (24):

v_q,ave = (1/l_a) Σ_i q̃_i,  v_q,max = max_i q̃_i  (23)
v_s,ave = (1/l_b) Σ_j s̃_j,  v_s,max = max_j s̃_j  (24)

where v_q,ave, v_q,max, v_s,ave, and v_s,max denote the four vectors obtained by average pooling and max pooling of the user's query text and of the item name to be retrieved, respectively.
They are then concatenated to obtain the characterization vectors produced by the soft-attention interaction of the user's query text and the government item name to be matched:

v_q = [v_q,ave; v_q,max]  (25)
v_s = [v_s,ave; v_s,max]  (26)

where v_q and v_s denote, respectively, the final user-query characterization vector and the final item-name characterization vector obtained after the soft-attention interaction, with v_q, v_s ∈ R^{4d_g}; d_g denotes the hidden-layer dimension set in the GRU.
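For illustration, a numpy sketch of the soft-attention interaction and pooling of equations (20)-(26), with illustrative names and assuming the BiGRU outputs are given as matrices:

import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def soft_attention_interact(Q_bar, S_bar):
    # Q_bar: (l_a, 2*d_g) query token encodings; S_bar: (l_b, 2*d_g) item-name encodings.
    E = Q_bar @ S_bar.T                      # relevance scores e_ij (eq. (20))
    Q_tilde = softmax(E, axis=1) @ S_bar     # each query word re-expressed (eq. (21))
    S_tilde = softmax(E, axis=0).T @ Q_bar   # each item-name word re-expressed (eq. (22))
    v_q = np.concatenate([Q_tilde.mean(0), Q_tilde.max(0)])  # eqs. (23), (25)
    v_s = np.concatenate([S_tilde.mean(0), S_tilde.max(0)])  # eqs. (24), (26)
    return v_q, v_s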
Specifically, the steps for obtaining the matching degree between query and item names in S2-4 are as follows:
before predicting query-item name matching scores, in order to more effectively combine the characteristics acquired by each module of the model, the invention designs a fusion layer to fuse the obtained CNN secondary coding value, the name sentence vector enhanced by government knowledge (SIF sentence vector for query) and the sequence characteristic value characterized by soft attention interaction. And realizing multi-feature cross fusion through vector alignment subtraction and vector alignment multiplication. Splicing three characteristics of a user input query text and a government affair item name to be matched respectively [ namely, fusing a query keyword characteristic coding value of the user input query text, a sentence vector value of the user input query text after SIF embedding and a final characterization vector of the user query; and finally, carrying out vector fusion on the government key word characteristic coding value of the government item name to be matched, the sentence vector value of the government item name to be matched after SIF embedding, and the item name. Stored in a sequence:
Element-wise subtraction and element-wise multiplication are applied to the two sequences, and the results are concatenated with the original sequence values:
m_out = [m_Q; m_S; m_Q − m_S; m_Q ⊙ m_S]  (29)
Through the fusion layer, the matching degree between query and item name is represented by the vector m_out.
Specifically, the matching score in S2-5 is calculated as shown in equation (30):
m_score = F_s(W · m_out + b)  (30)
where W and b denote the parameters of the multi-layer fully connected layers and F_s(·) is the Sigmoid activation function.
The fusion layer cross-fuses the multi-granularity features of the text, further improving the semantic-matching capability of the model.
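A minimal sketch of the fusion layer and score head of equations (29)-(30); the ReLU hidden activation and the weight shapes are assumptions consistent with the two-layer fully connected design described in the experimental setup:

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def match_score(m_Q, m_S, W1, b1, W2, b2):
    # cross fusion: concatenation, element-wise difference and product (eq. (29))
    m_out = np.concatenate([m_Q, m_S, m_Q - m_S, m_Q * m_S])
    h = np.maximum(0.0, W1 @ m_out + b1)     # first fully connected layer
    return sigmoid(W2 @ h + b2)              # matching score (eq. (30))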
Specifically, the invention introduces a Learning-to-Rank (LTR) training scheme to optimize the model and improve the retrieval capability of the government service item retrieval model. Rank learning applies machine-learning methods to ranking tasks; its primary goal is to learn a ranking function that maps query-document pairs to scores so that relevant documents are ranked in front and irrelevant documents behind.
The invention proposes a ServiceRank loss function based on a Pairwise design. When one query corresponds to many items of comparable relevance, randomly selecting positive and negative samples introduces a certain bias, so the loss must ensure that the relevance gap between positive and negative samples is large enough.
Margin Ranking Loss introduces a minimum relevance gap (Margin) between two items for each sample pair when computing the loss. For each sample pair (x_j, x_k), the loss function is as follows:

l(x_j, x_k) = max(0, m − (s(x_j) − s(x_k)))  (31)

where m denotes the Margin parameter and s(x_j) and s(x_k) are the relevance scores of samples x_j and x_k, respectively. If x_j is more relevant than x_k, then s(x_j) > s(x_k), and vice versa. If the relevance difference between the two items reaches m, the loss is 0; otherwise the loss is m minus the relevance difference between the two items.
This approach avoids the problems caused by randomly selecting positive and negative samples and encourages the model to learn the relevance gap between documents. Moreover, the added computation is very small and a non-convex loss function is avoided, so the overall efficiency is better than that of FRank.
On the other hand, to address the uneven distribution of the number of items per query, the invention builds on IR-SVM and adapts it to the item-retrieval dataset: the number of items corresponding to each query is directly introduced as a weight coefficient to balance the different relevant terms, and combined with Margin Ranking Loss to give the ServiceRank loss function:

L_ServiceRank = (1/|D|) Σ_{i=1}^{|D|} (1/s_i) Σ_{x_j, x_k ∈ S_i} l(x_j, x_k)  (32)

where D denotes the complete dataset consisting of user-input texts and item-name texts to be retrieved (one sample is one user input and two name texts), |D| denotes the number of samples in the dataset, s_i denotes the number of government item names corresponding to the i-th user query text, S_i denotes the set of all government item names corresponding to the i-th user query text, and l(x_j, x_k) is the Margin Ranking Loss computed by (31).
Compared with the original method, this weighting introduces the coefficient based on s_i, which takes into account the number of items contained by each query and minimizes the impact on the model of the imbalance in the number of items per query.
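For illustration, a Python sketch of equations (31)-(32); grouping score pairs per query and using the per-query pair count as the balancing factor s_i is our reading of the weighting, not a verbatim specification:

def margin_ranking_loss(s_pos, s_neg, m=0.3):
    # zero loss once the relevance gap reaches the margin m (eq. (31))
    return max(0.0, m - (s_pos - s_neg))

def service_rank_loss(query_pairs, m=0.3):
    # query_pairs: one list per query of (s_pos, s_neg) score pairs
    total = 0.0
    for pairs in query_pairs:
        s_i = len(pairs)                     # items/pairs for this query (weight)
        total += sum(margin_ranking_loss(p, n, m) for p, n in pairs) / s_i
    return total / len(query_pairs)          # average over the dataset (eq. (32))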
Experimental design and results analysis
1. Experimental data set
To verify the effect of KEMG on the government service item retrieval task, the model must be applied to a dataset for experiments. Because no public government service item dataset for matching or retrieval is currently available in academia, the invention constructs a government service item retrieval dataset based on the "common questions" board data of the Chongqing government service platform.
The "common questions" board is located in the item detail page, each question is assigned to the corresponding item, and the invention refers to this as a relevance marking reference. The statistics after the original data are cleaned up by eliminating empty characters, empty fields and the like comprise 5158 question-answer plate service matters, and 19084 corresponding questions and answers. In the information retrieval data set in the ordering learning field, one query corresponds to n documents and is manually marked with multi-level relevance. Because the data such as the background click log of the government website cannot be obtained, and meanwhile, manual labeling of a large number of multi-stage labels is high in cost and temporarily cannot be realized, the invention selectively constructs the Pairwise format retrieval data set, and the specific steps are as follows:
(1) The Pairwise-format data contain four parts: query, service1, service2, and a binary relative-relevance label (1 means the relevance of the front document is greater than that of the rear document, and vice versa). The single-sample format is shown in Table 1.
Table 1 Dataset single-sample format
(2) The same question can exist under different items; for example, both "technician school deregistration" and "technician school establishment" contain the question "what business can a newly established technician school apply for?". For such cases the questions and their corresponding multi-item groups were extracted, totalling 403 groups with an average of 3.2 items per group. These special samples were labeled manually with three-level relevance labels: 0 (irrelevant), 1 (somewhat relevant), 2 (highly relevant); FIG. 11 shows a partially labeled sample. A script then builds Table-1-format samples according to relevance size, as sketched after step (3) below. A query group containing n items can construct up to C(n,2) samples; when two items carry the same relevance label, their relative order follows the original relative position.
(3) Apart from the case in (2), the usual question-item pair is a many-to-one relationship. Accordingly, the relevant positive instance in a data sample is the name of the item to which the question belongs. For negative sampling, instead of random sampling, the invention adopts a dedicated sampling method that exploits the subdirectory structure of government-website items (see FIG. 12).
Items under one large category sit under the same directory and have a certain similarity in their names; using them as negative examples when constructing samples can improve the model's generalization ability. A check is added here: a question-item pair that conflicts with (2) is discarded, to avoid constructing duplicate samples. For a directory containing m sub-items, assuming each sub-item contains k questions and answers on average, k×(m−1)×m samples can be constructed. For items without subdirectories, random negative sampling supplies the negative instances for Pairwise-format samples, with the number of samples per query set to the finally counted average number of samples per query.
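As an illustrative sketch of the sample construction in step (2) (names are hypothetical, and the tie-handling rule follows our reading of the text), every pair of differently labeled items in a group yields one Table-1-style sample:

from itertools import combinations

def build_pairwise(query, labelled_items):
    # labelled_items: list of (item_name, relevance) with relevance in {0, 1, 2}
    samples = []
    for (n1, r1), (n2, r2) in combinations(labelled_items, 2):
        if r1 == r2:
            continue                          # equal labels: keep original order
        label = 1 if r1 > r2 else 0           # 1: the front item is more relevant
        samples.append((query, n1, n2, label))
    return samples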
Finally, after shuffling, the dataset is divided into a training set, a validation set, and a test set at a 6:2:2 ratio. The statistics of the constructed government service item retrieval dataset are shown in Tables 2 and 3:
table 2 data aggregate information
TABLE 3 dataset detail information
2. Model evaluation index
In the text-retrieval field, commonly used evaluation methods include MAP (Mean Average Precision), MRR (Mean Reciprocal Rank), and NDCG (Normalized Discounted Cumulative Gain). The invention adopts MAP, MRR, and NDCG (considering only the top 5 results) as the model retrieval performance evaluation indices.
①MAP
MAP is based on AP (Average Precision), an index used in the information-retrieval field to evaluate the ranking quality of the results of a single query. For a single user query sentence, AP averages the precision values at the positions of the relevant items in the ranking result, defined as follows:

AP = (1/R) Σ_{k=1}^{n} P(k) · rel_k
where R denotes the total number of relevant items in the search results, n denotes the number of returned results, P(k) denotes the precision of the first k search results, and rel_k denotes the relevance of the k-th search result (1 if relevant, 0 otherwise).
MAP considers the ranking quality of all search results and averages the AP values over all queries:

MAP = (1/|Q|) Σ_{i=1}^{|Q|} AP_i
where |q| represents the number of queries, i.e., the total number of queries that need to be evaluated.
②MRR
MRR measures the average reciprocal of the position at which the first relevant item appears in the ranked list returned by an information-retrieval system, defined as follows:

MRR = (1/|Q|) Σ_{i=1}^{|Q|} 1/rank_i
where |Q| denotes the total number of queries and rank_i denotes the position at which the first correct result appears for the i-th query. Unlike MAP, MRR focuses only on where the first correct result appears, regardless of the order and number of the remaining results, and is therefore better suited to tasks with a single correct answer.
③NDCG
MAP and MRR focus on the ranking of all relevant items to be retrieved; NDCG weights the top-ranked items more heavily. NDCG is based on the Discounted Cumulative Gain (DCG), an index of ranking-model performance, defined as follows:

DCG@k = Σ_{i=1}^{k} (2^{rel_i} - 1) / log_2(i + 1)

where rel_i denotes the relevance of the i-th item and k indicates that the DCG is computed over the first k items. The numerator 2^{rel_i} - 1 represents the item-relevance weight, and the denominator log_2(i+1) represents the ranking-position weight; an item ranked high with high relevance contributes more. On the basis of DCG, dividing by the ideal DCG value eliminates the effect of the number of items; NDCG is defined as follows:

NDCG@k = DCG@k / IDCG@k
where DCG@k denotes the DCG value of the ranking model over the first k items and IDCG@k denotes the DCG value over the first k items in the ideal ranking. NDCG ranges from 0 to 1.
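For reference, minimal Python implementations of the three indices over ranked relevance lists (illustrative only):

import math

def average_precision(rels):                  # rels: 0/1 list in ranked order
    R = sum(rels)
    hits, s = 0, 0.0
    for k, r in enumerate(rels, 1):
        if r:
            hits += 1
            s += hits / k                     # P(k) * rel_k
    return s / R if R else 0.0                # MAP averages this over queries

def mrr(ranked_rel_lists):                    # reciprocal rank of the first hit
    return sum(1.0 / (rl.index(1) + 1) for rl in ranked_rel_lists
               if 1 in rl) / len(ranked_rel_lists)

def ndcg_at_k(rels, k=5):                     # rels: graded relevance list
    dcg = sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(rels[:k]))
    ideal = sorted(rels, reverse=True)
    idcg = sum((2 ** r - 1) / math.log2(i + 2) for i, r in enumerate(ideal[:k]))
    return dcg / idcg if idcg else 0.0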
3. Experimental model
To verify the retrieval capability of KEMG, representative algorithm models from the text-matching and retrieval fields are selected for comparison experiments, based on the progress of related research. The comparison models fall into four categories: models based on word-frequency statistics, representation models, interaction models, and BERT-based models. Specifically:
based on word frequency statistics: TF-IDF, BM25
Representation model: ARC-I, DSSM, CDSSM
Interaction model: ARC-II, ESIM, biMPM
BERT-based: BERT-base, Sentence-BERT (S-BERT for short)
On the other hand, to verify the effectiveness of the ServiceRank designed for the dataset of the invention, representative algorithms are selected from the three classes of learning-to-rank algorithms and applied to all comparison models, mainly including:
Pointwise:Cross Entropy
Pairwise:RankNet、LambdaRank
Listwise:ApproxNDCG、ListRank
4. Experimental setup and result comparison analysis
(1) Experimental setup
The experimental parameters of the model are set as follows: sequence length 64; embedding-layer Word2vec static word-vector dimension 200; JWE character-granularity vector dimension 100; SIF-embedded sentence-vector dimension 300 for the query, item name, and related government knowledge; coding-layer BiLSTM units 128; CNN convolution kernel sizes 2, 3, and 4; two fully connected layers with 256 and 128 neurons respectively; dropout rate 0.3; Adam optimizer with learning rate 1×10^-4; mini-batch gradient descent to improve training speed and effect; 15 training epochs; batch size 64; the optimal checkpoint is selected by validation-set performance.
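Collected as a configuration dictionary for clarity (key names are illustrative, not tied to any specific framework):

config = {
    "seq_len": 64,
    "word2vec_dim": 200,        # static word-vector dimension
    "jwe_char_dim": 100,        # character-granularity vector dimension
    "sif_dim": 300,             # SIF sentence-vector dimension
    "bilstm_units": 128,
    "cnn_kernel_sizes": (2, 3, 4),
    "fc_units": (256, 128),
    "dropout": 0.3,
    "optimizer": "adam",
    "learning_rate": 1e-4,
    "epochs": 15,
    "batch_size": 64,
}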
The hyperparameter settings of the comparison models are all obtained through the dynamic parameter-tuning function built into MatchZoo.
(2) Model effect comparison results and analysis
Each experiment was run 5 times and the final results were averaged. Tables 4 and 5 show the MAP, MRR, and NDCG performance from the two perspectives of model comparison and loss-function comparison, respectively.
Table 4 Comparative experiment results I
Table 5 Comparative experiment results II
Table 6 Comparative experiment results III
FIG. 13(a) intuitively compares the retrieval effect of the models under ServiceRank (TF-IDF and BM25 trained under Pointwise). The experimental results show that, thanks to their strong text-feature extraction, the deep-learning methods generally retrieve better than the statistical algorithms; the proposed KEMG improves NDCG by 18.4% and 17.6% over TF-IDF and BM25, respectively. Among the deep models, the two-tower models perform no text interaction, so their fine-grained semantic feature extraction is weaker and their indices are lower than those of the interaction models, the BERT-class models, and KEMG. ESIM uses soft attention for word-level fine-grained text interaction and retrieves best among the interaction models. The proposed KEMG introduces several modules on top of ESIM to enhance the feature extraction of query-item-name pairs in government-domain text, improving MAP, MRR, and NDCG over ESIM by 1.4%, 1.5%, and 1.8%, respectively.
The model fuses government knowledge through graph attention, which adds complexity, but modules such as GRU and CNN reduce the overall parameter count and lower the training cost. The parameter counts, training times, and test times of the models are shown in Table 7, and the time-cost comparisons in FIG. 13(c) and (d). The data show that KEMG has slightly fewer parameters than ESIM and far fewer than the BERT models, while improving on BERT-base by 1.1%, 1.2%, and 0.7% on the three indices. S-BERT, which uses BERT as the encoder within a two-tower architecture, retrieves best overall, but KEMG has only about 0.7% of S-BERT's parameters, and its training and test times average 76.4% and 93.6% of S-BERT's. Since test time reflects online inference efficiency to some extent, this verifies the feasibility and effectiveness of deploying KEMG in industry and bringing the algorithm online.
FIG. 13(b) intuitively compares the retrieval performance of the deep model under different loss-function schemes (Pointwise uses the TF-IDF training values; the other loss schemes are all applied to KEMG). The results show that Pointwise training considers only the relevance of a single query-item-name pair and does not learn the relative order of multiple relevant items, so it performs worst on the retrieval indices that consider overall list ordering. RankNet and LambdaRank belong to the Pairwise class and learn from constructed partial-order document pairs, giving the model the ability to judge relative relevance order; they improve greatly over Pointwise, by 7.0%, 7.0%, 6.3% and 7.1%, 7.3%, 7.0%, respectively.
ApproxNDCG and ListRank belong to the Listwise class, which directly optimizes the whole returned list and is usually better than Pointwise and Pairwise. In the item-retrieval task, however, the two Listwise algorithms are overall lower than the Pairwise methods on the indices, although both still obtain good results; ListRank's indices differ from LambdaRank's by 1.1%, 0.8%, and 0.5%. Under the binary-label dataset, list ordering involves many random selections that cannot accurately reflect the true relevance ordering, verifying that the Listwise algorithms cannot fully exploit their whole-list ordering advantage, which affects retrieval capability.
The proposed ServiceRank loss function retains the binary label information within a single sample, introduces the hyperparameter m as a relevance-gap threshold, preserves the supervision information to the greatest extent, and introduces weights for balancing, reducing the risk of randomly selecting positive and negative examples in traditional Pairwise algorithms and alleviating the uneven sample distribution. The experimental results show that ServiceRank outperforms the Pairwise algorithms on all three indices, improving over LambdaRank by 2.2%, 2.1%, and 2.4%, verifying its effectiveness.
(3) Model convergence process
FIG. 14 shows how the three indices on the training and validation sets and the loss-function value of KEMG change over the training iterations; the abscissa is the number of epochs during model training and the ordinate is the loss value and the evaluation-index value.
(4) Ablation experimental results and analysis
The effectiveness of each module is verified through an ablation study (Ablation Study); all results are trained with ServiceRank. The experimental results are shown in Table 7.
Table 7 ablation experimental results
FIG. 15 compares the ablation results from the two perspectives of single modules and combined modules, showing more intuitively how much each module contributes to the model's overall retrieval capability.
Comparing the single-module results, the government-knowledge enhancement module improves the model's retrieval capability the most. The character-granularity embedding and CNN modules bring smaller gains on their own, with some indices slightly below ESIM; this is probably because, when a related module is introduced individually, the difference computation in the fusion layer adds noise to the features, which also indirectly illustrates the necessity and effectiveness of the feature-fusion-layer design.
Comparing the combined-module results, the three combinations that incorporate knowledge perform better. Together with the single-module analysis, this shows that the government-knowledge enhancement module has the greatest influence on the model's overall item-retrieval capability, demonstrating the effectiveness of the proposed knowledge-enhanced service-item retrieval.
(5) Hyperparameter experiments and analysis
To verify the influence of some hyperparameters on the model's retrieval capability, comparison experiments are performed on the SIF parameter a, the m in ServiceRank, the number of neurons in the fully connected layers, and the dropout rate of the last layer (only the NDCG index is shown because the trends of the indices are consistent).
The tuning range of a in the SIF sentence-vector generation method is 10^0 to 10^-5. The m in ServiceRank is tuned over [0.1, 0.6] with a step size of 0.1. The experimental results are shown in FIG. 16(a) and FIG. 16(b). Both parameters first raise and then gradually lower model performance, with the NDCG value peaking at a = 10^-3 and m = 0.3, respectively.
The number of neurons is tuned over [16, 256] with a step size of 16. The dropout rate is tuned over [0.1, 1] with a step size of 0.1. The experimental results are shown in FIG. 16(c) and FIG. 16(d). Model performance first rises with the number of neurons, plateaus and peaks over the 128-192 range, and then declines. The model index is highest at a dropout rate of 0.3, after which it shows an overall downward trend.
According to the analysis of FIG. 16, the three knowledge items contributing most to the fusion-enhanced item-name characterization are, in order, "acceptance conditions", "setting basis", and "application materials". Combined with an analysis of the knowledge content, the acceptance conditions and setting basis contain a large amount of descriptive information about government items. For example, the acceptance condition "... repayment of loan principal (including combined loans and interest-bearing loans) ... employee, spouse and ..." and the setting basis "... (1) purchase, construction, reconstruction or overhaul of owner-occupied housing by an employee in one of the following cases; (2) retired ..." carry keyword information about the associated applicants and supplementary descriptions of the item beyond its name, which users readily include in their input queries.
Knowledge such as "application materials", "transaction place", or "approver" contains information about the handling process. Users who already know something about the item to be handled may add handling details to their input query; for example, a query corresponding to the item "job-seeking and entrepreneurship subsidy" is "can I apply for the student entrepreneurship subsidy near Chunhua Dadao?", where the keyword "Chunhua Dadao" contained in the associated transaction place "... Chunhua Dadao, Xiantao Street ..., Chongqing" associates with the query.
The service-item knowledge weight distribution thus not only shows how much different kinds of item information contribute to the enhanced item characterization, but also verifies the rationality of fusing government knowledge to enhance the model's item-retrieval capability.
Finally, it is noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made thereto without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered by the scope of the claims of the present invention.

Claims (9)

1. A multi-granularity government service item recommendation method based on knowledge enhancement, characterized by comprising the following steps:
s1: acquiring a plurality of user input query texts and to-be-matched government affair names, and constructing a government affair knowledge graph;
S2: constructing and training a KEMG model, wherein the KEMG model comprises an embedding layer fusing character-granularity information, a CNN-enhanced coding layer, an interaction layer based on an attention mechanism, a fusion layer, and a plurality of fully connected layers for obtaining similarity prediction scores;
S2-1: performing word-vector embedding on all user-input query texts and government item names to be matched to obtain query word-granularity and character-granularity vectors and government word-granularity and character-granularity vectors; splicing and fusing the query word-granularity vectors with the query character-granularity vectors to form the final query text sentence characterization vector, and splicing and fusing the government word-granularity vectors with the government character-granularity vectors to form the final government text sentence characterization vector;
S2-2: encoding the context-sequence information of the final query text sentence characterization vector and the final government text sentence characterization vector obtained in S2-1 using ESIM, and secondarily encoding the encoded vectors with a CNN network to obtain, correspondingly, the query keyword feature encoding values and the government keyword feature encoding values;
S2-3: using an improved graph-attention mechanism to assign a different weight to each node in the government knowledge graph and enhancing the government item-name vectors with these weights; adopting a soft-attention mechanism to let the query keyword feature encoding values interact with the government keyword feature encoding values, obtaining the weighted enhanced characterization of each word; feeding the enhanced characterizations of each word into a pooling layer and splicing the feature vectors to obtain, respectively, the final characterization vector of the user query and the final characterization vector of the item name;
S2-4: the fusion layer realizes multi-feature cross fusion through element-wise vector subtraction and element-wise vector multiplication to obtain the matching degree between query and item names;
S2-5: inputting the obtained query-item-name matching degree directly into a multi-layer fully connected neural network to predict matching scores, arranging the results in descending order of predicted matching score, and outputting the government item names corresponding to the predicted matching scores;
S3: for a new user, obtaining the user vector by S1 and inputting it into the optimal KEMG model; the optimal KEMG model calculates the degree of association between the new user and all service items, arranges them in descending order of association value, and outputs the TOP-k service-item sequence corresponding to the association values.
2. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 1, wherein: the process of obtaining the word-granularity vectors in S2-1 is as follows:
word2vec is adopted to embed Word vectors for the user input query text and the government matters text to be matched, as shown in equations (1) and (2):
wherein Q is w And S is w A vector matrix formed by Word vector combination after Word2vec is respectively used for representing the Query (Query) input by a user and the Name (Service Name) of the government Service item to be matched,represents a w-dimensional vector obtained by embedding each word in two texts, d w Representing the dimension of a word vector, l a And l b Representing the length of the query and the government service item name to be matched, which are input by the user, respectively.
3. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 2, wherein: the process of obtaining the character-granularity vectors in S2-1 is as follows:
Character embedding of the text is realized by the joint-learning word-embedding model JWE, as shown in equation (3):

L(w_i) = log P(w_i | c_i^w) + log P(w_i | c_i^c) + log P(w_i | c_i^s)  (3)

where c_i^w, c_i^c, and c_i^s denote the context words, context Chinese characters, and context sub-characters, respectively, w_i is the target word, and L(w_i) denotes the sum of the log-likelihoods of the three predictive conditional probabilities of the target word w_i;

each prediction probability P(w_i | c_i) is defined by the Softmax function, as shown in equation (4):

P(w_i | c_i) = exp(c_i^T · v̂_{w_i}) / Σ_{j=1}^{N} exp(c_i^T · v̂_{w_j})  (4)

where v̂_{w_i} is the output vector of the target word w_i, w_i denotes the current target word, and w_j denotes any one of the N target words in the summation from 1 to N;
each word in the user-input query and in the government service item name to be matched is characterized by character vectors through JWE; the character-vector representations corresponding to the two texts are denoted Q_j and S_j, respectively, as shown in equations (5) and (6):

Q_j = (q_1^c, q_2^c, ..., q_{z_q}^c), Q_j ∈ R^{z_q×d_c}  (5)
S_j = (s_1^c, s_2^c, ..., s_{z_s}^c), S_j ∈ R^{z_s×d_c}  (6)

where q_i^c and s_j^c denote the character-vector representations of the i-th and j-th Chinese characters in the user-input query and in the government service item name to be matched, respectively, d_c denotes the character-vector dimension, and z_q and z_s denote the numbers of characters of the two texts.
4. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 3, wherein: in S2-1, the process of splicing and fusing the word-granularity vectors and character-granularity vectors to form the final embedded text-pair characterizations is as follows:
the new combined vector Q_i for the i-th word in the user-input query is computed as shown in equation (7):

Q_i = q_i ⊕ P(q_{i,1}^c, ..., q_{i,n_i}^c)  (7)

where P(q_{i,1}^c, ..., q_{i,n_i}^c) denotes the feature vector obtained by pooling all character-granularity vectors composing the i-th word of the text;
the final query text sentence characterization vector Q is computed as shown in equation (8):

Q = (Q_1, ..., Q_l)  (8)

where Q ∈ R^{l×d}, and the word-embedding dimension of the model after fusing character granularity is defined as d = d_w + d_c;
the final government text sentence characterization vector S is obtained by the same method as Q.
5. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 4, wherein: the steps for obtaining the keyword feature codes in S2-2 are as follows:
ESIM is adopted to encode the context-sequence information of the final text sentence characterization vectors; the BiLSTM used in the original ESIM to acquire text-sequence features is here realized with a BiGRU, and the encoded characterization vectors are computed as shown in equations (9) and (10):

q̄_i = BiGRU(Q, i), i ∈ [1, ..., l_a]  (9)
s̄_j = BiGRU(S, j), j ∈ [1, ..., l_b]  (10)

where q̄_i and s̄_j denote the new encoded values obtained at the i-th and j-th time steps after the user-input query text and the government item name to be matched pass through the BiGRU, q̄_i, s̄_j ∈ R^{2d_g}, and d_g denotes the hidden-layer dimension set in the BiGRU;
a second encoding is performed by a CNN network using convolution kernels of sizes 2, 3, and 4. The convolution window extracts text features as follows: first, adjacent word vectors are concatenated through equation (11):

q_{i:i+w-1} = q_i ⊕ q_{i+1} ⊕ ... ⊕ q_{i+w-1}  (11)

where q_{i:i+w-1} denotes the concatenation of the word vectors q_i, q_{i+1}, ..., q_{i+w-1} and ⊕ denotes the concatenation operation; the convolution kernel K slides over the text as a window of size w to extract new features, as shown in equation (12):

C_i = f(K · q_{i:i+w-1} + b)  (12)

where C_i denotes the i-th feature extracted by the convolution kernel during the convolution process;

out-of-range input vectors (i < 1 or i > l) are treated as zero vectors; q_{i:i+w-1} is the matrix obtained by concatenating the word vectors, and b denotes the bias of the convolutional network;
applying a convolution kernel of size w to the sentence yields the feature map over every possible window position, as shown in equation (13):

q = [q_1, q_2, ..., q_{l-w+1}]^T  (13)

where q ∈ R^{l-w+1}; the six obtained feature maps are input into a max-pooling layer to obtain fixed-length feature-vector representations:
Q_i = P({q_i1, q_i2, ..., q_i(l-w+1)})  (14)
where Q_i denotes the feature vector of the i-th feature map after max pooling, q_i1 denotes the 1st feature of the i-th feature map, and q_i(l-w+1) denotes the (l-w+1)-th feature of the i-th feature map;
finally, the pooled feature vectors are concatenated through equation (15) to form the final query keyword feature code Q̂:

Q̂ = [Q_1; Q_2; ...; Q_6]  (15)

and the government keyword feature code Ŝ is obtained in the same way as Q̂.
6. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 5, wherein: the process of obtaining the final characterization vector of the user query and the final characterization vector of the item name in S2-3 is as follows:
The SIF sentence vector generation method comprises the following steps:
1) Obtain a preliminary sentence vector: traverse all sentences of the corpus and compute the preliminary sentence vector v_s of the current sentence s through equation (16):

v_s = (1/|s|) Σ_{w∈s} a/(a + p(w)) · v_w  (16)

where |s| denotes the number of words of sentence s, v_w denotes the word vector of word w in the sentence, p(w) is the word-frequency probability of word w in the whole corpus, and a is an adjustable parameter;
2) Perform principal component analysis on all preliminary sentence vectors: compute the first principal component u of the set of preliminary sentence vectors, called the common discourse vector; u is the first principal vector of the matrix formed by the preliminary sentence vectors;
3) Obtain the target sentence vector: apply equation (17) to the preliminary sentence vector to obtain the target sentence vector;

v_s^SIF = v_s − u u^T v_s  (17)

where v_s^SIF denotes the sentence-vector value after SIF embedding, and s and w denote a sentence and each of its words, respectively;
the SIF-embedded sentence vector v_Q^SIF of the user-input query text and the SIF-embedded sentence vector v_S^SIF of the government item name to be matched are obtained by applying equation (18);
the improved graph-attention formula is shown in equation (19):

v_S^enh = Σ_{k∈N_S} [exp(v_Q^SIF · v_k) / Σ_{k'∈N_S} exp(v_Q^SIF · v_k')] · v_k  (19)

where v_S^enh denotes the government item-name sentence-vector representation enhanced by fusing the government knowledge graph, N_S denotes all government knowledge nodes directly associated with a government item (including the item itself), and v_k denotes the sentence-vector value obtained after SIF embedding of the related government knowledge;
the soft-attention mechanism lets the query keyword feature encoding values interact with the government keyword feature encoding values:
first, a relevance score is computed, as shown in equation (20):

e_ij = q̄_i · s̄_j  (20)

where e_ij denotes the relevance score between the i-th token vector q̄_i in the final query sentence representation and the j-th token vector s̄_j in the final item-name sentence representation, obtained by dot product;
the relevance scores are normalized into weight coefficients for the subsequent weighted characterization, as shown in equation (21):

q̃_i = Σ_{j=1}^{l_b} [exp(e_ij) / Σ_{k=1}^{l_b} exp(e_ik)] · s̄_j  (21)

where q̃_i denotes the new characterization vector of the i-th word of the user-input query text, i.e. the weighted sum, after attention-coefficient assignment, of the i-th word in the final query sentence characterization vector against every word in the final government sentence characterization vector, and e_ik denotes the relevance score between the i-th query word and any (k-th) word in the final government text characterization vector;
for the j-th word in the item name, s̃_j is computed analogously, as shown in equation (22):

s̃_j = Σ_{i=1}^{l_a} [exp(e_ij) / Σ_{k=1}^{l_a} exp(e_kj)] · q̄_i  (22)

where s̃_j denotes the new characterization vector of the government item name to be matched, i.e. the weighted sum, after attention-coefficient assignment, of the j-th word in the final government sentence characterization vector against every word in the final query sentence characterization vector, and e_kj denotes the relevance score between any (k-th) query word and the j-th item-name word;
a pooling strategy is adopted to reduce the dimensionality of q̃ and s̃, using average pooling and max pooling, as computed in equations (23) and (24):

v_q,ave = (1/l_a) Σ_i q̃_i,  v_q,max = max_i q̃_i  (23)
v_s,ave = (1/l_b) Σ_j s̃_j,  v_s,max = max_j s̃_j  (24)

where v_q,ave, v_q,max, v_s,ave, and v_s,max denote the four vectors obtained by average pooling and max pooling of the user-input query text and of the item name to be retrieved, respectively;
they are then concatenated to obtain the characterization vectors produced by the soft-attention interaction of the user-input query text and the government item name to be matched:

v_q = [v_q,ave; v_q,max]  (25)
v_s = [v_s,ave; v_s,max]  (26)

where v_q and v_s denote, respectively, the final user-query characterization vector corresponding to the user-input query text and the final item-name characterization vector corresponding to the government item name to be matched, obtained after the soft-attention interaction, with v_q, v_s ∈ R^{4d_g}; d_g denotes the hidden-layer dimension set in the GRU.
7. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 6, wherein: the matching degree between query and item names in S2-4 is obtained as follows:
multi-feature cross fusion is realized through element-wise vector subtraction and element-wise vector multiplication; the three features of the user-input query text and of the government item name to be matched are each concatenated and stored as sequences:

m_Q = [Q̂; v_Q^SIF; v_q]  (27)
m_S = [Ŝ; v_S^enh; v_s]  (28)
element-wise subtraction and element-wise multiplication are applied to the two sequences, and the results are concatenated with the original sequence values:

m_out = [m_Q; m_S; m_Q − m_S; m_Q ⊙ m_S]  (29)

through the fusion layer, the matching degree between query and item name is represented by the vector m_out.
8. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 7, wherein: the matching score in S2-5 is calculated as shown in equation (30):
m_score = F_s(W · m_out + b)  (30)
where W and b denote the parameters of the multi-layer fully connected layers and F_s(·) is the Sigmoid activation function.
9. The knowledge-enhancement-based multi-granularity government service item recommendation method of claim 8, wherein:
Margin Ranking Loss introduces a minimum relevance gap between two items for each sample pair (x_j, x_k); the loss function is as follows:

l(x_j, x_k) = max(0, m − (s(x_j) − s(x_k)))  (31)

where m denotes the Margin parameter, and s(x_j) and s(x_k) denote the relevance scores of samples x_j and x_k, respectively;
the ServiceRank loss function is proposed:

L_ServiceRank = (1/|D|) Σ_{i=1}^{|D|} (1/s_i) Σ_{x_j, x_k ∈ S_i} l(x_j, x_k)  (32)

where D denotes the dataset consisting of user-input texts and item-name texts to be retrieved, |D| denotes the number of samples in the dataset, s_i denotes the number of government item names corresponding to the i-th user-input query text, S_i denotes the set of all government item names corresponding to the i-th user-input query text, and l(x_j, x_k) is the Margin Ranking Loss computed by (31).
CN202310582574.3A 2023-05-23 2023-05-23 Knowledge enhancement-based multi-granularity government service item recommendation method Pending CN117112794A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310582574.3A CN117112794A (en) 2023-05-23 2023-05-23 Knowledge enhancement-based multi-granularity government service item recommendation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310582574.3A CN117112794A (en) 2023-05-23 2023-05-23 Knowledge enhancement-based multi-granularity government service item recommendation method

Publications (1)

Publication Number Publication Date
CN117112794A true CN117112794A (en) 2023-11-24

Family

ID=88797241

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310582574.3A Pending CN117112794A (en) 2023-05-23 2023-05-23 Knowledge enhancement-based multi-granularity government service item recommendation method

Country Status (1)

Country Link
CN (1) CN117112794A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117435696A (en) * 2023-12-21 2024-01-23 数据空间研究院 Text data retrieval method and device, electronic equipment and storage medium


Similar Documents

Publication Publication Date Title
Bhuvaneshwari et al. Spam review detection using self attention based CNN and bi-directional LSTM
AU2021371022B2 (en) Systems and methods for the automatic classification of documents
Feng et al. Enhanced sentiment labeling and implicit aspect identification by integration of deep convolution neural network and sequential algorithm
Zhang et al. A multi-label classification method using a hierarchical and transparent representation for paper-reviewer recommendation
CN110674252A (en) High-precision semantic search system for judicial domain
CN113779264B (en) Transaction recommendation method based on patent supply and demand knowledge graph
Sharma et al. A survey of methods, datasets and evaluation metrics for visual question answering
CN107688870B (en) Text stream input-based hierarchical factor visualization analysis method and device for deep neural network
WO2023035330A1 (en) Long text event extraction method and apparatus, and computer device and storage medium
CN115982338B (en) Domain knowledge graph question-answering method and system based on query path sorting
CN118277538B (en) Legal intelligent question-answering method based on retrieval enhancement language model
CN114048305A (en) Plan recommendation method for administrative penalty documents based on graph convolution neural network
CN118410175A (en) Intelligent manufacturing capacity diagnosis method and device based on large language model and knowledge graph
Zadgaonkar et al. An Approach for analyzing unstructured text data using topic modeling techniques for efficient information extraction
CN117112794A (en) Knowledge enhancement-based multi-granularity government service item recommendation method
Skondras et al. Efficient Resume Classification through Rapid Dataset Creation Using ChatGPT
CN114491079A (en) Knowledge graph construction and query method, device, equipment and medium
CN111581365B (en) Predicate extraction method
Li et al. Approach of intelligence question-answering system based on physical fitness knowledge graph
CN114943216B (en) Case microblog attribute level view mining method based on graph attention network
CN116955818A (en) Recommendation system based on deep learning
CN117077680A (en) Question and answer intention recognition method and device
Lokman et al. A conceptual IR chatbot framework with automated keywords-based vector representation generation
CN114186068A (en) Audit system basis question-answering method based on multi-level attention network
CN113946665A (en) Knowledge base question-answering method for providing background information based on text

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination