
CN113569885A - Problem correlation detection method and server - Google Patents

Problem correlation detection method and server

Info

Publication number
CN113569885A
Authority
CN
China
Prior art keywords
similarity
matching
question
search
statement
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010357386.7A
Other languages
Chinese (zh)
Other versions
CN113569885B (en)
Inventor
王筱钊
江汇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010357386.7A
Publication of CN113569885A
Application granted
Publication of CN113569885B
Current legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/22 — Matching criteria, e.g. proximity measures
    • G06F 16/00 — Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 — Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 — Querying
    • G06F 16/3331 — Query processing
    • G06F 16/334 — Query execution
    • G06F 16/3344 — Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosed embodiments relate to a question correlation detection method, a search method, a server for a question answering service, a search server, a voice interaction method, an electronic device, and a computer-readable storage medium. The question correlation detection method comprises the following steps: performing similarity matching between a first question and the answer to a second question to obtain a first matching tensor representing the matching pattern between the first question and the answer to the second question; performing similarity matching between the second question and the answer to the second question to obtain a second matching tensor representing the matching pattern between the second question and the answer to the second question; and determining whether the first question and the second question are similar questions based on the degree of similarity between the first matching tensor and the second matching tensor.

Description

Problem correlation detection method and server
Technical Field
The present disclosure relates to computer technologies, and more particularly, to a question correlation detection method, a search method, a server for a question answering service, a search server, a voice interaction method, an electronic device, and a computer-readable storage medium.
Background
Community question answering (CQA) is a community-based question answering service: users can pose questions in a community and answer questions asked by themselves or by others, so that questions are asked and answered in a collaborative manner.
In order to increase the response speed, when a user poses a new question, the community question answering service provider searches the community's archived questions (questions that already have answers) for similar questions and returns their answers to the user as the response to the new question; this is known in the industry as similar question detection.
However, questions are usually short, and users usually phrase them in spoken natural language, so different questions can be worded very differently, which poses a great challenge for similar question detection.
Therefore, a new question correlation detection method that can effectively detect similar questions is needed.
Disclosure of Invention
The embodiments of the present disclosure provide a question correlation detection method that can effectively detect similar questions.
According to a first aspect of the embodiments of the present disclosure, there is provided a question correlation detection method, including:
performing similarity matching between a first question and the answer to a second question to obtain a first matching tensor representing the matching pattern between the first question and the answer to the second question;
performing similarity matching between the second question and the answer to the second question to obtain a second matching tensor representing the matching pattern between the second question and the answer to the second question;
determining whether the first question and the second question are similar questions based on the degree of similarity between the first matching tensor and the second matching tensor.
Optionally, the method further comprises: in the case that the first question and the second question are similar questions, sending the answer to the second question to the user terminal as the answer to the first question.
Optionally, the method further comprises: performing similarity matching between the first question and the second question to obtain a first similarity vector representing the degree of similarity between the first question and the second question. Determining whether the first question and the second question are similar questions based on the degree of similarity between the first matching tensor and the second matching tensor then comprises: calculating the degree of similarity between the first matching tensor and the second matching tensor to obtain a second similarity vector; aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector; and determining whether the first question and the second question are similar questions based on the aggregated similarity vector.
Optionally, aggregating the first similarity vector and the second similarity vector to obtain the aggregated similarity vector includes: aggregating the first similarity vector and the second similarity vector with a gating mechanism, with the first similarity vector as the primary variable and the second similarity vector as the secondary variable, to obtain the aggregated similarity vector.
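The gated aggregation described above can be sketched in NumPy as follows. This is an illustrative sketch, not the patented implementation: the sigmoid gate parameterisation with a learned matrix `W` and bias `b`, and the function name, are assumptions.

```python
import numpy as np

def gated_aggregate(s1: np.ndarray, s2: np.ndarray,
                    W: np.ndarray, b: np.ndarray) -> np.ndarray:
    """Aggregate the first similarity vector s1 (primary variable) and
    the second similarity vector s2 (secondary variable) with a sigmoid
    gate. W (shape (d, 2d)) and b (shape (d,)) would be learned
    parameters; this exact parameterisation is an assumption.
    """
    gate = 1.0 / (1.0 + np.exp(-(W @ np.concatenate([s1, s2]) + b)))
    # The gate decides, per dimension, how much of the primary signal
    # to keep versus the secondary signal.
    return gate * s1 + (1.0 - gate) * s2
```

With untrained (zero) parameters the gate is 0.5 everywhere, so the result is simply the mean of the two vectors; training would let the gate favour the primary variable where the question-question signal is more reliable.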
Optionally, before the first similarity vector and the second similarity vector are aggregated, the method further includes: compressing the second similarity vector so that its dimension is the same as that of the first similarity vector.
Optionally, the degree of similarity between the first matching tensor and the second matching tensor is calculated with a similarity function; the similarity function is any one of: a dot-product function, a cosine function, a distance function.
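A minimal sketch of the three similarity function options named above, applied to flattened matching tensors; the function names are illustrative, not from the patent:

```python
import numpy as np

def dot_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Dot-product similarity: large when the vectors point the same way.
    return float(np.dot(a, b))

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    # Cosine similarity: dot product normalised by vector lengths.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def negative_distance(a: np.ndarray, b: np.ndarray) -> float:
    # A Euclidean distance turned into a similarity: closer scores higher.
    return -float(np.linalg.norm(a - b))
```

Any of the three can play the role of the similarity function; they differ mainly in whether vector magnitude matters (dot product and distance) or only direction (cosine).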
Optionally, the similarity matching is performed using a text similarity matching algorithm based on BERT.
According to a second aspect of the embodiments of the present disclosure, there is provided a search method including:
performing similarity matching between a first search statement and the search result of a second search statement to obtain a first matching tensor representing the matching pattern between the first search statement and the search result of the second search statement;
performing similarity matching between the second search statement and the search result of the second search statement to obtain a second matching tensor representing the matching pattern between the second search statement and the search result of the second search statement;
determining whether the first search statement and the second search statement are similar search statements based on the degree of similarity between the first matching tensor and the second matching tensor.
Optionally, the method further comprises: in the case that the first search statement and the second search statement are similar search statements, sending the search result of the second search statement to the user terminal as the search result of the first search statement.
Optionally, the method further comprises: performing similarity matching between the first search statement and the second search statement to obtain a first similarity vector representing the degree of similarity between the first search statement and the second search statement. Determining whether the first search statement and the second search statement are similar search statements based on the degree of similarity between the first matching tensor and the second matching tensor then comprises: calculating the degree of similarity between the first matching tensor and the second matching tensor to obtain a second similarity vector; aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector; and determining whether the first search statement and the second search statement are similar search statements based on the aggregated similarity vector.
Optionally, aggregating the first similarity vector and the second similarity vector to obtain the aggregated similarity vector includes: aggregating the first similarity vector and the second similarity vector with a gating mechanism, with the first similarity vector as the primary variable and the second similarity vector as the secondary variable, to obtain the aggregated similarity vector.
Optionally, before the first similarity vector and the second similarity vector are aggregated, the method further includes: compressing the second similarity vector so that its dimension is the same as that of the first similarity vector.
Optionally, the degree of similarity between the first matching tensor and the second matching tensor is calculated with a similarity function; the similarity function is any one of: a dot-product function, a cosine function, a distance function.
Optionally, the similarity matching is performed using a text similarity matching algorithm based on BERT.
According to a third aspect of the embodiments of the present disclosure, there is provided a server for a question answering service, including a processor and a memory, where the memory stores computer instructions that, when executed by the processor, implement the question correlation detection method provided by the first aspect of the embodiments of the present disclosure.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a search server, including a processor and a memory, where the memory stores computer instructions that, when executed by the processor, implement the search method provided by the second aspect of the embodiments of the present disclosure.
According to a fifth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium on which computer-readable instructions are stored, which, when executed by a processor, implement the question correlation detection method provided by the first aspect of the embodiments of the present disclosure.
According to a sixth aspect of the embodiments of the present disclosure, there is provided a computer-readable storage medium on which computer-readable instructions are stored, which, when executed by a processor, implement the search method provided by the second aspect of the embodiments of the present disclosure.
According to a seventh aspect of the embodiments of the present disclosure, there is provided a voice interaction method, including:
acquiring a first statement submitted by a user by voice;
performing similarity matching between the first statement and the response result of a second statement to obtain a first matching tensor representing the matching pattern between the first statement and the response result of the second statement;
performing similarity matching between the second statement and the response result of the second statement to obtain a second matching tensor representing the matching pattern between the second statement and the response result of the second statement;
determining whether the first statement and the second statement are similar statements based on the degree of similarity between the first matching tensor and the second matching tensor;
in the case that the first statement and the second statement are similar statements, playing the response result of the second statement by voice.
According to an eighth aspect of the embodiments of the present disclosure, there is provided a voice interaction method, including:
acquiring a first statement submitted by a user by voice;
performing similarity matching between the first statement and the response result of a second statement to obtain a first matching tensor representing the matching pattern between the first statement and the response result of the second statement;
performing similarity matching between the second statement and the response result of the second statement to obtain a second matching tensor representing the matching pattern between the second statement and the response result of the second statement;
determining whether the first statement and the second statement are similar statements based on the degree of similarity between the first matching tensor and the second matching tensor;
in the case that the first statement and the second statement are similar statements, sending the response result of the second statement to the user terminal, for the user terminal to play it.
According to a ninth aspect of the embodiments of the present disclosure, there is provided an electronic device equipped with a voice assistant, including a microphone, a speaker, a processor, and a memory; the memory stores computer instructions that, when executed by the processor, implement the voice interaction method provided by the seventh aspect of the embodiments of the present disclosure.
The question correlation detection method provided by the embodiments of the present disclosure determines whether the first question and the second question are similar questions based on the similarity between the matching pattern of the first question with the answer to the second question and the matching pattern of the second question with the answer to the second question, and can thereby effectively detect similar questions.
Features of the embodiments of the present disclosure and their advantages will become apparent from the following detailed description of exemplary embodiments, which is to be read in connection with the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this disclosure, illustrate embodiments of the disclosure and together with the description, serve to explain the principles of the embodiments of the disclosure.
FIG. 1 is a block diagram of a community question-answering system provided by one embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a problem correlation detection model provided by one embodiment of the present disclosure;
FIG. 3 is a flow chart of a problem correlation detection method provided by one embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a problem correlation detection model provided by one embodiment of the present disclosure;
FIG. 5 is a flow chart of a problem correlation detection method provided by one embodiment of the present disclosure;
FIG. 6 is a block diagram of a server for a question-and-answer service provided by one embodiment of the present disclosure;
fig. 7 is a block diagram of a search server provided by one embodiment of the present disclosure.
Detailed Description
Various exemplary embodiments of the present disclosure will now be described in detail with reference to the accompanying drawings.
The following description of at least one exemplary embodiment is merely illustrative in nature and is in no way intended to limit the embodiments of the disclosure, their application, or uses.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, further discussion thereof is not required in subsequent figures.
< Community question-answering System >
Community question answering (CQA) is a community-based question answering service: users can pose questions in a community and answer questions asked by themselves or by others, so that questions are asked and answered in a collaborative manner.
Fig. 1 is a block diagram of a community question-answering system provided by an embodiment of the present disclosure. As shown in fig. 1, the community question-answering system includes a server 101 providing a community question-answering service and a plurality of user terminal devices 103, which can communicate with each other via a network 102.
The server 101 may be in the form of a blade server, a rack server, or the like, or may be a server cluster deployed in the cloud. In some embodiments, each server may include hardware, software, or embedded logic components or a combination of two or more such components for performing the appropriate functions supported or implemented by the server.
The configuration of the server 101 may include, but is not limited to: processor 1011, memory 1012, interface 1013, communication device 1014, input device 1015, output device 1016. The processor 1011 may include, but is not limited to, a central processing unit CPU, a microprocessor MCU, or the like. The memory 1012 may include, but is not limited to, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. Interface device 1013 may include, but is not limited to, a USB interface, a serial interface, a parallel interface, and the like. The communication device 1014 is capable of wired or wireless communication, for example, and may specifically include WiFi communication, bluetooth communication, 2G/3G/4G/5G communication, and the like. Input devices 1015 include, but are not limited to, a keyboard, a mouse, and the like. Output device 1016 includes, but is not limited to, a display screen or the like. The configuration of the server 101 may include only some of the above devices.
The terminal device 103 may be, for example, an electronic device installed with an intelligent operating system (e.g., android, IOS, Windows, Linux, etc.) including, but not limited to, a laptop, a desktop computer, a mobile phone, a tablet computer, etc. The configuration of the terminal device 103 includes, but is not limited to, a processor 1031, a memory 1032, an interface device 1033, a communication device 1034, a GPU (Graphics Processing Unit) 1035, a display device 1036, an input device 1037, a speaker 1038, a microphone 1039, and a camera 1030, and the like. The processor 1031 includes, but is not limited to, a central processing unit CPU, a microprocessor MCU, and the like. The memory 1032 includes, but is not limited to, a ROM (read only memory), a RAM (random access memory), a nonvolatile memory such as a hard disk, and the like. Interface device 1033 includes, but is not limited to, a USB interface, a serial interface, a parallel interface, and the like. The communication device 1034 is capable of wired or wireless communication, for example, and specifically may include WiFi communication, bluetooth communication, 2G/3G/4G/5G communication, and the like. The GPU 1035 is used to process the image. The display device 1036 includes, but is not limited to, a liquid crystal screen, a touch screen, and the like. Input devices 1037 include, but are not limited to, a keyboard, a mouse, a touch screen, and the like. The configuration of the terminal device 103 may include only some of the above-described apparatuses.
In one embodiment of the present disclosure, a user may access a community through the terminal device 103, submit questions to the community, and answer questions in the community. The server 101 maintains the community and provides the community question answering service. After receiving a question submitted by a user, the server 101 publishes it to the community, where that user and other users can answer it. After receiving a new question, the server 101 may also perform a similarity search over the community's archived questions, i.e., questions that already have answers. If an archived question similar to the new question is retrieved, the answer to that similar archived question is returned to the asker as the response to the new question, i.e., as the answer to the new question. The memory 1012 of the server 101 stores instructions that control the processor 1011 to operate so as to implement the question correlation detection method of any embodiment of the present disclosure.
The community question-answering system shown in FIG. 1 is illustrative only and is in no way meant to limit the embodiments of the present disclosure, their applications, or uses. It should be understood by those skilled in the art that although the foregoing describes multiple apparatuses of the server 101 and the terminal device 103 providing the community question answering service, the embodiments of the present disclosure may involve only some of them. Skilled persons can design the instructions according to the disclosed solutions of the embodiments of the present disclosure. How instructions control the operation of a processor is well known in the art and is not described in detail here.
< Question correlation detection method >
In order to increase the response speed, when a user poses a new question, the community question answering service provider searches the community's archived questions for similar questions and returns their answers to the user as the response to the new question; this is known in the industry as similar question detection.
However, questions are usually short, and users usually phrase them in spoken natural language, so different questions can be worded very differently, which poses a great challenge for similar question detection.
The inventors studied this and considered that, to mitigate the problem, a more complex text matching model could be built for similar question detection, but such a model would still suffer from the information sparsity of short texts. The inventors therefore propose introducing the answers to questions as supplementary information in the similar question detection process, to reduce the impact of the information sparsity of short question texts.
A first way to introduce answers is to concatenate an archived question with its answer and match the concatenated content against the new question. However, a question and its answer often do not simply follow the same distribution assumption; an answer may even be quite divergent and only weakly correlated with the question itself. As a result, the information in the answer is used inefficiently, and the bias of the archived answer may even introduce noise. This is illustrated with a specific example below.
"Qu: what is the garbage collection mechanism in JAVA? "
"Qe: should creation of objects in JAVA be avoided? "
"Ae: in fact, due to the memory management policy in JAVA, the created object can be regarded as a free operation in the JVM heap. Another cost associated with an object is that the object logs off. If the JVM heap is large enough, the object logoff process does not actually occur in current garbage collection algorithms, nor does it bring about any stuttering on the program. "
Qu represents a new question, Qe represents an archived question, and Ae represents an answer to the archived question Qe (referred to as an "archived answer").
It can be seen that the archived question Qe and the new question Qu are not actually similar.
However, the answer Ae to the archived question Qe covers not only content strongly associated with Qe ("creating objects") but also content only weakly associated with it ("garbage collection algorithms"). If the concatenation of Qe and its answer Ae is compared with the new question Qu, both contain the keyword "garbage collection", so Qe and Qu are wrongly judged to be similar questions, and the answer to Qe is wrongly returned to the user as the answer to Qu. That is, the answer to the archived question instead introduces noise, making the similarity detection result inaccurate.
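A toy sketch (not part of the patent) illustrates this noise effect: even a naive token-overlap score rises once the archived answer is concatenated, because the answer injects the "garbage collection" keywords. The `ae` string below is a hypothetical English paraphrase of the answer Ae.

```python
def jaccard(a: str, b: str) -> float:
    # Naive token-overlap similarity between two strings.
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb)

qu = "what is the garbage collection mechanism in java"       # new question Qu
qe = "should creation of objects in java be avoided"          # archived question Qe
ae = ("garbage collection algorithms in the jvm heap make "
      "object creation cheap")                                # paraphrase of Ae

plain = jaccard(qu, qe)               # question-vs-question overlap
spliced = jaccard(qu, qe + " " + ae)  # question vs concatenated question+answer
# The answer injects the "garbage collection" terms, so the spliced
# score is higher even though Qu and Qe ask different things.
assert spliced > plain
```

A matcher trained on such concatenations would therefore tend to over-score Qe against Qu, which is exactly the noise the description warns about.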
A second way to introduce answers is to perform a first round of similarity detection between the new question and the archived questions to obtain candidate similar questions, and then perform a second round of similarity detection between the new question and the answers of the candidate similar questions; if they are also similar, the candidate is determined to be a similar question of the new question.
The second way relies on a counterfactual reasoning assumption, namely: if two questions are different, the answer to one question cannot be used to answer the other; if the answer to one question can answer another question, the two questions are similar. Under this assumption, the criterion tends to be stricter: only when both the new question and the archived question, and their respective answers, are similar are the two questions considered similar. In some cases this is too strict, reducing the number of recalled questions and causing misses.
The question correlation detection method provided by the embodiments of the present disclosure differs from the two approaches above: it is based on a matching-over-matching scheme, that is, it matches question-answer matching patterns against each other.
< example one >
Referring to fig. 2 and fig. 3, a question correlation detection method provided by an embodiment of the present disclosure is described. The method comprises steps S202-S214.
S202, performing similarity matching between the first question and the answer to the second question to obtain a first matching tensor representing the matching pattern between the first question and the answer to the second question.
In one specific example, the first question may be a new question, i.e., a question without an answer, or a question that currently has only a few answers. The second question may be an archived question that already has an answer.
In an embodiment of the present disclosure, similarity matching is performed using a pre-trained language model to obtain the first matching tensor. BERT (Bidirectional Encoder Representations from Transformers) is a classic pre-trained NLP (Natural Language Processing) model that achieved the best results of its time on multiple natural language tasks; the model is built from stacked encoders. In the embodiments of the present disclosure, BERT is used as the pre-trained language model for similarity matching. That is, in step S202, similarity matching may be performed using a BERT-based text similarity matching algorithm, and the BERT network may use a stacked encoder structure.
As shown on the right side of fig. 2, the first question and the answer to the second question are input into the first text similarity matching network, which uses a BERT-based text similarity matching algorithm to obtain the matching pattern between the first question and the answer to the second question.
In a specific example, the first question and the answer to the second question are concatenated and input to the first text similarity matching network. The output of each encoder layer is divided into two parts according to which characters correspond to the first question and which correspond to the answer to the second question: the first part is that layer's vector representation of the first question, and the second part is that layer's vector representation of the answer to the second question. For each encoder layer, the inner product of its vector representation of the first question and its vector representation of the answer to the second question is computed, yielding the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to form the first matching tensor, which represents the matching pattern between the first question and the answer to the second question.
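The per-layer inner-product construction above can be sketched in NumPy as follows. The hidden states would in practice come from a BERT encoder (e.g. its per-layer outputs); the function name and shapes here are illustrative assumptions.

```python
import numpy as np

def matching_tensor(layer_outputs, len_q, len_a):
    """Build a matching tensor from the per-layer outputs of a stacked
    encoder run on the concatenated (question, answer) input.

    layer_outputs: list of arrays, one per encoder layer, each of shape
    (len_q + len_a, d). For each layer, the output is split into the
    question part and the answer part, and their inner product gives a
    (len_q, len_a) matching map; stacking the maps of all layers yields
    the (num_layers, len_q, len_a) matching tensor.
    """
    maps = []
    for h in layer_outputs:
        q_repr = h[:len_q]                # vectors for question tokens
        a_repr = h[len_q:len_q + len_a]   # vectors for answer tokens
        maps.append(q_repr @ a_repr.T)    # token-pair inner products
    return np.stack(maps)
```

Applying the same function to (second question, answer) hidden states would yield the second matching tensor of step S204.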
S204, performing similarity matching between the second question and the answer to the second question to obtain a second matching tensor representing the matching pattern between the second question and the answer to the second question.
In step S204, a BERT-based text similarity matching algorithm may likewise be used for similarity matching.
As shown on the right side of fig. 2, the second question and the answer to the second question are input into the second text similarity matching network, which uses a BERT-based text similarity matching algorithm to obtain the matching pattern between the second question and the answer to the second question.
In a specific example, the second question and the answer to the second question are concatenated and input to the second text similarity matching network. The output of each encoder layer is divided into two parts according to which characters correspond to the second question and which correspond to the answer to the second question: the first part is that layer's vector representation of the second question, and the second part is that layer's vector representation of the answer to the second question. For each encoder layer, the inner product of its vector representation of the second question and its vector representation of the answer to the second question is computed, yielding the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to form the second matching tensor, which represents the matching pattern between the second question and the answer to the second question.
S206, similarity matching is performed on the first question and the second question to obtain a first similarity vector representing the degree of similarity between the first question and the second question.
In step S206, similarity matching may be performed using a text similarity matching algorithm based on the BERT algorithm.
As shown on the left side of fig. 2, the first question and the second question are input into a third text similarity matching network. The third text similarity matching network detects the degree of similarity of the first question and the second question using a text similarity matching algorithm based on a BERT algorithm.
In a specific example, the first question and the second question are input into a third text similarity matching network, and a first similarity vector representing the similarity degree of the first question and the second question is obtained.
And S208, calculating the similarity degree of the first matching tensor and the second matching tensor to obtain a second similarity vector.
As shown on the right side of fig. 2, the pattern similarity calculation network calculates the degree of similarity between the first matching tensor and the second matching tensor using a similarity function, obtaining a second similarity vector that represents this degree of similarity. The similarity function may be a point-wise function, a cosine function, a distance function, or any other function that can calculate the similarity between vectors.
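As a hedged illustration of one of the listed options, the similarity function could be instantiated as a per-layer cosine similarity between the two matching tensors, yielding one similarity value per encoder layer:

```python
import numpy as np

def pattern_similarity(t1, t2):
    """Per-layer cosine similarity between two matching tensors.

    t1, t2: arrays of shape (num_layers, q_len, a_len), same shape.
    Returns a similarity vector with one entry per encoder layer.
    """
    f1 = t1.reshape(t1.shape[0], -1)   # flatten each layer's matching matrix
    f2 = t2.reshape(t2.shape[0], -1)
    num = (f1 * f2).sum(axis=1)
    den = np.linalg.norm(f1, axis=1) * np.linalg.norm(f2, axis=1)
    return num / den

t = np.ones((2, 3, 4))
print(pattern_similarity(t, t))  # [1. 1.] — identical tensors are maximally similar
```

A point-wise product or a distance function could be substituted here without changing the rest of the pipeline.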
As shown in the right side of fig. 2, after the second similarity vector is calculated, the second similarity vector may be further compressed in the vector compression network, so that the dimension of the second similarity vector is the same as the dimension of the first similarity vector, so as to facilitate subsequent aggregation of the first similarity vector and the second similarity vector.
S210, aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector.
As shown in fig. 2, in the aggregation network, the first similarity vector is used as a primary variable, the second similarity vector is used as a secondary variable, and the first similarity vector and the second similarity vector are aggregated by using a gating mechanism to obtain an aggregated similarity vector.
For example, the first similarity vector is given a larger weight, the second similarity vector is given a smaller weight, and the two are aggregated using a gating mechanism.
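One plausible form of such a gating mechanism (the patent does not fix the exact formula; the sigmoid gate below is an assumption, and the second similarity vector is taken as already compressed to the dimension of the first) is:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_aggregate(v1, v2, wg, bg):
    """Gated aggregation: v1 (question-question similarity) is the primary
    variable, v2 (matching-pattern similarity) the secondary one.
    wg and bg would be learned; a positive bias bg pushes the gate toward
    v1, i.e., gives the first similarity vector the larger weight."""
    g = sigmoid(wg @ np.concatenate([v1, v2]) + bg)  # per-dimension gate in (0, 1)
    return g * v1 + (1.0 - g) * v2                   # convex combination

rng = np.random.default_rng(2)
v1, v2 = rng.normal(size=4), rng.normal(size=4)
wg, bg = rng.normal(size=(4, 8)), np.zeros(4)
out = gated_aggregate(v1, v2, wg, bg)
print(out.shape)  # (4,)
```

Because the gate lies in (0, 1), each component of the aggregated vector stays between the corresponding components of the two inputs.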
S212, whether the first question and the second question are similar questions is determined based on the aggregation similarity vector.
In step S212, the aggregated similarity vector may be perceived using a multi-layer perceptron network to obtain similarity scores for the first question and the second question.
And determining whether the first question and the second question are similar questions according to the similarity scores of the first question and the second question. In a specific example, if the similarity score of the first question and the second question is greater than or equal to a preset threshold, it is determined that the first question and the second question are similar questions. And if the similarity scores of the first question and the second question are smaller than a preset threshold value, determining that the first question and the second question are not similar questions.
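The multi-layer perceptron scoring and threshold decision of steps S212 can be sketched as follows; the layer sizes, the ReLU/sigmoid choices, and the 0.5 threshold are assumed example values, not taken from the patent:

```python
import numpy as np

def mlp_score(v, w1, b1, w2, b2):
    """Two-layer perceptron mapping the aggregated similarity vector to a
    similarity score in (0, 1); all weights would be learned end to end."""
    h = np.maximum(0.0, w1 @ v + b1)             # ReLU hidden layer
    return 1.0 / (1.0 + np.exp(-(w2 @ h + b2)))  # sigmoid output score

def is_similar(score, threshold=0.5):
    """Decision rule: similar questions iff score >= preset threshold."""
    return score >= threshold

rng = np.random.default_rng(3)
v = rng.normal(size=4)                 # aggregated similarity vector
w1, b1 = rng.normal(size=(6, 4)), np.zeros(6)
w2, b2 = rng.normal(size=6), 0.0
s = mlp_score(v, w1, b1, w2, b2)
print(0.0 < s < 1.0)  # True
```

In deployment the threshold would be tuned on validation data rather than fixed at 0.5.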
And S214, under the condition that the first question and the second question are similar questions, sending the answer of the second question as the answer of the first question to the user terminal.
Referring to fig. 2, a first text similarity matching network, a second text similarity matching network, a third text similarity matching network, a pattern similarity calculation network, a vector compression network, an aggregation network, and a multi-layer perceptron network together form a problem similarity detection model of an embodiment of the present disclosure. In training the problem similarity detection model, a cross-entropy loss function may be adopted, with the whole model trained in an end-to-end manner.
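The end-to-end training objective can be sketched as a binary cross entropy between the model's similarity scores and the 0/1 similar-question labels (a minimal sketch; the patent does not spell out the exact loss formulation):

```python
import numpy as np

def binary_cross_entropy(scores, labels, eps=1e-12):
    """Cross-entropy loss over similarity scores in (0, 1) and
    binary labels (1 = similar questions, 0 = not similar)."""
    s = np.clip(scores, eps, 1.0 - eps)  # guard against log(0)
    return float(-np.mean(labels * np.log(s) + (1 - labels) * np.log(1 - s)))

# confident, correct predictions yield a small loss
loss = binary_cross_entropy(np.array([0.9, 0.2]), np.array([1.0, 0.0]))
print(loss)
```

Gradients of this loss flow back through the perceptron, aggregation, similarity, and encoder networks, which is what makes the training end to end.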
And inputting the answers of the first question, the second question and the second question into a trained question similarity detection model, and obtaining the similarity score of the first question and the second question through the question similarity detection model.
< example two >
Referring to fig. 4 and 5, a problem correlation detection method of an embodiment of the present disclosure is explained. The method comprises steps S302-S308.
S302, similarity matching is performed on the first question and the answer to the second question to obtain a first matching tensor representing the matching pattern between the first question and the answer to the second question.
In one specific example, the first question may be a new question, i.e., a question without an answer. The first question may also be a question that currently has only a few answers. The second question may be an archived question, with an answer already.
In an embodiment of the present disclosure, the BERT algorithm may be used as a pre-trained language model for similarity matching. In step S302, similarity matching may be performed using a text similarity matching algorithm based on the BERT (Bidirectional Encoder Representations from Transformers) algorithm. The BERT algorithm is an NLP (Natural Language Processing) model. In embodiments of the present disclosure, the BERT network may use a stacked encoder structure.
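The patent does not specify the exact splicing format; the sketch below follows the conventional BERT input layout ([CLS]/[SEP] markers with segment ids), which is what later allows the encoder output to be split back into the two parts by token position:

```python
def splice(question_tokens, answer_tokens):
    """Splice question and answer tokens in BERT style. The segment ids
    record which part each position belongs to, so the encoder output can
    be divided into the question representation and the answer
    representation by position."""
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + answer_tokens + ["[SEP]"]
    segments = [0] * (len(question_tokens) + 2) + [1] * (len(answer_tokens) + 1)
    return tokens, segments

toks, segs = splice(["how", "to", "sort"], ["use", "quicksort"])
print(toks)  # ['[CLS]', 'how', 'to', 'sort', '[SEP]', 'use', 'quicksort', '[SEP]']
print(segs)  # [0, 0, 0, 0, 0, 1, 1, 1]
```

The token strings here stand in for a real subword tokenizer; any BERT implementation would additionally map them to vocabulary ids.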
As shown in fig. 4, the first question and the answer to the second question are input into the first text similarity matching network. The first text similarity matching network uses the BERT-based text similarity matching algorithm to obtain the matching pattern between the first question and the answer to the second question.
In a specific example, the first question and the answer to the second question are spliced together and input to the first text similarity matching network. The output of each encoder layer is divided into two parts according to the token positions of the first question and of the answer to the second question: the first part is that layer's vector representation of the first question, and the second part is that layer's vector representation of the answer to the second question. For each encoder layer, the inner product of the layer's representation of the first question and its representation of the answer to the second question is calculated to obtain the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to obtain the first matching tensor, which represents the matching pattern between the first question and the answer to the second question.
S304, similarity matching is conducted on the second question and the answer of the second question, and a second matching tensor of a matching mode representing the second question and the answer of the second question is obtained.
In step S304, a text similarity matching algorithm based on the BERT algorithm may be used for similarity matching.
As shown in fig. 4, the second question and the answer to the second question are input into the second text similarity matching network. The second text similarity matching network uses a text similarity matching algorithm based on a BERT algorithm to obtain a matching pattern between the second question and an answer to the second question.
In a specific example, the second question and the answer to the second question are spliced together and input to the second text similarity matching network. The output of each encoder layer is divided into two parts according to the token positions of the second question and of its answer: the first part is that layer's vector representation of the second question, and the second part is that layer's vector representation of the answer to the second question. For each encoder layer, the inner product of the two representations is calculated to obtain the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to obtain the second matching tensor, which represents the matching pattern between the second question and the answer to the second question.
S306, determining whether the first question and the second question are similar questions or not based on the similarity degree of the first matching tensor and the second matching tensor.
In step S306, the similarity degree between the first matching tensor and the second matching tensor is calculated, and a similarity vector representing the similarity degree between the first matching tensor and the second matching tensor is obtained.
As shown in fig. 4, the pattern similarity calculation network calculates the degree of similarity between the first matching tensor and the second matching tensor using a similarity function, obtaining a similarity vector. The similarity function may be a point-wise function, a cosine function, a distance function, or any other function that can calculate the similarity between vectors.
In step S306, the similarity vector may be perceived using a multi-layered perceptron network to obtain similarity scores of the first question and the second question.
And determining whether the first question and the second question are similar questions according to the similarity scores of the first question and the second question. In a specific example, if the similarity score of the first question and the second question is greater than or equal to a preset threshold, it is determined that the first question and the second question are similar questions. And if the similarity scores of the first question and the second question are smaller than a preset threshold value, determining that the first question and the second question are not similar questions.
And S308, under the condition that the first question and the second question are similar questions, sending the answer of the second question as the answer of the first question to the user terminal.
Referring to fig. 4, a first text similarity matching network, a second text similarity matching network, a pattern similarity calculation network, and a multi-layer perceptron network together form a problem similarity detection model of an embodiment of the present disclosure. In training the problem similarity detection model, a cross-entropy loss function may be adopted, with the whole model trained in an end-to-end manner.
And inputting the answers of the first question, the second question and the second question into a trained question similarity detection model, and obtaining the similarity score of the first question and the second question through the question similarity detection model.
The question correlation detection method of the embodiment of the disclosure compares the similarity of the matching patterns between different questions and the same answer. This not only introduces answer information but also filters out the noise caused by the distribution difference between questions and answers, which amounts to a secondary verification of irrelevant relations and thereby greatly reduces their influence. For example, the "garbage collection algorithm" content in the archived answer Ae may be filtered out.
Experiments have been performed on the open-source data sets CQADupStack and QuoraQP, and the experimental results show that the problem correlation detection method of the embodiments of the present disclosure can achieve higher accuracy.
< search method >
Based on a principle similar to the problem correlation detection method of the embodiment of the present disclosure, the embodiment of the present disclosure also provides a search method.
< example one >
The searching method provided by the embodiment of the disclosure comprises steps S402-S414.
S402, similarity matching is performed on the first search statement and the search result of the second search statement to obtain a first matching tensor representing the matching pattern between the first search statement and the search result of the second search statement.
In one specific example, the first search statement may be a search statement newly submitted by a user, without having search results. The first search sentence may also be a search sentence which currently has only a small number of search results. The second search statement already has search results.
In a specific example, the first search statement may include one or more search terms, and the second search statement may include one or more search terms. For example, the search sentence may be "weather forecast" or "weather on tomorrow".
In an embodiment of the present disclosure, the BERT algorithm may be used as a pre-trained language model for similarity matching. In step S402, similarity matching may be performed using a text similarity matching algorithm based on the BERT (Bidirectional Encoder Representations from Transformers) algorithm. The BERT algorithm is an NLP (Natural Language Processing) model. In embodiments of the present disclosure, the BERT network may use a stacked encoder structure.
The first search sentence and the search result of the second search sentence are input into the first text similarity matching network. The first text similarity matching network uses the BERT-based text similarity matching algorithm to obtain the matching pattern between the first search sentence and the search result of the second search sentence.
In a specific example, the first search sentence and the search result of the second search sentence are spliced together and input to the first text similarity matching network. The output of each encoder layer is divided into two parts according to the token positions of the first search statement and of the search result of the second search statement: the first part is that layer's vector representation of the first search statement, and the second part is that layer's vector representation of the search result of the second search statement. For each encoder layer, the inner product of the layer's representation of the first search statement and its representation of the search result of the second search statement is calculated to obtain the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to obtain the first matching tensor, which represents the matching pattern between the first search statement and the search result of the second search statement.
S404, similarity matching is carried out on the second search statement and the search result of the second search statement, and a second matching tensor which represents the matching mode of the second search statement and the search result of the second search statement is obtained.
In step S404, similarity matching may be performed using a text similarity matching algorithm based on the BERT algorithm.
And inputting the second search sentence and the search result of the second search sentence into a second text similarity matching network. The second text similarity matching network obtains a matching pattern between the second search sentence and the search result of the second search sentence using a text similarity matching algorithm based on a BERT algorithm.
In a specific example, the second search sentence and the search result of the second search sentence are spliced together and input to the second text similarity matching network. The output of each encoder layer is divided into two parts according to the token positions of the second search statement and of its search result: the first part is that layer's vector representation of the second search statement, and the second part is that layer's vector representation of the search result of the second search statement. For each encoder layer, the inner product of the two representations is calculated to obtain the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to obtain the second matching tensor, which represents the matching pattern between the second search statement and the search result of the second search statement.
S406, similarity matching is carried out on the first search statement and the second search statement, and a first similarity vector representing the similarity of the first search statement and the second search statement is obtained.
In step S406, similarity matching may be performed using a text similarity matching algorithm based on the BERT algorithm.
And inputting the first search sentence and the second search sentence into a third text similarity matching network. The third text similarity matching network detects a degree of similarity of the first search sentence and the second search sentence using a text similarity matching algorithm based on a BERT algorithm.
In a specific example, the first search sentence and the second search sentence are input into a third text similarity matching network, and a first similarity vector representing the similarity degree of the first search sentence and the second search sentence is obtained.
S408, calculating the similarity degree of the first matching tensor and the second matching tensor to obtain a second similarity vector.
The pattern similarity calculation network calculates the degree of similarity between the first matching tensor and the second matching tensor using a similarity function, obtaining a second similarity vector that represents this degree of similarity. The similarity function may be a point-wise function, a cosine function, a distance function, or any other function that can calculate the similarity between vectors.
After the second similarity vector is calculated, the second similarity vector can be compressed in the vector compression network, so that the dimension of the second similarity vector is the same as that of the first similarity vector, and the first similarity vector and the second similarity vector are conveniently aggregated subsequently.
S410, aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector.
In the aggregation network, the first similarity vector is used as a primary variable, the second similarity vector is used as a secondary variable, and the first similarity vector and the second similarity vector are aggregated by using a gating mechanism to obtain an aggregation similarity vector.
For example, the first similarity vector is given a larger weight, the second similarity vector is given a smaller weight, and the two are aggregated using a gating mechanism.
S412, determining whether the first search statement and the second search statement are similar search statements or not based on the aggregation similarity vector.
In step S412, the aggregated similarity vector may be perceived using a multi-layer perceptron network, resulting in similarity scores of the first search statement and the second search statement.
And determining whether the first search sentence and the second search sentence are similar search sentences according to the similarity scores of the first search sentence and the second search sentence. In a specific example, if the similarity score of the first search sentence and the second search sentence is greater than or equal to a preset threshold, it is determined that the first search sentence and the second search sentence are similar search sentences. And if the similarity scores of the first search sentence and the second search sentence are smaller than a preset threshold value, determining that the first search sentence and the second search sentence are not similar search sentences.
And S414, under the condition that the first search sentence and the second search sentence are similar search sentences, sending the search result of the second search sentence as the search result of the first search sentence to the user terminal.
The first text similarity matching network, the second text similarity matching network, the third text similarity matching network, the pattern similarity calculation network, the vector compression network, the aggregation network, and the multi-layer perceptron network together form a search statement similarity detection model of the embodiment of the disclosure. In training the search statement similarity detection model, a cross-entropy loss function may be adopted, with the whole model trained in an end-to-end manner.
And inputting the search results of the first search sentence, the second search sentence and the second search sentence into the trained search sentence similarity detection model, and obtaining the similarity score of the first search sentence and the second search sentence through the search sentence similarity detection model.
< example two >
The searching method provided by the embodiment of the disclosure comprises steps S502-S508.
S502, similarity matching is performed on the first search statement and the search result of the second search statement to obtain a first matching tensor representing the matching pattern between the first search statement and the search result of the second search statement.
In one specific example, the first search statement may be a search statement newly submitted by a user, without having search results. The first search sentence may also be a search sentence which currently has only a small number of search results. The second search statement already has search results.
In a specific example, the first search statement may include one or more search terms, and the second search statement may include one or more search terms. For example, the search sentence may be "weather forecast" or "weather on tomorrow".
In an embodiment of the present disclosure, the BERT algorithm may be used as a pre-trained language model for similarity matching. In step S502, similarity matching may be performed using a text similarity matching algorithm based on the BERT (Bidirectional Encoder Representations from Transformers) algorithm. The BERT algorithm is an NLP (Natural Language Processing) model. In embodiments of the present disclosure, the BERT network may use a stacked encoder structure.
The first search sentence and the search result of the second search sentence are input into the first text similarity matching network. The first text similarity matching network uses the BERT-based text similarity matching algorithm to obtain the matching pattern between the first search sentence and the search result of the second search sentence.
In a specific example, the first search sentence and the search result of the second search sentence are spliced together and input to the first text similarity matching network. The output of each encoder layer is divided into two parts according to the token positions of the first search statement and of the search result of the second search statement: the first part is that layer's vector representation of the first search statement, and the second part is that layer's vector representation of the search result of the second search statement. For each encoder layer, the inner product of the layer's representation of the first search statement and its representation of the search result of the second search statement is calculated to obtain the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to obtain the first matching tensor, which represents the matching pattern between the first search statement and the search result of the second search statement.
S504, similarity matching is conducted on the second search statement and the search result of the second search statement, and a second matching tensor which represents a matching mode of the second search statement and the search result of the second search statement is obtained.
In step S504, a text similarity matching algorithm based on the BERT algorithm may be used for similarity matching.
And inputting the second search sentence and the search result of the second search sentence into a second text similarity matching network. The second text similarity matching network obtains a matching pattern between the second search sentence and the search result of the second search sentence using a text similarity matching algorithm based on a BERT algorithm.
In a specific example, the second search sentence and the search result of the second search sentence are spliced together and input to the second text similarity matching network. The output of each encoder layer is divided into two parts according to the token positions of the second search statement and of its search result: the first part is that layer's vector representation of the second search statement, and the second part is that layer's vector representation of the search result of the second search statement. For each encoder layer, the inner product of the two representations is calculated to obtain the matching tensor corresponding to that layer. The matching tensors corresponding to all encoder layers are combined to obtain the second matching tensor, which represents the matching pattern between the second search statement and the search result of the second search statement.
S506, determining whether the first search statement and the second search statement are similar search statements or not based on the similarity degree of the first matching tensor and the second matching tensor.
In step S506, a similarity degree between the first matching tensor and the second matching tensor is calculated, and a similarity vector representing the similarity degree between the first matching tensor and the second matching tensor is obtained.
The pattern similarity calculation network calculates the degree of similarity between the first matching tensor and the second matching tensor using a similarity function to obtain a similarity vector. The similarity function may be a point-wise function, a cosine function, a distance function, or any other function that can calculate the similarity between vectors.
In step S506, the similarity vector may be perceived using a multi-layered perceptron network, resulting in similarity scores of the first search term and the second search term.
And determining whether the first search sentence and the second search sentence are similar search sentences according to the similarity scores of the first search sentence and the second search sentence. In a specific example, if the similarity score of the first search sentence and the second search sentence is greater than or equal to a preset threshold, it is determined that the first search sentence and the second search sentence are similar search sentences. And if the similarity scores of the first search sentence and the second search sentence are smaller than a preset threshold value, determining that the first search sentence and the second search sentence are not similar search sentences.
And S508, under the condition that the first search sentence and the second search sentence are similar search sentences, sending the search result of the second search sentence as the search result of the first search sentence to the user terminal.
The first text similarity matching network, the second text similarity matching network, the pattern similarity calculation network, and the multi-layer perceptron network together form a search statement similarity detection model of the embodiment of the disclosure. In training the search statement similarity detection model, a cross-entropy loss function may be adopted, with the whole model trained in an end-to-end manner.
And inputting the search results of the first search sentence, the second search sentence and the second search sentence into the trained search sentence similarity detection model, and obtaining the similarity score of the first search sentence and the second search sentence through the search sentence similarity detection model.
The searching method of the embodiment of the disclosure compares the similarity of the matching patterns between different search sentences and the same search result. This not only introduces search result information but also filters out the noise caused by the distribution difference between search sentences and search results, which amounts to a secondary verification of irrelevant relations and thereby greatly reduces their influence.
Experiments have been performed on the open-source data sets CQADupStack and QuoraQP, and the experimental results show that the search method of the embodiment of the disclosure can achieve higher detection accuracy for similar search sentences.
< Voice interaction method >
Based on a principle similar to the problem correlation detection method of the embodiment of the present disclosure, the embodiment of the present disclosure further provides a voice interaction method.
The embodiment of the disclosure provides a voice interaction method. The voice interaction method may be performed by a user terminal having a microphone and a speaker, and in particular, the user terminal may have a hardware configuration similar to the terminal device 103 in fig. 1.
In a specific example, the voice interaction method may include steps S600-S614.
S600, acquiring a first statement submitted by a user in a voice mode.
Specifically, the user terminal picks up a voice signal of the user through a microphone and recognizes the voice signal as a first sentence.
S602, similarity matching is performed on the first statement and the response result of the second statement to obtain a first matching tensor representing the matching pattern between the first statement and the response result of the second statement.
S604, similarity matching is performed on the second statement and the response result of the second statement to obtain a second matching tensor representing the matching pattern between the second statement and the response result of the second statement.
S606, similarity matching is conducted on the first statement and the second statement, and a first similarity vector representing the similarity degree of the first statement and the second statement is obtained.
And S608, calculating the similarity degree of the first matching tensor and the second matching tensor to obtain a second similarity vector.
S610, aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector.
And S612, determining whether the first statement and the second statement are similar statements or not based on the aggregation similarity vector.
And S614, under the condition that the first sentence and the second sentence are similar sentences, playing a response result of the second sentence in a voice mode.
In a specific example, the voice interaction method may include steps S700-S708.
S700, acquiring a first sentence submitted by a user by voice.
Specifically, the user terminal picks up the user's voice signal through a microphone and recognizes the voice signal as the first sentence.
S702, performing similarity matching between the first sentence and the response result of the second sentence to obtain a first matching tensor representing the matching pattern between the first sentence and the response result of the second sentence.
S704, performing similarity matching between the second sentence and the response result of the second sentence to obtain a second matching tensor representing the matching pattern between the second sentence and the response result of the second sentence.
S706, determining whether the first sentence and the second sentence are similar sentences based on the degree of similarity between the first matching tensor and the second matching tensor.
S708, in the case that the first sentence and the second sentence are similar sentences, playing the response result of the second sentence by voice.
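Steps S702-S706 skip the sentence-sentence similarity vector and compare the two matching tensors directly. One hedged way to compare tensors of different shapes (the two sentences have different token counts, but share the response axis) is to pool each tensor into a per-response-token profile and take a cosine similarity; the pooling, the stub embeddings and the threshold below are illustrative assumptions, not the disclosure's exact computation.

```python
import numpy as np

def embed(text, dim=16):
    # Toy deterministic token embeddings (assumption; a trained encoder
    # would be used in practice).
    return np.stack([
        np.random.default_rng(sum(ord(c) for c in tok)).standard_normal(dim)
        for tok in text.split()
    ])

def matching_tensor(sentence, response):
    # Token-by-token cosine similarities: the matching pattern between a
    # sentence and the response result (S702/S704).
    q, a = embed(sentence), embed(response)
    q /= np.linalg.norm(q, axis=1, keepdims=True)
    a /= np.linalg.norm(a, axis=1, keepdims=True)
    return q @ a.T

def response_profile(m):
    # Pool over the sentence axis: for each response token, keep its best
    # match. Both tensors share the response axis, so the profiles align.
    return m.max(axis=0)

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

first = "when will my parcel arrive"
second = "when does my package get delivered"
answer = "your order ships within two days"

m1 = matching_tensor(first, answer)                         # S702
m2 = matching_tensor(second, answer)                        # S704
score = cosine(response_profile(m1), response_profile(m2))  # S706
similar = score > 0.8  # illustrative threshold, not from the disclosure
```

The design choice worth noting is that the response result acts as a shared reference frame: two sentences are compared by how they each match the same answer, rather than only by how they match each other.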
Steps S602-S612 are similar to steps S202-S212, and steps S702-S706 are similar to steps S302-S306; for details, reference may be made to the corresponding descriptions, which are not repeated here.
In a specific example, the user terminal is provided with a voice assistant, which calls the microphone to acquire the first sentence and calls the loudspeaker to play the response result; the above method may thus be implemented by the voice assistant. In a specific example, the voice assistant is a customer service assistant of an application, such as a shopping application (APP). Through the voice assistant, the user can interact with the machine directly in natural language, for example to make booking consultations, query weather forecasts, or obtain medical consultations, product consultations and after-sale services, which brings convenience to the user's daily life.
An embodiment of the present disclosure further provides a voice interaction method. The voice interaction method may be performed by a server, which may have a hardware configuration similar to that of the server 101 in fig. 1.
In a specific example, the voice interaction method may include steps S800-S814.
S800, acquiring a first sentence submitted by a user by voice.
The user may submit the first sentence to the user terminal by voice, and the user terminal sends the first sentence to the server.
Specifically, the user terminal picks up the user's voice signal through a microphone and recognizes the voice signal as the first sentence. The user terminal has a microphone and a speaker, and may have a hardware configuration similar to that of the terminal device 103 in fig. 1.
S802, performing similarity matching between the first sentence and the response result of the second sentence to obtain a first matching tensor representing the matching pattern between the first sentence and the response result of the second sentence.
S804, performing similarity matching between the second sentence and the response result of the second sentence to obtain a second matching tensor representing the matching pattern between the second sentence and the response result of the second sentence.
S806, performing similarity matching between the first sentence and the second sentence to obtain a first similarity vector representing the degree of similarity between the first sentence and the second sentence.
S808, calculating the degree of similarity between the first matching tensor and the second matching tensor to obtain a second similarity vector.
S810, aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector.
S812, determining whether the first sentence and the second sentence are similar sentences based on the aggregated similarity vector.
S814, in the case that the first sentence and the second sentence are similar sentences, sending the response result of the second sentence to the user terminal so that the user terminal plays the response result of the second sentence by voice.
In a specific example, the voice interaction method may include steps S900-S908.
S900, acquiring a first sentence submitted by a user by voice.
The user may submit the first sentence to the user terminal by voice, and the user terminal sends the first sentence to the server.
Specifically, the user terminal picks up the user's voice signal through a microphone and recognizes the voice signal as the first sentence. The user terminal has a microphone and a speaker, and may have a hardware configuration similar to that of the terminal device 103 in fig. 1.
S902, performing similarity matching between the first sentence and the response result of the second sentence to obtain a first matching tensor representing the matching pattern between the first sentence and the response result of the second sentence.
S904, performing similarity matching between the second sentence and the response result of the second sentence to obtain a second matching tensor representing the matching pattern between the second sentence and the response result of the second sentence.
S906, determining whether the first sentence and the second sentence are similar sentences based on the degree of similarity between the first matching tensor and the second matching tensor.
S908, in the case that the first sentence and the second sentence are similar sentences, sending the response result of the second sentence to the user terminal so that the user terminal plays the response result of the second sentence by voice.
Steps S802-S812 are similar to steps S202-S212, and steps S902-S906 are similar to steps S302-S306; for details, reference may be made to the corresponding descriptions, which are not repeated here.
In a specific example, the user terminal is provided with a voice assistant, which calls the microphone to acquire the first sentence and calls the loudspeaker to play the response result. In a specific example, the voice assistant is a customer service assistant of an application, such as a shopping application (APP). Through the voice assistant, the user can interact with the machine directly in natural language, for example to make booking consultations, query weather forecasts, or obtain medical consultations, product consultations and after-sale services, which brings convenience to the user's daily life.
The voice interaction method of the embodiments of the present disclosure compares the matching patterns formed by different sentences against the same response result. This not only introduces response-result information, but also filters out noise arising from differences between the sentence-response distributions, which amounts to a secondary verification of irrelevant relations and thus greatly reduces their influence. The voice interaction method of the embodiments of the present disclosure can therefore improve both response speed and response accuracy.
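Two details that the description refers back to (the compression of the second similarity vector and the gated aggregation, also recited in claims 4-5) can be sketched in outline. The random projection matrix and the sigmoid gate below are placeholders for parameters that would be learned in training; only the dimensionality handling is the point.

```python
import numpy as np

rng = np.random.default_rng(0)
d_second, d_first = 64, 16

# Compression: a linear projection stands in for a learned layer that maps
# the second similarity vector to the dimension of the first.
W = rng.standard_normal((d_second, d_first)) / np.sqrt(d_second)

v1 = rng.standard_normal(d_first)       # first similarity vector (primary variable)
v2 = rng.standard_normal(d_second) @ W  # compressed second similarity vector

# Gating mechanism: the gate is computed from the primary variable and
# decides how much of the secondary variable flows into the aggregate.
gate = 1.0 / (1.0 + np.exp(-v1))
agg = gate * v1 + (1.0 - gate) * v2     # aggregated similarity vector
```

Treating the first similarity vector as the primary variable means the sentence-sentence evidence controls how much weight the tensor-tensor evidence receives.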
< Server for question answering >
Referring to fig. 6, an embodiment of the present disclosure further provides a server for a question-answering service. The server 40 includes a processor 41 and a memory 42; the memory 42 stores computer instructions which, when executed by the processor 41, implement the problem correlation detection method of any one of the foregoing embodiments.
< search Server >
Referring to fig. 7, an embodiment of the present disclosure also provides a search server. The search server 50 comprises a processor 51 and a memory 52, the memory 52 storing computer instructions which, when executed by the processor 51, implement the search method of any of the preceding embodiments.
< Server >
Embodiments of the present disclosure also provide a server. The server comprises a processor and a memory; the memory stores computer instructions which, when executed by the processor, implement the voice interaction method of the foregoing embodiments.
< electronic apparatus >
The embodiment of the disclosure also provides an electronic device with a voice assistant, which comprises a microphone, a loudspeaker, a processor and a memory; the memory stores computer instructions that, when executed by the processor, implement the voice interaction method of the foregoing embodiments.
< computer-readable Medium >
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer instructions, which, when executed by a processor, implement the problem correlation detection method of any of the foregoing embodiments.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer instructions, which, when executed by a processor, implement the search method of any of the foregoing embodiments.
Embodiments of the present disclosure also provide a computer-readable storage medium having stored thereon computer instructions, which, when executed by a processor, implement the voice interaction method of any of the foregoing embodiments.
The embodiments in the disclosure are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. Especially, as for the embodiments of the apparatus, the device and the server, since they are basically similar to the embodiments of the method, the description is simple, and the relevant points can be referred to the partial description of the embodiments of the method.
The foregoing description of specific embodiments of the present disclosure has been described. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Embodiments of the present disclosure may be systems, methods, and/or computer program products. The computer program product may include a computer-readable storage medium having computer-readable program instructions embodied thereon for causing a processor to implement aspects of embodiments of the disclosure.
The computer readable storage medium may be a tangible device that can hold and store the instructions for use by the instruction execution device. The computer readable storage medium may be, for example, but not limited to, an electronic memory device, a magnetic memory device, an optical memory device, an electromagnetic memory device, a semiconductor memory device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium would include the following: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable compact disc read-only memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanical coding device, such as punch cards or in-groove projection structures having instructions stored thereon, and any suitable combination of the foregoing. Computer-readable storage media as used herein is not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission medium (e.g., optical pulses through a fiber optic cable), or electrical signals transmitted through electrical wires.
The computer-readable program instructions described herein may be downloaded from a computer-readable storage medium to a respective computing/processing device, or to an external computer or external storage device via a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmission, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. The network adapter card or network interface in each computing/processing device receives computer-readable program instructions from the network and forwards the computer-readable program instructions for storage in a computer-readable storage medium in the respective computing/processing device.
Computer program instructions for carrying out operations of embodiments of the present disclosure may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source code or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk or C++, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The computer-readable program instructions may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, electronic circuitry, such as a programmable logic circuit, a field-programmable gate array (FPGA), or a programmable logic array (PLA), can be personalized with state information of the computer-readable program instructions and can execute the computer-readable program instructions, thereby implementing aspects of the embodiments of the disclosure.
Various aspects of embodiments of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer-readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer-readable program instructions may also be stored in a computer-readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer-readable medium storing the instructions comprises an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. It is well known to those skilled in the art that implementation by hardware, by software, and by a combination of software and hardware are equivalent.
Having described embodiments of the present disclosure, the foregoing description is intended to be exemplary, not exhaustive, and not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope of the described embodiments. The terms used herein were chosen in order to best explain the principles of the embodiments, the practical application, or technical improvements to the techniques in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (21)

1. A problem correlation detection method, comprising:
performing similarity matching between a first question and an answer of a second question to obtain a first matching tensor representing a matching pattern between the first question and the answer of the second question;
performing similarity matching between the second question and the answer of the second question to obtain a second matching tensor representing a matching pattern between the second question and the answer of the second question; and
determining whether the first question and the second question are similar questions based on a degree of similarity between the first matching tensor and the second matching tensor.
2. The method of claim 1, further comprising:
in the case that the first question and the second question are similar questions, sending the answer of the second question to a user terminal as an answer of the first question.
3. The method of claim 1, further comprising: performing similarity matching between the first question and the second question to obtain a first similarity vector representing a degree of similarity between the first question and the second question;
wherein determining whether the first question and the second question are similar questions based on the degree of similarity between the first matching tensor and the second matching tensor comprises:
calculating the degree of similarity between the first matching tensor and the second matching tensor to obtain a second similarity vector;
aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector; and
determining whether the first question and the second question are similar questions based on the aggregated similarity vector.
4. The method of claim 3, wherein aggregating the first similarity vector and the second similarity vector to obtain the aggregated similarity vector comprises:
taking the first similarity vector as a primary variable and the second similarity vector as a secondary variable, and aggregating the first similarity vector and the second similarity vector by using a gating mechanism to obtain the aggregated similarity vector.
5. The method of claim 3, further comprising, prior to calculating the degree of similarity between the first matching tensor and the second matching tensor:
compressing the second similarity vector such that a dimension of the second similarity vector is the same as a dimension of the first similarity vector.
6. The method of claim 3, wherein the degree of similarity between the first matching tensor and the second matching tensor is calculated by a similarity function, the similarity function being any one of:
a dot product function, a cosine function, and a distance function.
7. The method of claim 3, wherein the similarity matching is performed using a text similarity matching algorithm based on the BERT algorithm.
8. A search method, comprising:
performing similarity matching between a first search sentence and a search result of a second search sentence to obtain a first matching tensor representing a matching pattern between the first search sentence and the search result of the second search sentence;
performing similarity matching between the second search sentence and the search result of the second search sentence to obtain a second matching tensor representing a matching pattern between the second search sentence and the search result of the second search sentence; and
determining whether the first search sentence and the second search sentence are similar search sentences based on a degree of similarity between the first matching tensor and the second matching tensor.
9. The method of claim 8, further comprising:
in the case that the first search sentence and the second search sentence are similar search sentences, sending the search result of the second search sentence to a user terminal as a search result of the first search sentence.
10. The method of claim 8, further comprising: performing similarity matching between the first search sentence and the second search sentence to obtain a first similarity vector representing a degree of similarity between the first search sentence and the second search sentence;
wherein determining whether the first search sentence and the second search sentence are similar search sentences based on the degree of similarity between the first matching tensor and the second matching tensor comprises:
calculating the degree of similarity between the first matching tensor and the second matching tensor to obtain a second similarity vector;
aggregating the first similarity vector and the second similarity vector to obtain an aggregated similarity vector; and
determining whether the first search sentence and the second search sentence are similar search sentences based on the aggregated similarity vector.
11. The method of claim 10, wherein aggregating the first similarity vector and the second similarity vector to obtain the aggregated similarity vector comprises:
taking the first similarity vector as a primary variable and the second similarity vector as a secondary variable, and aggregating the first similarity vector and the second similarity vector by using a gating mechanism to obtain the aggregated similarity vector.
12. The method of claim 10, further comprising, prior to calculating the degree of similarity between the first matching tensor and the second matching tensor:
compressing the second similarity vector such that a dimension of the second similarity vector is the same as a dimension of the first similarity vector.
13. The method of claim 10, wherein the degree of similarity between the first matching tensor and the second matching tensor is calculated by a similarity function, the similarity function being any one of:
a dot product function, a cosine function, and a distance function.
14. The method of claim 10, wherein the similarity matching is performed using a text similarity matching algorithm based on the BERT algorithm.
15. A server for a question-answering service, comprising a processor and a memory, the memory storing computer instructions which, when executed by the processor, implement the method of any one of claims 1 to 7.
16. A search server comprising a processor and a memory, the memory storing computer instructions which, when executed by the processor, implement the method of any one of claims 8 to 14.
17. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the method of any one of claims 1-7.
18. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the method of any one of claims 8-14.
19. A voice interaction method, comprising:
acquiring a first sentence submitted by a user by voice;
performing similarity matching between the first sentence and a response result of a second sentence to obtain a first matching tensor representing a matching pattern between the first sentence and the response result of the second sentence;
performing similarity matching between the second sentence and the response result of the second sentence to obtain a second matching tensor representing a matching pattern between the second sentence and the response result of the second sentence;
determining whether the first sentence and the second sentence are similar sentences based on a degree of similarity between the first matching tensor and the second matching tensor; and
in the case that the first sentence and the second sentence are similar sentences, playing the response result of the second sentence by voice.
20. A voice interaction method, comprising:
acquiring a first sentence submitted by a user by voice;
performing similarity matching between the first sentence and a response result of a second sentence to obtain a first matching tensor representing a matching pattern between the first sentence and the response result of the second sentence;
performing similarity matching between the second sentence and the response result of the second sentence to obtain a second matching tensor representing a matching pattern between the second sentence and the response result of the second sentence;
determining whether the first sentence and the second sentence are similar sentences based on a degree of similarity between the first matching tensor and the second matching tensor; and
in the case that the first sentence and the second sentence are similar sentences, sending the response result of the second sentence to a user terminal for playing by the user terminal.
21. An electronic device with a voice assistant, characterized by comprising a microphone, a loudspeaker, a processor and a memory; wherein the memory stores computer instructions that, when executed by the processor, implement the method of claim 19.
CN202010357386.7A 2020-04-29 2020-04-29 Method and server for detecting problem correlation Active CN113569885B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010357386.7A CN113569885B (en) 2020-04-29 2020-04-29 Method and server for detecting problem correlation

Publications (2)

Publication Number Publication Date
CN113569885A true CN113569885A (en) 2021-10-29
CN113569885B CN113569885B (en) 2024-06-07

Family

ID=78157781



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107220380A (en) * 2017-06-27 2017-09-29 北京百度网讯科技有限公司 Question and answer based on artificial intelligence recommend method, device and computer equipment
US20180373782A1 (en) * 2017-06-27 2018-12-27 Beijing Baidu Netcom Science And Technology Co., Ltd. Method and apparatus for recommending answer to question based on artificial intelligence
KR20190109656A (en) * 2018-03-08 2019-09-26 주식회사 포티투마루 Artificial intelligence qa method, apparatus and program
CN109271505A (en) * 2018-11-12 2019-01-25 深圳智能思创科技有限公司 A kind of question answering system implementation method based on problem answers pair

Non-Patent Citations (2)

Title
B. FUGLEDE et al.: "Jensen-Shannon divergence and Hilbert space embedding", International Symposium on Information Theory *
CHEN ZHIHAO; YU XIANG; LIU ZICHEN; QIU DAWEI; GU BENGANG: "Chinese Medical Question-Answer Matching Method Based on Attention and Character Embedding", Computer Applications, no. 06



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant