
CN103377245B - Automatic question-answering method and device - Google Patents

Automatic question-answering method and device

Info

Publication number
CN103377245B
Authority
CN
China
Prior art keywords
word
centre
answer
frequency
sentence
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210128360.0A
Other languages
Chinese (zh)
Other versions
CN103377245A (en
Inventor
路彦雄
贺翔
焦峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen Shiji Guangsu Information Technology Co Ltd
Original Assignee
Shenzhen Shiji Guangsu Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen Shiji Guangsu Information Technology Co Ltd filed Critical Shenzhen Shiji Guangsu Information Technology Co Ltd
Priority to CN201210128360.0A priority Critical patent/CN103377245B/en
Publication of CN103377245A publication Critical patent/CN103377245A/en
Application granted granted Critical
Publication of CN103377245B publication Critical patent/CN103377245B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses an automatic question-answering method. The method includes: obtaining relevant existing user question-and-answer data according to the problem string input by the user terminal; counting the word frequency of the centre words in the abstract part of the existing user question-and-answer data; calculating the word weight of each centre word according to the word frequency of each centre word and the inverse document frequency of each centre word counted in advance, and determining the centre word with the largest word weight as the answer word; and determining, according to the answer word, the answer of the automatic question answering corresponding to the problem string. The invention also discloses an automatic question-answering device. The method and device need neither establish a knowledge base nor restrict the knowledge domain; automatic question answering can be realized using only the user question-and-answer data of an existing Ask-Answer Community.

Description

Automatic question-answering method and device
Technical field
The present invention relates to the technical field of web search, and in particular to an automatic question-answering method and device.
Background art
In current web search, Ask-Answer Communities have gradually developed. An Ask-Answer Community is an internet product in which users take part in asking and answering questions, and in which users and data are organized according to this question-and-answer relationship for users to search. Within an Ask-Answer Community, answers from users alone cannot fully satisfy the demand for questions to be answered, so most Ask-Answer Communities currently also provide an automatic question-answering function, in which a background server automatically provides answers to users' questions.
At present there are two main ways of implementing automatic question answering:
1) Within a specific knowledge domain, the user's question is analysed automatically according to a preset analysis method, and the answer is extracted from existing answers.
2) The answer is matched in a large predefined knowledge base.
The first method, analysing the question and extracting the answer within a specific knowledge domain, is limited to that domain and therefore has certain limitations.
The second method, matching the answer in a large predefined knowledge base, has a problem-solving ability that depends on the amount of data in the pre-stored knowledge base; automatic question answering cannot be realized for questions outside the scope of the knowledge base.
In short, in the prior art automatic question answering must rely on a specific knowledge domain or a knowledge base; for any question outside that knowledge domain or knowledge base, automatic question answering cannot be realized.
Summary of the invention
In view of this, the present invention provides an automatic question-answering method and device that can realize automatic question answering according to the user question-and-answer data of an existing Ask-Answer Community. To achieve the above object, the technical solution of the present invention is implemented as follows:
An automatic question-answering method, including:
Obtaining relevant existing user question-and-answer data according to the problem string input by the user terminal;
Counting the word frequency of the centre words in the abstract part of the existing user question-and-answer data;
Calculating the word weight of each centre word according to the word frequency of each centre word and the inverse document frequency of each centre word counted in advance, and determining the centre word with the largest word weight as the answer word;
Determining, according to the answer word, the answer of the automatic question answering corresponding to the problem string.
Preferably, obtaining relevant existing user question-and-answer data according to the problem string input by the user terminal includes:
Using the problem string as a retrieval string and inputting it into the search engine of the Ask-Answer Community to obtain the query results corresponding to the retrieval string, each query result including a title part and an abstract part with distinctive marks.
Preferably, counting the word frequency of the centre words in the abstract part of the existing user question-and-answer data includes:
Counting, one by one, the centre word frequencies of the abstract part of each query result until all query results have been counted;
Wherein, for each query result, the abstract part is split into sentences at full stops, the word frequency of each centre word is counted for each sentence, and the word frequencies of the centre words in all sentences are accumulated to obtain the word frequencies of all centre words in the abstract.
Preferably, accumulating the word frequencies of the centre words in all sentences to obtain the word frequencies of all centre words in the abstract includes:
If a sentence contains a word with a distinctive mark, the word frequency of each centre word in that sentence is accumulated at 3 times the standard weight; if the sentence immediately before or after the sentence contains a word with a distinctive mark, the word frequency of each centre word in the sentence is accumulated at 2 times the standard weight; otherwise, the word frequency of each centre word in the sentence is accumulated at the standard weight, thereby obtaining the weighted word frequencies of all centre words in the sentence.
Preferably, counting, one by one, the centre word frequencies of the abstract part of each query result until all query results have been counted includes:
Comparing the similarity between the title part of each query result and the problem string; if the similarity between the title of the current query result and the problem string is greater than a preset threshold, executing the step of counting the centre word frequencies; otherwise, skipping the step of counting the centre word frequencies for the current query result.
Preferably, calculating the word weight of each centre word includes:
Word weight of a centre word = word frequency of the centre word × inverse document frequency of the centre word.
Preferably, determining, according to the answer word, the answer of the automatic question answering corresponding to the problem string includes:
Finding, among the abstracts of the query results, the first s abstracts in which the answer word appears, s being an integer greater than or equal to 1;
Splitting each of the s abstracts into sentences at full stops, and finding among these sentences the sentence that contains the answer word and the largest number of centre words of the user's problem string, as the answer of the automatic question answering corresponding to the problem string.
An automatic question-answering device, including:
Question and answer data acquisition module, configured to obtain relevant existing user question-and-answer data according to the problem string input by the user terminal;
Word frequency statistics module, configured to count the word frequency of the centre words in the abstract part of the existing user question-and-answer data;
Answer word determining module, configured to calculate the word weight of each centre word according to the word frequency of each centre word and the inverse document frequency of each centre word counted in advance, and to determine the centre word with the largest word weight as the answer word;
Automatic question answering answer determining module, configured to determine, according to the answer word, the answer of the automatic question answering corresponding to the problem string.
Preferably, the question and answer data acquisition module includes:
Retrieval unit, configured to use the problem string as a retrieval string and input it into the search engine of the Ask-Answer Community;
Acquiring unit, configured to obtain the query results corresponding to the retrieval string, each query result including a title part and an abstract part with distinctive marks.
Preferably, the word frequency statistics module includes:
Cutting unit, configured to split the abstract part of each query result into sentences at full stops;
Statistic unit, configured to count the word frequency of each centre word in each sentence produced by the cutting unit;
Summing elements, configured to accumulate the word frequencies of the centre words counted by the statistic unit in all sentences, to obtain the word frequencies of all centre words in the abstract;
Control unit, configured to control the cutting unit, the statistic unit and the summing elements to count, one by one, the centre word frequencies of the abstract part of each query result until all query results have been counted.
Preferably, the summing elements include:
Mark judgment sub-unit, configured to determine whether the sentences produced by the cutting unit contain distinctive marks;
Weight accumulation sub-unit, configured to accumulate word frequencies according to the judgment of the mark judgment sub-unit: if a sentence contains a word with a distinctive mark, the word frequency of each centre word in that sentence is accumulated at 3 times the standard weight; if the sentence immediately before or after the sentence contains a word with a distinctive mark, the word frequency of each centre word in the sentence is accumulated at 2 times the standard weight; otherwise, the word frequency of each centre word in the sentence is accumulated at the standard weight, thereby obtaining the weighted word frequencies of all centre words in the sentence.
Preferably, the word frequency statistics module further includes:
Similarity comparison unit, configured to compare the similarity between the title part of each query result and the problem string;
The control unit is further configured to: if the similarity between the title of the current query result and the problem string is greater than a preset threshold, control the cutting unit, the statistic unit and the summing elements to execute the step of counting the centre word frequencies; otherwise, skip the step of counting the centre word frequencies for the current query result.
Preferably, the answer word determining module includes:
Word weight calculation unit, configured to calculate the word weight of each centre word according to the formula: word weight of the centre word = word frequency of the centre word × inverse document frequency of the centre word;
Answer word determination unit, for the maximum centre word of word weight to be determined as answer word.
Preferably, the automatic question answering answer determining module includes:
Abstract acquiring unit, configured to find, among the abstracts of the query results, the first s abstracts in which the answer word appears, s being an integer greater than or equal to 1;
Abstract cutting unit, configured to split each of the s abstracts into sentences at full stops;
Answer determination unit, configured to find, among the sentences produced by the abstract cutting unit, the sentence that contains the answer word and the largest number of centre words of the user's problem string, as the answer of the automatic question answering corresponding to the problem string.
As can be seen from the above technical solution, the automatic question-answering method and device of the present invention make full use of the existing user question-and-answer data of the Ask-Answer Community. They need neither establish a question-and-answer knowledge base nor restrict the knowledge domain of the user's question; instead, using parameters such as word frequency, inverse document frequency and text similarity, they find in the existing question-and-answer data the answer most relevant to the question raised by the user, realizing fully automatic answering. In addition, the present invention can also be used for semantic expansion of general questions or text strings, which can in turn be used for classification, search and the like.
Brief description of the drawings
Fig. 1 is a flow chart of the automatic question-answering method of an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the automatic question-answering device of an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the question and answer data acquisition module of an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of the word frequency statistics module of an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of the summing elements of an embodiment of the present invention;
Fig. 6 is a schematic structural diagram of the answer word determining module of an embodiment of the present invention;
Fig. 7 is a schematic structural diagram of the automatic question answering answer determining module of an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in further detail below with reference to the accompanying drawings and embodiments.
The present invention mainly uses the existing question-and-answer data in an Ask-Answer Community. A search engine is used to obtain question-and-answer retrieval results related to the problem string raised by the user; candidate words are selected from these retrieval results according to parameters such as word frequency, inverse document frequency and the similarity between text segments; the weights of these candidate words are calculated and sorted; the candidate word with the largest weight is taken as the answer word; and the sentence containing the answer word is taken as the automatic question-answering answer to the problem string raised by the user.
The detailed process is shown in Fig. 1 and includes the following steps:
Step 101: obtain relevant existing user question-and-answer data according to the problem string input by the user terminal;
The problem string raised by the user terminal (denoted q) is obtained and used as a retrieval string, which is input into the search engine of the Ask-Answer Community to obtain n query results. Each result includes a title (denoted t_i, i = 1, ..., n) and an abstract with distinctive marks. The distinctive marks in an abstract mark the words in the abstract that are identical to words in the problem string input by the user terminal; the Ask-Answer Community search engine adds them to the search results when returning them, in order to prompt the user. They are usually rendered in a red font, so an abstract with distinctive marks is also called a red-marked abstract; the red-marked words in an abstract are in fact the words that appear both in the abstract and in the retrieval string. Of course, depending on the search engine, the query results obtained may use other distinctive marks; as long as an abstract with distinctive marks is obtained, the specific form of the marks is arbitrary.
These query results are the existing user question-and-answer data in the Ask-Answer Community that are related to the problem string input by the user, where the title is a question related to the problem string q input by the user and the abstract is the corresponding answer.
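The query results of step 101 can be pictured as simple records. The following is a minimal sketch only: the class name `QueryResult` and the choice of `<em>...</em>` tags as the distinctive mark are assumptions made for illustration, since the text leaves the form of the mark open.

```python
import re
from dataclasses import dataclass
from typing import Set

# Assumption: the search engine wraps each marked word in <em>...</em>.
# The patent only requires that some distinctive mark exists.
MARK_RE = re.compile(r"<em>(.*?)</em>")


@dataclass
class QueryResult:
    title: str     # t_i: an existing question related to the problem string q
    abstract: str  # a_i: the corresponding answer, marks still embedded

    def marked_words(self) -> Set[str]:
        """Return the words that carry the distinctive mark."""
        return set(MARK_RE.findall(self.abstract))

    def plain_abstract(self) -> str:
        """Return the abstract text with the mark tags stripped."""
        return MARK_RE.sub(r"\1", self.abstract)
```

Keeping the marks embedded in the stored abstract lets later steps decide per sentence whether a marked word is present.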
Step 102: count the word frequency of the centre words in the abstract part of the existing user question-and-answer data;
After the query results, i.e. the existing user question-and-answer data, are obtained, the n query results need to be analysed one by one, and the word frequency of the centre words in the abstract part of these existing user question-and-answer data is calculated, as follows:
Start from the first query result, i.e. i = 1;
First, query results with low similarity can be excluded by comparing the similarity between the title part of the query result and the problem string q, which reduces the number of query results that need to be analysed. If the similarity between the problem string q and the title t_i is greater than a preset threshold, the search result is sufficiently related to the problem string q and is analysed; otherwise, processing of this result ends and the next query result is analysed;
If the similarity between the problem string q and the title t_i is greater than the preset threshold, the specific processing is as follows:
The abstract part a_i of the query result is split at full stops "." into m sentences (denoted a_{i,j}, j = 1, ..., m). For each sentence a_{i,j}, the word frequency tf, i.e. the number of occurrences, of each centre word is counted (centre words are the words that remain after stop words, high-frequency words and symbols, such as "I", are excluded). The tf values of the centre words in all m sentences are accumulated to obtain the tf of all centre words in a_i.
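The unweighted counting described above can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the stop-word list is a hypothetical placeholder, and tokens are taken with a simple regular expression, whereas a real system for Chinese text would need a word segmenter.

```python
import re
from collections import Counter
from typing import List

# Hypothetical stop-word list; the patent does not enumerate one.
STOP_WORDS = {"i", "the", "a", "of", "is"}


def split_sentences(abstract: str) -> List[str]:
    """Split an abstract a_i into sentences a_{i,j} at full stops."""
    parts = re.split(r"[。.]", abstract)
    return [p.strip() for p in parts if p.strip()]


def centre_words(sentence: str) -> List[str]:
    """Keep the tokens that are not stop words (symbols are dropped by \\w+)."""
    tokens = re.findall(r"\w+", sentence.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def abstract_tf(abstract: str) -> Counter:
    """Accumulate the tf of every centre word over all sentences of a_i."""
    tf = Counter()
    for sentence in split_sentences(abstract):
        tf.update(centre_words(sentence))
    return tf
```

For the abstract "Hamlet is a play. Shakespeare wrote Hamlet." this yields a count of 2 for "hamlet" and 1 each for "play", "shakespeare" and "wrote".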
Because the centre words with distinctive marks in an abstract are more strongly correlated with the problem string q, weighted calculation can also be used when counting tf, in order to reflect the differing degrees of correlation between centre words and the problem string q and to obtain a more accurate and reasonable tf. For example, if the sentence a_{i,j} contains a word with a distinctive mark, the tf of each centre word in a_{i,j} is accumulated at 3 times the standard weight; if a sentence adjacent to a_{i,j}, before or after it (a_{i,j-1} or a_{i,j+1}), contains a word with a distinctive mark, the tf of each centre word in a_{i,j} is accumulated at 2 times the standard weight; otherwise, the tf of each centre word in a_{i,j} is accumulated at the standard weight, thereby obtaining the weighted word frequency of each centre word in a_{i,j}.
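The 3x / 2x / 1x weighting rule can be sketched as below. The function name `weighted_tf` is hypothetical, the standard weight is assumed to be 1, and each sentence is assumed to be already reduced to its list of centre words.

```python
from collections import Counter
from typing import List, Set


def weighted_tf(sentences: List[List[str]], marked: Set[str]) -> Counter:
    """Accumulate weighted tf over the sentences a_{i,1..m} of one abstract.

    sentences: centre words of each sentence, in order.
    marked:    words carrying the distinctive mark.
    """
    has_mark = [any(w in marked for w in s) for s in sentences]
    tf = Counter()
    for j, words in enumerate(sentences):
        prev_marked = j > 0 and has_mark[j - 1]
        next_marked = j + 1 < len(sentences) and has_mark[j + 1]
        if has_mark[j]:
            weight = 3          # the sentence itself contains a marked word
        elif prev_marked or next_marked:
            weight = 2          # an adjacent sentence contains a marked word
        else:
            weight = 1          # standard weight (assumed to be 1)
        for w in words:
            tf[w] += weight
    return tf
```

With sentences [["hamlet", "play"], ["shakespeare", "wrote"], ["london"]] and the marked set {"hamlet"}, the first sentence is counted at weight 3, the second at weight 2 and the third at weight 1.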
When the word frequency counting is complete, or when the similarity between the problem string q and the title t_i is less than or equal to the preset threshold, the analysis of this query result ends and the next query result is processed, i.e. i = i + 1; the above process is repeated until all n query results have been processed. The similarity between the problem string q and the title t_i may be calculated with any existing algorithm for the similarity between two texts, such as term proximity scoring.
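The title filter can be illustrated with any text-similarity measure. The sketch below deliberately swaps in Jaccard overlap of token sets as a simple stand-in; it is not term proximity scoring, and the threshold value 0.5 and the function names are assumptions made for the example.

```python
import re
from typing import List, Tuple


def jaccard(a: str, b: str) -> float:
    """Jaccard overlap of the token sets of two texts (stand-in similarity)."""
    ta = set(re.findall(r"\w+", a.lower()))
    tb = set(re.findall(r"\w+", b.lower()))
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def filter_results(
    q: str,
    results: List[Tuple[str, str]],  # (title t_i, abstract a_i) pairs
    threshold: float = 0.5,          # hypothetical preset threshold
) -> List[Tuple[str, str]]:
    """Keep only the results whose title similarity to q exceeds the threshold."""
    return [(t, a) for t, a in results if jaccard(q, t) > threshold]
```

The comparison is strict, matching the text: a result whose similarity is exactly equal to the threshold is skipped.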
Step 103: according to the tf of each centre word obtained by the above process and the inverse document frequency (idf) of the centre word counted in advance, calculate the word weight W of every centre word, where W = tf*idf; sort the word weights W of the centre words from largest to smallest, and determine the centre word with the largest word weight W as the answer word.
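Step 103 amounts to a multiply-and-sort. A minimal sketch follows; the function names are hypothetical, and giving a centre word that is missing from the idf table a default idf of 0 is an assumption, since the text does not say how unseen words are handled.

```python
from typing import Dict, List, Tuple


def rank_centre_words(
    tf: Dict[str, float],
    idf: Dict[str, float],
    default_idf: float = 0.0,  # assumption: words absent from the idf table get 0
) -> List[Tuple[str, float]]:
    """Return (centre word, W) pairs sorted by W = tf * idf, largest first."""
    weights = {w: f * idf.get(w, default_idf) for w, f in tf.items()}
    return sorted(weights.items(), key=lambda kv: kv[1], reverse=True)


def pick_answer_word(tf: Dict[str, float], idf: Dict[str, float]) -> str:
    """Return the centre word with the largest word weight W."""
    ranked = rank_centre_words(tf, idf)
    if not ranked:
        raise ValueError("no centre words to rank")
    return ranked[0][0]
```

A frequent word such as "hamlet" can still lose to a rarer word such as "shakespeare" when its idf is low enough, which is the point of multiplying by idf.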
The inverse document frequency is the reciprocal of the document frequency, and the document frequency is the number of documents in which a given word appears. It can be counted in advance from text collected from the internet; the collection scope is arbitrary, for example text may be collected from specific websites or communities, or directly from the Ask-Answer Community that provides the automatic question answering.
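The pre-computed idf table can be sketched as below, following the definition in the text (the plain reciprocal of the document frequency) rather than the logarithmic form log(N/df) that is common elsewhere. The function name and the assumption that documents arrive already tokenized are illustrative.

```python
from collections import Counter
from typing import Dict, List


def inverse_document_frequency(documents: List[List[str]]) -> Dict[str, float]:
    """Compute idf = 1 / df, where df is the number of documents containing a word."""
    df = Counter()
    for doc in documents:
        df.update(set(doc))  # count each word at most once per document
    return {word: 1.0 / count for word, count in df.items()}
```

Because the table depends only on the collected corpus, it can be built once offline and reused for every incoming problem string.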
Step 104: determine, according to the determined answer word, the automatic question-answering answer corresponding to the problem string input by the user terminal.
The specific steps are: among the abstracts of the n query results, find the first s abstracts in which the answer word appears (s is any integer greater than or equal to 1, for example 2); split each of these s abstracts at full stops "." into several sentences; then find among these sentences the sentence that contains the answer word and the largest number of centre words of the user's problem string q, and use it as the answer of the automatic question answering. Of course, it is also possible to directly determine a sentence containing the answer word, or an abstract containing the answer word, as the automatic question-answering answer.
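The answer selection of step 104 can be sketched as follows. This reflects one reading of the text, stated here as an assumption: only sentences that contain the answer word are candidates, and among them the one with the most occurrences of centre words from q wins. The function names and the regular-expression tokenization are illustrative.

```python
import re
from typing import List, Optional, Set


def _tokens(text: str) -> List[str]:
    return re.findall(r"\w+", text.lower())


def _sentences(abstract: str) -> List[str]:
    return [p.strip() for p in re.split(r"[。.]", abstract) if p.strip()]


def select_answer(
    abstracts: List[str],      # abstracts of the n query results, in rank order
    answer_word: str,
    question_words: Set[str],  # centre words of the problem string q
    s: int = 2,
) -> Optional[str]:
    """Pick the sentence with the answer word and the most centre words of q."""
    first_s = [a for a in abstracts if answer_word in _tokens(a)][:s]
    best, best_score = None, -1
    for abstract in first_s:
        for sentence in _sentences(abstract):
            words = _tokens(sentence)
            if answer_word not in words:
                continue
            score = sum(1 for w in words if w in question_words)
            if score > best_score:  # ties keep the earlier sentence
                best, best_score = sentence, score
    return best
```

The function returns None when no abstract contains the answer word, which leaves it to the caller to decide on a fallback.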
In addition, the present invention also provides an automatic question-answering device. As shown in Fig. 2, the device includes:
Question and answer data acquisition module 201, configured to obtain relevant existing user question-and-answer data according to the problem string input by the user;
Word frequency statistics module 202, configured to count the word frequency of the centre words in the abstract part of the existing user question-and-answer data;
Answer word determining module 203, configured to calculate the word weight of each centre word according to the word frequency of each centre word and the inverse document frequency of each centre word counted in advance, and to determine the centre word with the largest word weight as the answer word;
Automatic question answering answer determining module 204, configured to determine the answer of the automatic question answering according to the answer word.
The specific structure of the question and answer data acquisition module 201 is shown in Fig. 3 and includes:
Retrieval unit 301, configured to use the problem string as a retrieval string and input it into the search engine of the Ask-Answer Community;
Acquiring unit 302, configured to obtain the query results corresponding to the retrieval string, each query result including a title part and an abstract part with distinctive marks.
The word frequency statistics module 202 is shown in Fig. 4 and includes:
Cutting unit 401, configured to split the abstract part of each query result into sentences at full stops;
Statistic unit 402, configured to count the word frequency of each centre word in each sentence produced by the cutting unit 401;
Summing elements 403, configured to accumulate the word frequencies of the centre words counted by the statistic unit 402 in all sentences, to obtain the word frequencies of all centre words in the abstract;
Control unit 404, configured to control the cutting unit 401, the statistic unit 402 and the summing elements 403 to count, one by one, the centre word frequencies of the abstract part of each query result until all query results have been counted.
The summing elements 403 are shown in Fig. 5 and include:
Mark judgment sub-unit 501, configured to determine whether the sentences produced by the cutting unit 401 contain distinctive marks;
Weight accumulation sub-unit 502, configured to accumulate word frequencies according to the judgment of the mark judgment sub-unit 501: if a sentence contains a word with a distinctive mark, the word frequency of each centre word in that sentence is accumulated at 3 times the standard weight; if the sentence immediately before or after the sentence contains a word with a distinctive mark, the word frequency of each centre word in the sentence is accumulated at 2 times the standard weight; otherwise, the word frequency of each centre word in the sentence is accumulated at the standard weight, thereby obtaining the weighted word frequencies of all centre words in the sentence.
As shown in Fig. 4, in another embodiment the word frequency statistics module 202 may further include:
Similarity comparison unit 405, configured to compare the similarity between the title part of each query result and the problem string;
The control unit 404 is further configured to: if the similarity between the title of the current query result and the problem string is greater than a preset threshold, control the cutting unit, the statistic unit and the summing elements to execute the step of counting the centre word frequencies; otherwise, skip the step of counting the centre word frequencies for the current query result.
The answer word determining module 203 is shown in Fig. 6 and includes:
Word weight calculation unit 601, configured to calculate the word weight of each centre word according to the formula: word weight of the centre word = word frequency of the centre word × inverse document frequency of the centre word;
Answer word determination unit 602, configured to determine the centre word with the largest word weight as the answer word.
The automatic question answering answer determining module 204 is shown in Fig. 7 and includes:
Abstract acquiring unit 701, configured to find, among the abstracts of the query results, the first s abstracts in which the answer word appears, s being an integer greater than or equal to 1;
Abstract cutting unit 702, configured to split each of the s abstracts into sentences at full stops;
Answer determination unit 703, configured to find, among the sentences produced by the abstract cutting unit 702, the sentence that contains the answer word and the largest number of centre words of the user's problem string, as the automatic question-answering answer corresponding to the problem string.
As can be seen from the above embodiments, the automatic question-answering method and device of the present invention make full use of the existing user question-and-answer data of the Ask-Answer Community. They need neither establish a question-and-answer knowledge base nor restrict the knowledge domain of the user's question; instead, using parameters such as word frequency, inverse document frequency and text similarity, they find in the existing question-and-answer data the answer most relevant to the question raised by the user, realizing fully automatic answering. In addition, the present invention can also be used for semantic expansion of general questions or text strings, which can in turn be used for classification, search and the like.
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (14)

1. An automatic question-answering method, characterized in that the method includes:
Obtaining relevant existing user question-and-answer data according to the problem string input by the user terminal;
Counting the word frequency of the centre words in the abstract part of the existing user question-and-answer data; wherein, if the abstract part contains a word with a distinctive mark, weighted calculation is used when counting the word frequency;
Calculating the word weight of each centre word according to the word frequency of each centre word and the inverse document frequency of each centre word counted in advance, and determining the centre word with the largest word weight as the answer word;
Determining, according to the answer word, the answer of the automatic question answering corresponding to the problem string.
2. The automatic question-answering method according to claim 1, characterized in that obtaining relevant existing user question-and-answer data according to the problem string input by the user terminal includes:
Using the problem string as a retrieval string and inputting it into the search engine of the Ask-Answer Community to obtain the query results corresponding to the retrieval string, each query result including a title part and an abstract part with distinctive marks.
3. The automatic question-answering method according to claim 2, characterized in that counting the word frequency of the centre words in the abstract part of the existing user question-and-answer data includes:
Counting, one by one, the centre word frequencies of the abstract part of each query result until all query results have been counted;
Wherein, for each query result, the abstract part is split into sentences at full stops, the word frequency of each centre word is counted for each sentence, and the word frequencies of the centre words in all sentences are accumulated to obtain the word frequencies of all centre words in the abstract.
4. The automatic question-answering method according to claim 3, characterized in that accumulating the word frequencies of the centre words in all sentences to obtain the word frequencies of all centre words in the abstract includes:
If a sentence contains a word with a distinctive mark, the word frequency of each centre word in that sentence is accumulated at 3 times the standard weight; if the sentence immediately before or after the sentence contains a word with a distinctive mark, the word frequency of each centre word in the sentence is accumulated at 2 times the standard weight; otherwise, the word frequency of each centre word in the sentence is accumulated at the standard weight, thereby obtaining the weighted word frequencies of all centre words in the sentence.
5. The automatic question-answering method according to claim 3, characterized in that counting, one by one, the centre word frequencies of the abstract part of each query result until all query results have been counted includes:
Comparing the similarity between the title part of each query result and the problem string; if the similarity between the title of the current query result and the problem string is greater than a preset threshold, executing the step of counting the centre word frequencies; otherwise, skipping the step of counting the centre word frequencies for the current query result.
6. The automatic question-answering method according to claim 1, characterized in that calculating the word weight of each centre word includes:
Word weight of a centre word = word frequency of the centre word × inverse document frequency of the centre word.
7. The automatic question-answering method according to claim 2, characterized in that determining, according to the answer word, the answer of the automatic question answering corresponding to the problem string includes:
Finding, among the abstracts of the query results, the first s abstracts in which the answer word appears, s being an integer greater than or equal to 1;
Splitting each of the s abstracts into sentences at full stops, and finding among these sentences the sentence that contains the answer word and the largest number of centre words of the user's problem string, as the answer of the automatic question answering corresponding to the problem string.
8. An automatic question-answering device, characterized in that the device includes:
A question-and-answer data acquisition module, configured to obtain relevant existing user question-and-answer data according to the problem string input by the user terminal;
A word frequency statistics module, configured to count the word frequency of the centre words of the abstract part of the existing user question-and-answer data; wherein, if the abstract part contains a word with a distinctive mark, a weighted calculation is used when counting the word frequency;
An answer word determining module, configured to calculate the word weight of each centre word according to the word frequency of each centre word and the inverse document frequency of each centre word counted in advance, and to determine the centre word with the largest word weight as the answer word;
An automatic question-answering answer determining module, configured to determine the answer of the automatic question answering corresponding to the problem string according to the answer word.
9. The automatic question-answering device as claimed in claim 8, characterized in that the question-and-answer data acquisition module includes:
A retrieval unit, configured to input the problem string, as a retrieval string, into the search engine of a question-and-answer community;
An acquiring unit, configured to obtain the query results corresponding to the retrieval string, each query result including a title part and an abstract part with distinctive marks.
10. The automatic question-answering device as claimed in claim 9, characterized in that the word frequency statistics module includes:
A cutting unit, configured to split, for each query result, its abstract part into sentences at full stops;
A statistic unit, configured to count, for each sentence produced by the cutting unit, the word frequency of each centre word in that sentence;
A summing unit, configured to accumulate the word frequencies of the centre words in all sentences counted by the statistic unit, to obtain the word frequency of all centre words in the abstract;
A control unit, configured to control the cutting unit, the statistic unit and the summing unit to count, one query result at a time, the centre-word word frequency of the abstract part until all query results have been counted.
11. The automatic question-answering device as claimed in claim 10, characterized in that the summing unit includes:
A mark judgment subunit, configured to judge the distinctive marks in the sentences produced by the cutting unit;
A weight accumulation subunit, configured to accumulate word frequencies according to the judgment of the mark judgment subunit: if a sentence contains a word with a distinctive mark, the word frequency of each centre word in that sentence is accumulated at 3 times the standard weight; if a sentence immediately before or after it contains a word with a distinctive mark, the word frequency of each centre word in the sentence is accumulated at 2 times the standard weight; otherwise, the word frequency of each centre word in the sentence is accumulated at the standard weight, so as to obtain the weighted word frequency of all centre words in the sentence.
12. The automatic question-answering device as claimed in claim 10, characterized in that the word frequency statistics module further includes:
A similarity comparing unit, configured to compare the similarity between the title part of each query result and the problem string;
The control unit is further configured to: if the similarity between the title of the current query result and the problem string is greater than a preset threshold, control the cutting unit, the statistic unit and the summing unit to execute the step of counting the centre-word word frequency; otherwise, skip the step of counting the centre-word word frequency for the current query result.
13. The automatic question-answering device as claimed in claim 8, characterized in that the answer word determining module includes:
A word weight calculation unit, configured to calculate the word weight of each centre word according to the formula: word weight of a centre word = word frequency of the centre word × inverse document frequency of the centre word;
An answer word determination unit, configured to determine the centre word with the largest word weight as the answer word.
14. The automatic question-answering device as claimed in claim 9, characterized in that the automatic question-answering answer determining module includes:
An abstract acquiring unit, configured to find, among the abstracts of the query results, the first s abstracts in which the answer word appears, where s is an integer greater than or equal to 1;
An abstract cutting unit, configured to divide each of the s abstracts into multiple sentences at full stops;
An answer determination unit, configured to find, among the sentences produced by the abstract cutting unit, the sentence in which the answer word and the centre words of the user's problem string appear the greatest number of times, as the answer of the automatic question answering corresponding to the problem string.
CN201210128360.0A 2012-04-27 2012-04-27 A kind of automatic question-answering method and device Active CN103377245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210128360.0A CN103377245B (en) 2012-04-27 2012-04-27 A kind of automatic question-answering method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201210128360.0A CN103377245B (en) 2012-04-27 2012-04-27 A kind of automatic question-answering method and device

Publications (2)

Publication Number Publication Date
CN103377245A CN103377245A (en) 2013-10-30
CN103377245B true CN103377245B (en) 2018-09-11

Family

ID=49462371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210128360.0A Active CN103377245B (en) 2012-04-27 2012-04-27 A kind of automatic question-answering method and device

Country Status (1)

Country Link
CN (1) CN103377245B (en)

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104375977B (en) * 2013-08-14 2018-11-23 腾讯科技(深圳)有限公司 The processing method and processing device of reply message in Ask-Answer Community
CN104933097B (en) * 2015-05-27 2019-04-16 百度在线网络技术(北京)有限公司 A kind of data processing method and device for retrieval
CN106610932A (en) * 2015-10-27 2017-05-03 中兴通讯股份有限公司 Corpus processing method and device and corpus analyzing method and device
CN105893476B (en) * 2016-03-29 2019-08-16 上海智臻智能网络科技股份有限公司 Intelligent answer method, knowledge base optimization method and device, Intelligence repository
CN105893535B (en) * 2016-03-31 2019-08-02 上海智臻智能网络科技股份有限公司 Intelligent answer method, knowledge base optimization method and device, Intelligence repository
CN108073664B (en) * 2016-11-11 2021-08-31 北京搜狗科技发展有限公司 Information processing method, device, equipment and client equipment
CN108306864B (en) * 2018-01-12 2021-02-26 深圳壹账通智能科技有限公司 Network data detection method and device, computer equipment and storage medium
CN108256056A (en) * 2018-01-12 2018-07-06 广州杰赛科技股份有限公司 Intelligent answer method and system
CN109002434A (en) * 2018-05-31 2018-12-14 青岛理工大学 Customer service question and answer matching method, server and storage medium
CN110096567B (en) * 2019-03-14 2020-12-25 中国科学院自动化研究所 QA knowledge base reasoning-based multi-round dialogue reply selection method and system
CN112101005B (en) * 2020-04-02 2022-08-30 上海迷因网络科技有限公司 Method for generating and dynamically adjusting quick expressive force test questions

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1928864A (en) * 2006-09-22 2007-03-14 浙江大学 FAQ based Chinese natural language ask and answer method
CN101071424A (en) * 2006-06-23 2007-11-14 腾讯科技(深圳)有限公司 Personalized information push system and method
CN101174273A (en) * 2007-12-04 2008-05-07 清华大学 News event detecting method based on metadata analysis
CN101315624A (en) * 2007-05-29 2008-12-03 阿里巴巴集团控股有限公司 Text subject recommending method and device
CN101520802A (en) * 2009-04-13 2009-09-02 腾讯科技(深圳)有限公司 Question-answer pair quality evaluation method and system
CN101593206A (en) * 2009-06-25 2009-12-02 腾讯科技(深圳)有限公司 Searching method and device based on answer in the question and answer interaction platform

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3882048B2 (en) * 2003-10-17 2007-02-14 独立行政法人情報通信研究機構 Question answering system and question answering processing method

Also Published As

Publication number Publication date
CN103377245A (en) 2013-10-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
ASS Succession or assignment of patent right

Owner name: SHENZHEN SHIJI LIGHT SPEED INFORMATION TECHNOLOGY

Free format text: FORMER OWNER: TENGXUN SCI-TECH (SHENZHEN) CO., LTD.

Effective date: 20131021

C41 Transfer of patent application or patent right or utility model
COR Change of bibliographic data

Free format text: CORRECT: ADDRESS; FROM: 518044 SHENZHEN, GUANGDONG PROVINCE TO: 518057 SHENZHEN, GUANGDONG PROVINCE

TA01 Transfer of patent application right

Effective date of registration: 20131021

Address after: 518057 Tencent Building, 16, Nanshan District hi tech park, Guangdong, Shenzhen

Applicant after: Shenzhen Shiji Guangsu Information Technology Co., Ltd.

Address before: Shenzhen Futian District City, Guangdong province 518044 Zhenxing Road, SEG Science Park 2 East Room 403

Applicant before: Tencent Technology (Shenzhen) Co., Ltd.

C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant