CN111737553A - Method and device for selecting enterprise associated words and storage medium - Google Patents
- Publication number
- CN111737553A (application number CN202010547677.2A)
- Authority
- CN
- China
- Prior art keywords
- enterprise
- word
- news
- relevant
- words
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Computational Linguistics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method, a device, and a storage medium for selecting enterprise associated words. The method comprises: acquiring preliminarily screened news related to an enterprise, together with its news volume N1; performing association processing on that news using a candidate associated word (an associated word to be selected) to obtain the news volume N2 associated with the candidate; and judging, according to the candidate's association ratio N2/N1, whether the candidate can be used as a formal enterprise associated word. Compared with the prior art, the method monitors and manages the enterprise associated words entered at the front end, selects only those that meet the requirements for associating enterprise news, and avoids the uncontrollable effects of using front-end entries directly.
Description
Technical Field
The invention relates to the field of Internet technology, and in particular to a method, a device, and a storage medium for selecting enterprise associated words.
Background
In the era of big data, the volume of news text keeps growing. To collect the news related to a given enterprise, associated words are usually chosen and used to measure the degree of association between each news text and the enterprise, so that a batch of related news can be screened out.
The choice of associated words is therefore critical: a wrong associated word can have an uncontrollable effect on which news is shown as associated.
Disclosure of Invention
The invention aims to provide a method, a device, and a storage medium for selecting enterprise associated words.
To achieve this object, an embodiment of the present invention provides a method for selecting enterprise associated words, the method comprising:
acquiring preliminarily screened news related to an enterprise, together with its news volume N1;
performing association processing on the news related to the enterprise using a candidate associated word (an associated word to be selected) to obtain the news volume N2 associated with the candidate associated word;
and judging, according to the association ratio of the candidate associated word, whether it can be used as a formal enterprise associated word, where the association ratio is N2/N1.
As a further improvement of an embodiment of the present invention, judging whether the candidate associated word can be used as a formal enterprise associated word according to its association ratio specifically includes:
if the association ratio of the candidate associated word is below a predetermined lower threshold, judging that the candidate associated word cannot be used as a formal enterprise associated word.
As a further improvement of an embodiment of the present invention, the predetermined lower threshold is 0.1%.
As a further improvement of an embodiment of the present invention, the judging further includes:
if the association ratio of the candidate associated word is greater than or equal to the predetermined lower threshold, sampling and checking the news obtained by the association processing, and, if more than a predetermined proportion of the sampled news is related to the enterprise, judging that the candidate associated word can be used as a formal enterprise associated word.
As a further improvement of an embodiment of the present invention, performing association processing on the news related to the enterprise using the candidate associated word specifically includes:
treating the candidate associated word as an associated word of the enterprise, calculating its TF-IDF value in each piece of news related to the enterprise, and selecting the news whose TF-IDF value exceeds a set threshold as the news resulting from the association processing.
As a further improvement of an embodiment of the present invention, ElasticSearch recalls all news containing the enterprise associated words to obtain the preliminarily screened news related to the enterprise, where the enterprise associated words include formal enterprise associated words and candidate associated words;
ElasticSearch performs association processing on the news related to the enterprise using the candidate associated word to obtain an association log;
and the association ratio of the candidate associated word is calculated from the association log and used to judge whether the candidate can be used as a formal enterprise associated word, where the association log includes the news volume N1 before association with the candidate associated word and the news volume N2 after association, and the association ratio is N2/N1.
As a further improvement of an embodiment of the present invention, the method further comprises:
taking enterprise associated words entered at the front end as candidate associated words and adding them to a blacklist of the ElasticSearch lexicon;
and, after a candidate associated word is judged usable as a formal enterprise associated word, removing it from the blacklist and adding it to the list of formal enterprise associated words.
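The blacklist-then-promote life cycle described above can be sketched as follows. This is only an illustration: the source does not say how the ElasticSearch lexicon stores its blacklist, so the in-memory sets and the names `WordRegistry`, `enter`, and `promote` are assumptions introduced here.

```python
class WordRegistry:
    """In-memory stand-in (assumed) for the blacklist and the formal word list."""

    def __init__(self):
        self.blacklist = set()   # candidate words awaiting a decision
        self.formal = {}         # enterprise -> set of formal associated words

    def enter(self, word):
        """A word entered at the front end starts out on the blacklist."""
        self.blacklist.add(word)

    def promote(self, enterprise, word):
        """Move an approved candidate from the blacklist to the formal list."""
        if word not in self.blacklist:
            raise KeyError(f"{word!r} is not a pending candidate")
        self.blacklist.remove(word)
        self.formal.setdefault(enterprise, set()).add(word)


registry = WordRegistry()
registry.enter("Product name 1")
registry.enter("Of course")
registry.promote("Company A", "Product name 1")
```

After these calls, "Of course" is still on the blacklist and "Product name 1" appears only in the formal list of Company A.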
As a further improvement of an embodiment of the present invention, the candidate associated word is a product name, brand name, stock abbreviation, or enterprise abbreviation of the enterprise.
To achieve the above object, an embodiment of the present invention provides an electronic device comprising a memory and a processor, where the memory stores a computer program executable on the processor, and the processor, when executing the computer program, implements the steps of any of the above methods for selecting enterprise associated words.
To achieve the above object, an embodiment of the present invention provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any of the above methods for selecting enterprise associated words.
Compared with the prior art, the method monitors and manages the enterprise associated words entered at the front end, selects only those that meet the requirements for associating enterprise news, and avoids the uncontrollable effects of using front-end entries directly.
Drawings
Fig. 1 is a flow chart of a method for selecting an enterprise associated word according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments shown in the drawings. These embodiments are not intended to limit the present invention, and structural, methodological, or functional changes made by those skilled in the art according to these embodiments are included in the scope of the present invention.
Enterprise associated words are used to associate news with an enterprise, and are generally a product name, brand name, stock abbreviation, or enterprise abbreviation of the enterprise. Associated words are usually entered directly into the ElasticSearch lexicon through the front end. However, the effect of most entered words is unknown, and adding them to the lexicon directly may introduce a large number of false associated words (associated words that have no real association with the corresponding enterprise), which directly reduces the accuracy of the news subsequently associated with the enterprise.
Note that ElasticSearch, abbreviated ES, is a distributed, multi-user full-text search server built on the Lucene full-text search engine. ElasticSearch is prior art; many Internet companies and e-commerce platforms in China use it for retrieval and analysis and have solved many practical production problems with it.
Therefore, as shown in Fig. 1, the invention provides a method for selecting enterprise associated words that monitors and manages the enterprise associated words entered at the front end, selects only those that meet the requirements for associating enterprise news, and avoids the uncontrollable effects of using front-end entries directly. The method comprises the following steps:
Step S100: acquire the preliminarily screened news related to an enterprise, together with its news volume N1.
For example, ElasticSearch may be used to recall all news containing the enterprise associated words, which yields the preliminarily screened news related to the enterprise and its news volume N1. Here the enterprise associated words include both formal enterprise associated words and candidate associated words. A formal enterprise associated word is one that has been confirmed to be associated with the enterprise; a candidate associated word (an associated word to be selected) is one obtained directly from front-end entry.
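The recall in step S100 can be pictured as a single boolean query. The sketch below only builds a request body as a Python dictionary and does not contact a server. The field name `content` and the helper `build_recall_query` are hypothetical, and the source does not say which query type is used, so the choice of `match_phrase` clauses is an assumption.

```python
def build_recall_query(formal_words, candidate_words, field="content"):
    """Build a query body matching news that contains any associated word.

    `field` is a hypothetical name for the indexed news text.
    """
    words = list(formal_words) + list(candidate_words)
    return {
        # Ask for an exact hit count so that the total can serve as N1.
        "track_total_hits": True,
        "query": {
            "bool": {
                # A document matches if it contains at least one of the words.
                "should": [{"match_phrase": {field: w}} for w in words],
                "minimum_should_match": 1,
            }
        },
    }


body = build_recall_query(["Company A"], ["Product name 1"])
```

Under these assumptions, the total hit count returned for such a query would play the role of N1.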
Step S200: perform association processing on the news related to the enterprise using the candidate associated word to obtain the news volume N2 associated with the candidate associated word.
Association processing means selecting, by an association algorithm, the news that is related to the associated word. In a preferred embodiment, step S200 includes: treating the candidate associated word as an associated word of the enterprise, calculating its TF-IDF value in each piece of news related to the enterprise, and selecting the news whose TF-IDF value exceeds a set threshold as the news resulting from the association processing.
The TF-IDF value is computed by the TF-IDF (Term Frequency-Inverse Document Frequency) statistical method, which evaluates how important a word is to one document within a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document, and decreases with the frequency with which it appears across the corpus. This is prior art and is not described further here.
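As an illustration of the TF-IDF filtering in step S200, here is a minimal sketch using one common smoothed TF-IDF variant. The source does not specify the exact formula or the threshold, so the smoothing, the threshold of 0.1, the toy corpus, and the function names are all assumptions; news items are represented as lists of tokens.

```python
import math


def tfidf(word, doc, corpus):
    """TF-IDF of `word` in `doc`, where `corpus` is a list of token lists."""
    if not doc:
        return 0.0
    tf = doc.count(word) / len(doc)            # term frequency in this document
    df = sum(1 for d in corpus if word in d)   # number of documents containing the word
    idf = math.log((1 + len(corpus)) / (1 + df)) + 1  # smoothed inverse document frequency
    return tf * idf


def associate(word, corpus, threshold):
    """Return the documents whose TF-IDF for `word` exceeds `threshold`."""
    return [doc for doc in corpus if tfidf(word, doc, corpus) > threshold]


corpus = [
    ["acme", "launches", "widget", "widget"],
    ["market", "closes", "higher"],
    ["acme", "reports", "earnings"],
]
related = associate("widget", corpus, threshold=0.1)
n1, n2 = len(corpus), len(related)
```

In this toy corpus only the first item passes the filter, so `n1` and `n2` correspond to the news volumes before and after association.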
In a specific embodiment, ElasticSearch performs the association processing on the news related to the enterprise using the candidate associated word and produces an association log, which records the news volume N1 before association with a given candidate associated word and the news volume N2 after association.
Step S300: judge, according to the association ratio of the candidate associated word, whether it can be used as a formal enterprise associated word, where the association ratio is N2/N1.
A formal enterprise associated word is one that has been confirmed to be associated with the enterprise. The association ratio is the share of the news volume N2 obtained with a given candidate associated word in the news volume N1 before that word is applied. It reflects the degree of association between the candidate associated word and the corresponding enterprise: the higher the ratio, the stronger the association, and the lower the ratio, the weaker the association. When the association ratio of a candidate associated word is below a predetermined lower threshold (preferably one thousandth, i.e. 0.1%), it can be concluded directly that the candidate is not associated with the corresponding enterprise and therefore cannot be used as a formal enterprise associated word.
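Written as a formula, the rule reads as follows, where the symbols r for the association ratio and θ_low for the predetermined lower threshold are introduced here only for illustration:

```latex
r = \frac{N_2}{N_1}, \qquad
r < \theta_{\text{low}} = 0.001 \;\Longrightarrow\; \text{the candidate is rejected}
```

For instance, using the Company D row of Table 1, r = 17 / 52519 ≈ 0.000324, which is below 0.001, so that candidate is rejected.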
Further, to improve accuracy, for a candidate associated word whose association ratio is greater than or equal to the predetermined lower threshold, the news obtained by the association processing is sampled and checked to decide whether the candidate can be used as a formal enterprise associated word. Specifically, if more than a predetermined proportion of the sampled news is related to the enterprise, for example if more than the predetermined proportion of 50% of 6 sampled news items are related to the enterprise, the candidate associated word is judged usable as a formal enterprise associated word.
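The sampling check can be sketched as below. The callback `is_relevant` stands in for the reviewer's judgment and is hypothetical, as are the function name and the `seed` parameter. The default sample size of 6 and proportion of 50% follow the example in the text.

```python
import random


def passes_sampling(associated_news, is_relevant, sample_size=6,
                    min_share=0.5, seed=None):
    """Sample the associated news and check the share judged relevant.

    `is_relevant` is a hypothetical callback representing the manual check.
    """
    if not associated_news:
        return False
    rng = random.Random(seed)
    k = min(sample_size, len(associated_news))
    sample = rng.sample(associated_news, k)
    relevant = sum(1 for item in sample if is_relevant(item))
    # The text says "exceeding" the proportion, so the comparison is strict.
    return relevant / k > min_share
```

Because the comparison is strict, a sample in which exactly half of the items are relevant does not pass.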
In another embodiment, a candidate associated word whose association ratio exceeds a predetermined upper threshold is selected directly as a formal enterprise associated word.
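Taken together, the lower threshold, the optional upper threshold, and the sampling check amount to a three-way triage. The sketch below is one possible reading: the names `Decision` and `triage` are hypothetical, and the source gives no value for the upper threshold, so `upper` is left optional and any value passed to it is an assumption.

```python
from enum import Enum


class Decision(Enum):
    REJECT = "reject"
    SAMPLE = "needs sampling check"
    ACCEPT = "accept"


def triage(n1, n2, lower=0.001, upper=None):
    """Classify a candidate word by its association ratio N2/N1.

    `upper` is optional because the source states no upper-threshold value.
    """
    if n1 <= 0:
        raise ValueError("N1 must be positive")
    ratio = n2 / n1
    if ratio < lower:
        return Decision.REJECT
    if upper is not None and ratio > upper:
        return Decision.ACCEPT
    return Decision.SAMPLE
```

With the default lower threshold, `triage(52519, 17)` returns `Decision.REJECT`, while `triage(38508, 4295)` returns `Decision.SAMPLE` because no upper threshold is supplied.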
As shown in Table 1 below, in a specific embodiment, the enterprise associated words entered at the front end are first taken as candidate associated words and added to a blacklist of the ElasticSearch lexicon. In Table 1, the candidate associated words of company A to company C are product name 1 (a product name of company A), product name 2 (a product name of company B), and company abbreviation 1 (an abbreviation of company C); the candidates of company D to company F are the remaining words listed in the table.
Then ElasticSearch recalls all news containing the enterprise associated words of a given enterprise to obtain the preliminarily screened news related to that enterprise, where the enterprise associated words include formal enterprise associated words and candidate associated words.
Next, ElasticSearch performs association processing on the news related to the enterprise using the candidate associated word and produces an association log, which records the news volume N1 before association with a given candidate associated word and the news volume N2 after association. As Table 1 shows, the news volumes N1 before association for company A to company F are 38508, 16711, 15672, 52519, 49579, and 47683, and the news volumes N2 after association are 4295, 1834, 4025, 17, 34, and 43.
The association ratio N2/N1 of each candidate associated word is then calculated from the association log and used to judge whether the candidate can be used as a formal enterprise associated word. As Table 1 shows, the association ratios of the candidates of companies D to F are all below one thousandth, so it can be concluded directly that these candidates are not associated with the corresponding companies and cannot be used as their formal enterprise associated words. The association ratios of the candidates of companies A to C are much larger (about 0.11 to 0.26), and after a simple sampling check these candidates can be used as formal enterprise associated words of the corresponding companies.
Finally, the candidate associated words of company A to company C are removed from the blacklist and added to the corresponding lists of formal enterprise associated words.
Candidate associated word | Company | News volume before association (N1) | News volume after association (N2) | Association ratio (N2/N1) |
---|---|---|---|---|
Product name 1 | Company A | 38508 | 4295 | 0.111535 |
Product name 2 | Company B | 16711 | 1834 | 0.109748 |
Company abbreviation 1 | Company C | 15672 | 4025 | 0.256827 |
At one time | Company D | 52519 | 17 | 0.000323692 |
Of course | Company E | 49579 | 34 | 0.000685774 |
Method of producing a composite material | Company F | 47683 | 43 | 0.000901789 |
TABLE 1
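The ratios in Table 1 can be reproduced directly from N1 and N2. The short sketch below recomputes them and applies the one-thousandth lower threshold; the variable names are illustrative only.

```python
# (company, N1 before association, N2 after association), copied from Table 1
rows = [
    ("Company A", 38508, 4295),
    ("Company B", 16711, 1834),
    ("Company C", 15672, 4025),
    ("Company D", 52519, 17),
    ("Company E", 49579, 34),
    ("Company F", 47683, 43),
]
LOWER = 0.001  # predetermined lower threshold (one thousandth)

ratios = {company: n2 / n1 for company, n1, n2 in rows}
rejected = [company for company, r in ratios.items() if r < LOWER]

for company, r in ratios.items():
    print(f"{company}: {r:.6f}")
```

Only the candidates of companies D, E, and F fall below the threshold, which matches the conclusion drawn in the description.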
The invention further provides an electronic device comprising a memory and a processor, where the memory stores a computer program executable on the processor, and the processor, when executing the program, implements the steps of any technical solution of the above method for selecting enterprise associated words.
The invention also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of any technical solution of the above method for selecting enterprise associated words.
It should be understood that, although this description is organized by embodiments, an embodiment does not necessarily contain only a single technical solution. The description is arranged this way only for clarity; those skilled in the art should read it as a whole, and the technical solutions of the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.
The detailed description above only illustrates feasible embodiments of the present invention and is not intended to limit its scope of protection; equivalent embodiments or modifications that do not depart from the technical spirit of the present invention fall within its scope of protection.
Claims (10)
1. A method for selecting enterprise associated words, characterized by comprising the following steps:
acquiring preliminarily screened news related to an enterprise, together with its news volume N1;
performing association processing on the news related to the enterprise using a candidate associated word (an associated word to be selected) to obtain the news volume N2 associated with the candidate associated word;
and judging, according to the association ratio of the candidate associated word, whether it can be used as a formal enterprise associated word, where the association ratio is N2/N1.
2. The method for selecting enterprise associated words according to claim 1, wherein judging whether the candidate associated word can be used as a formal enterprise associated word according to its association ratio specifically comprises:
if the association ratio of the candidate associated word is below a predetermined lower threshold, judging that the candidate associated word cannot be used as a formal enterprise associated word.
3. The method for selecting enterprise associated words according to claim 2, wherein:
the predetermined lower threshold is 0.1%.
4. The method for selecting enterprise associated words according to claim 2, wherein judging whether the candidate associated word can be used as a formal enterprise associated word according to its association ratio further comprises:
if the association ratio of the candidate associated word is greater than or equal to the predetermined lower threshold, sampling and checking the news obtained by the association processing, and, if more than a predetermined proportion of the sampled news is related to the enterprise, judging that the candidate associated word can be used as a formal enterprise associated word.
5. The method for selecting enterprise associated words according to claim 1, wherein performing association processing on the news related to the enterprise using the candidate associated word specifically comprises:
treating the candidate associated word as an associated word of the enterprise, calculating its TF-IDF value in each piece of news related to the enterprise, and selecting the news whose TF-IDF value exceeds a set threshold as the news resulting from the association processing.
6. The method for selecting enterprise associated words according to claim 1, wherein:
ElasticSearch recalls all news containing the enterprise associated words to obtain the preliminarily screened news related to the enterprise, where the enterprise associated words include formal enterprise associated words and candidate associated words;
ElasticSearch performs association processing on the news related to the enterprise using the candidate associated word to obtain an association log;
and the association ratio of the candidate associated word is calculated from the association log and used to judge whether the candidate can be used as a formal enterprise associated word, where the association log includes the news volume N1 before association with the candidate associated word and the news volume N2 after association, and the association ratio is N2/N1.
7. The method for selecting enterprise associated words according to claim 6, wherein the method further comprises:
taking enterprise associated words entered at the front end as candidate associated words and adding them to a blacklist of the ElasticSearch lexicon;
and, after a candidate associated word is judged usable as a formal enterprise associated word, removing it from the blacklist and adding it to the list of formal enterprise associated words.
8. The method for selecting enterprise associated words according to claim 1, wherein:
the candidate associated word is a product name, brand name, stock abbreviation, or enterprise abbreviation of the enterprise.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program executable on the processor, and the processor, when executing the program, implements the steps of the method for selecting enterprise associated words according to any one of claims 1 to 8.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the method for selecting enterprise associated words according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010547677.2A CN111737553A (en) | 2020-06-16 | 2020-06-16 | Method and device for selecting enterprise associated words and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111737553A true CN111737553A (en) | 2020-10-02 |
Family
ID=72649341
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010547677.2A Pending CN111737553A (en) | 2020-06-16 | 2020-06-16 | Method and device for selecting enterprise associated words and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111737553A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116777686A (en) * | 2023-04-19 | 2023-09-19 | 深圳昊通技术有限公司 | Enterprise intellectual property classification early warning method, system and storage medium |
Patent Citations (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101067808A (en) * | 2007-05-24 | 2007-11-07 | 上海大学 | Text key word extracting method |
US20120117082A1 (en) * | 2010-11-05 | 2012-05-10 | Koperda Frank R | Method and system for document classification or search using discrete words |
US20130144874A1 (en) * | 2010-11-05 | 2013-06-06 | Nextgen Datacom, Inc. | Method and system for document classification or search using discrete words |
CN102360358A (en) * | 2011-09-28 | 2012-02-22 | 百度在线网络技术(北京)有限公司 | Keyword recommendation method and system |
CN103218368A (en) * | 2012-01-20 | 2013-07-24 | 深圳市腾讯计算机系统有限公司 | Method and device for discovering hot words |
CN103593350A (en) * | 2012-08-14 | 2014-02-19 | 阿里巴巴集团控股有限公司 | Method and device for recommending promotion keyword price parameters |
CN103971678A (en) * | 2013-01-29 | 2014-08-06 | 腾讯科技(深圳)有限公司 | Method and device for detecting keywords |
CN103942189A (en) * | 2014-03-19 | 2014-07-23 | 百度在线网络技术(北京)有限公司 | Method and device for determining keywords of compositions |
CN106708880A (en) * | 2015-11-16 | 2017-05-24 | 北京国双科技有限公司 | Topic associated word obtaining method and apparatus |
CN105488027A (en) * | 2015-11-30 | 2016-04-13 | 百度在线网络技术(北京)有限公司 | Keyword pushing method and apparatus |
CN108073568A (en) * | 2016-11-10 | 2018-05-25 | 腾讯科技(深圳)有限公司 | keyword extracting method and device |
CN108710664A (en) * | 2018-05-14 | 2018-10-26 | 平安科技(深圳)有限公司 | A kind of hot word analysis method, computer readable storage medium and terminal device |
CN110750682A (en) * | 2018-07-06 | 2020-02-04 | 武汉斗鱼网络科技有限公司 | Title hot word automatic metering method, storage medium, electronic equipment and system |
CN109634983A (en) * | 2018-12-13 | 2019-04-16 | 百度在线网络技术(北京)有限公司 | Recall determination method, apparatus, equipment and the medium of interest point information |
CN109670176A (en) * | 2018-12-19 | 2019-04-23 | 武汉瓯越网视有限公司 | A kind of keyword abstraction method, device, electronic equipment and storage medium |
CN109885753A (en) * | 2019-01-16 | 2019-06-14 | 苏宁易购集团股份有限公司 | A kind of method and device for expanding commercial articles searching and recalling |
CN110489757A (en) * | 2019-08-26 | 2019-11-22 | 北京邮电大学 | A kind of keyword extracting method and device |
Non-Patent Citations (2)
Title |
---|
BRIAN GAWALT: "Discovering word associations in news media via feature selection and sparse classification", 《PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION RETRIEVAL》, pages 211 - 220 * |
吴睿: "面向微博文本的热词分析技术研究", 《中国优秀硕士学位论文全文数据库 信息科技》, pages 138 - 590 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||