
CN111737553A - Method and device for selecting enterprise associated words and storage medium - Google Patents

Publication number
CN111737553A
Authority
CN
China
Prior art keywords
enterprise
word
news
relevant
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010547677.2A
Other languages
Chinese (zh)
Inventor
龚朝辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Longdong Network Technology Co ltd
Original Assignee
Suzhou Longdong Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Longdong Network Technology Co ltd filed Critical Suzhou Longdong Network Technology Co ltd
Priority to CN202010547677.2A priority Critical patent/CN111737553A/en
Publication of CN111737553A publication Critical patent/CN111737553A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method, a device and a storage medium for selecting enterprise associated words. The method comprises the following steps: acquiring preliminarily screened news related to a certain enterprise and its news volume N1; performing association processing on the news related to the enterprise by using an associated word to be selected, to obtain the news volume N2 associated with the associated word to be selected; and judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected, wherein the association ratio is N2/N1. Compared with the prior art, the method can monitor and manage the enterprise associated words entered at the front end, select those that meet the requirements for associating enterprise news, and avoid the uncontrollable effects of directly using associated words entered at the front end.

Description

Method and device for selecting enterprise associated words and storage medium
Technical Field
The invention relates to the field of internet technology, and in particular to a method, a device and a storage medium for selecting enterprise associated words.
Background
In the big data era, the volume and variety of news text keep growing. To obtain the news related to a certain enterprise, associated words are generally selected and used to measure the degree of association between each news text and the enterprise, so that a batch of related news can be screened out.
The selection of associated words is therefore critical: a wrong associated word may have an uncontrollable effect on which news is presented as associated.
Disclosure of Invention
The invention aims to provide a method, a device and a storage medium for selecting enterprise associated words.
In order to achieve one of the above objects, an embodiment of the present invention provides a method for selecting an enterprise associated word, where the method includes:
acquiring preliminarily screened news related to a certain enterprise and its news volume N1;
performing association processing on the news related to the enterprise by using an associated word to be selected, to obtain the news volume N2 associated with the associated word to be selected;
and judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected, wherein the association ratio of the associated word to be selected is N2/N1.
As a further improvement of an embodiment of the present invention, the "judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected" specifically includes:
if the association ratio of the associated word to be selected is lower than a predetermined lower threshold, judging that the associated word to be selected cannot be used as a formal enterprise associated word.
As a further refinement of an embodiment of the invention, the predetermined lower threshold is 0.1%.
As a further improvement of an embodiment of the present invention, the "judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected" further includes:
if the association ratio of the associated word to be selected is greater than or equal to the predetermined lower threshold, checking the news obtained by the association processing by sampling, and if more than a predetermined proportion of the sampled news is related to the enterprise, judging that the associated word to be selected can be used as a formal enterprise associated word.
As a further improvement of an embodiment of the present invention, the "performing association processing on the news related to the enterprise by using an associated word to be selected" specifically includes:
taking the associated word to be selected as an associated word of the enterprise, calculating its TF-IDF value in each piece of news related to the enterprise, and selecting the news whose TF-IDF value is greater than a set threshold as the news obtained by the association processing.
As a further improvement of an embodiment of the present invention, ElasticSearch recalls all news containing the enterprise associated words to obtain the preliminarily screened news related to the enterprise, wherein the enterprise associated words include formal enterprise associated words and associated words to be selected;
ElasticSearch performs association processing on the news related to the enterprise by using the associated word to be selected, to obtain an association log;
and the association ratio of the associated word to be selected is calculated from the association log and used to judge whether the associated word to be selected can be used as a formal enterprise associated word, wherein the association log includes the news volume N1 before association with the associated word to be selected and the news volume N2 after association, and the association ratio is N2/N1.
As a further improvement of an embodiment of the present invention, the method further comprises:
taking the enterprise associated words entered at the front end as associated words to be selected, and adding them to a blacklist of the ElasticSearch thesaurus;
and after an associated word to be selected is judged to be usable as a formal enterprise associated word, removing it from the blacklist and adding it to the list of formal enterprise associated words.
As a further improvement of an embodiment of the present invention, the associated word to be selected is a product name, brand name, stock abbreviation or enterprise abbreviation of the enterprise.
In order to achieve one of the above objects, an embodiment of the present invention provides an electronic device, which includes a memory and a processor, where the memory stores a computer program operable on the processor, and the processor executes the computer program to implement the steps of any one of the above methods for selecting enterprise associated words.
In order to achieve one of the above objects, an embodiment of the present invention provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of any one of the above methods for selecting enterprise associated words.
Compared with the prior art, the method for selecting enterprise associated words can monitor and manage the enterprise associated words entered at the front end, select those that meet the requirements for associating enterprise news, and avoid the uncontrollable effects of directly using associated words entered at the front end.
Drawings
Fig. 1 is a flow chart of a method for selecting an enterprise associated word according to the present invention.
Detailed Description
The present invention will be described in detail below with reference to specific embodiments shown in the drawings. These embodiments are not intended to limit the present invention, and structural, methodological, or functional changes made by those skilled in the art according to these embodiments are included in the scope of the present invention.
The associated words of an enterprise are used to associate news with the enterprise, and are generally a product name, brand name, stock abbreviation or enterprise abbreviation of the enterprise. Associated words are generally entered directly into the ElasticSearch thesaurus through the front end. However, the effect of most entered associated words is unknown, and adding them directly to the thesaurus may introduce a large number of false associated words (associated words that have no association relation with the corresponding enterprise), which directly reduces the accuracy of the enterprise news associated afterwards.
It should be noted that ElasticSearch, abbreviated as ES, is a search server built on the full-text search engine Lucene that provides distributed, multi-user full-text search. ElasticSearch is prior art; in China, many internet companies and e-commerce platforms use it for retrieval and analysis to solve practical production problems.
Therefore, as shown in fig. 1, the invention provides a method for selecting enterprise associated words, which can monitor and manage enterprise associated words entered at the front end, select enterprise associated words meeting requirements to associate enterprise news, and avoid uncontrollable influence caused by directly using enterprise associated words entered at the front end. The method comprises the following steps:
Step S100: acquiring preliminarily screened news related to a certain enterprise and its news volume N1.
For example, ElasticSearch may be used to recall all news containing the enterprise associated words, yielding the preliminarily screened news related to the enterprise and its news volume N1, where the enterprise associated words include formal enterprise associated words and associated words to be selected. A formal enterprise associated word is an associated word that has been confirmed to have an association relation with the enterprise; an associated word to be selected is an associated word obtained directly from front-end entry.
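The recall in step S100 could be expressed as an ElasticSearch boolean query that matches news containing any associated word of the enterprise. The sketch below only builds the request body and does not contact a server. The helper name `build_recall_query` and the field name `content` are assumptions for illustration; the patent does not describe the index layout or the exact query.

```python
def build_recall_query(formal_words, candidate_words, field="content"):
    """Build a search request body matching news that contains any associated word.

    The helper name and the default field name are hypothetical; they are
    not specified in the patent.
    """
    # Merge both word lists, dropping duplicates while keeping order.
    words = list(dict.fromkeys(list(formal_words) + list(candidate_words)))
    if not words:
        raise ValueError("at least one associated word is required")
    return {
        # Request an exact total hit count, which can serve as N1.
        "track_total_hits": True,
        "query": {
            "bool": {
                # A news item is recalled if it contains at least one word.
                "should": [{"match_phrase": {field: w}} for w in words],
                "minimum_should_match": 1,
            }
        },
    }
```

The returned dictionary could then be sent as the body of a search request; the total hit count of the response would play the role of N1.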
Step S200: performing association processing on the news related to the enterprise by using an associated word to be selected, to obtain the news volume N2 associated with the associated word to be selected.
The association processing refers to selecting the news related to the associated word through an association algorithm. In a preferred embodiment, step S200 includes: taking the associated word to be selected as an associated word of the enterprise, calculating its TF-IDF value in each piece of news related to the enterprise, and selecting the news whose TF-IDF value is greater than a set threshold as the news obtained by the association processing.
The TFIDF value is calculated with the TF-IDF (Term Frequency-Inverse Document Frequency) statistical method, which evaluates how important a word is to one document in a document set or corpus. The importance of a word increases in proportion to the number of times it appears in the document and decreases with the frequency with which it appears across the corpus. This is prior art and is not described further here.
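As a rough illustration of the TF-IDF filtering in step S200, the following sketch scores one associated word to be selected against each news item and keeps the items above a threshold. The patent does not state which TF-IDF variant or threshold is used, so the smoothed IDF formula, the function name `tfidf_filter`, and the assumption that news items are already tokenized are all illustrative choices.

```python
import math


def tfidf_filter(candidate, tokenized_news, threshold):
    """Return the news items whose TF-IDF score for `candidate` exceeds `threshold`.

    tokenized_news: list of token lists, one per news item in the N1 set.
    """
    n_docs = len(tokenized_news)
    doc_freq = sum(1 for tokens in tokenized_news if candidate in tokens)
    # Smoothed IDF (an assumed variant): stays positive and avoids division by zero.
    idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
    selected = []
    for tokens in tokenized_news:
        if not tokens:
            continue  # an empty item cannot contain the word
        tf = tokens.count(candidate) / len(tokens)
        if tf * idf > threshold:
            selected.append(tokens)
    return selected
```

The length of the returned list corresponds to N2, and the length of the input list to N1.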
In a specific embodiment, ElasticSearch performs association processing on the news related to the enterprise by using the associated word to be selected, to obtain an association log, where the association log includes the news volume N1 before association with a given associated word to be selected and the news volume N2 after association.
Step S300: judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected, wherein the association ratio of the associated word to be selected is N2/N1.
The formal enterprise associated word refers to an associated word that has been determined to have an association relation with the enterprise. The association ratio is the proportion of the news volume N2 obtained after association with a given associated word to be selected to the news volume N1 before association. The association ratio reflects the degree of association between the associated word to be selected and the corresponding enterprise: the higher the ratio, the stronger the association, and the lower the ratio, the weaker the association. When the association ratio of an associated word to be selected is lower than a predetermined lower threshold (preferably 0.1%, that is, one in a thousand), it can be directly determined that the word is not associated with the corresponding enterprise, that is, it cannot be used as a formal enterprise associated word.
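The ratio test of step S300 is simple arithmetic, sketched below. The function names are hypothetical; the 0.1% default is the preferred lower threshold stated in the text.

```python
LOWER_THRESHOLD = 0.001  # 0.1%, the preferred lower threshold in the text


def association_ratio(n1, n2):
    """Return N2/N1, the share of the recalled news that the word associates."""
    if n1 <= 0:
        raise ValueError("N1 must be positive")
    return n2 / n1


def is_rejected(n1, n2, lower=LOWER_THRESHOLD):
    """True if the word can be ruled out directly, without a sampling check."""
    return association_ratio(n1, n2) < lower
```

For instance, a word with N1 = 52519 and N2 = 17 falls below the threshold and is rejected, while one with N1 = 38508 and N2 = 4295 is not.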
Further, to improve accuracy, for an associated word to be selected whose association ratio is greater than or equal to the predetermined lower threshold, the news obtained by the association processing is checked by sampling to determine whether the word can be used as a formal enterprise associated word. Specifically, if more than a predetermined proportion of the sampled news is related to the enterprise, for example more than 50% of 6 sampled news items, the associated word to be selected is judged to be usable as a formal enterprise associated word.
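The sampling check might look like the sketch below. The patent does not say how the sample is drawn or who judges relevance, so uniform random sampling and the `is_relevant` callback (standing in for a reviewer's judgment) are assumptions; the sample size of 6 and the 50% proportion follow the example in the text.

```python
import random


def passes_sampling(associated_news, is_relevant, sample_size=6,
                    min_share=0.5, seed=None):
    """Check whether more than `min_share` of a random sample is relevant.

    is_relevant: hypothetical callback standing in for a reviewer's judgment.
    """
    if not associated_news:
        return False
    rng = random.Random(seed)
    k = min(sample_size, len(associated_news))
    sample = rng.sample(list(associated_news), k)
    relevant = sum(1 for item in sample if is_relevant(item))
    # "Exceeding" the predetermined proportion is read as strictly greater.
    return relevant / k > min_share
```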
In another embodiment, an associated word to be selected whose association ratio exceeds a predetermined upper threshold is directly selected as a formal enterprise associated word.
As shown in table 1 below, in a specific embodiment, an enterprise associated word entered at the front end is first taken as an associated word to be selected and added to a blacklist of the ElasticSearch thesaurus. In table 1, the associated words to be selected for company A to company F are, respectively, product name 1 (a product name of company A), product name 2 (a product name of company B), company abbreviation 1 (the abbreviation of company C), and the words listed in table 1 for companies D to F.
Then, ElasticSearch recalls all news containing the enterprise associated words of a certain enterprise to obtain the preliminarily screened news related to the enterprise, wherein the enterprise associated words include formal enterprise associated words and associated words to be selected.
Next, ElasticSearch performs association processing on the news related to the enterprise by using the associated word to be selected, to obtain an association log, wherein the association log includes the news volume N1 before association with a given associated word to be selected and the news volume N2 after association. As can be seen from table 1, the news volumes N1 before association for company A to company F are 38508, 16711, 15672, 52519, 49579 and 47683, respectively. The news volumes N2 after association for company A to company F are 4295, 1834, 4025, 17, 34 and 43, respectively.
The association ratio of each associated word to be selected is then calculated from the association log and used to judge whether the word can be used as a formal enterprise associated word, wherein the association ratio is N2/N1. As can be seen from table 1, the association ratios of the associated words to be selected of companies D to F are all lower than one in a thousand, so it can be directly determined that these words are not associated with the corresponding companies, that is, they cannot be used as formal enterprise associated words of the three companies. The association ratios of the associated words to be selected of company A to company C are relatively large, and after a simple sampling check these words can be used as formal enterprise associated words of the corresponding companies.
Finally, the associated words to be selected of company A to company C are removed from the blacklist and added to the corresponding lists of formal enterprise associated words.
Associated word to be selected | Company | News volume before association (N1) | News volume after association (N2) | Association ratio (N2/N1)
Product name 1 | Company A | 38508 | 4295 | 0.111535
Product name 2 | Company B | 16711 | 1834 | 0.109748
Company abbreviation 1 | Company C | 15672 | 4025 | 0.256827
At one time | Company D | 52519 | 17 | 0.000323692
Of course | Company E | 49579 | 34 | 0.000685774
Method of producing a composite material | Company F | 47683 | 43 | 0.000901789
TABLE 1
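The workflow of this embodiment, from blacklist entry to promotion, can be sketched with the counts from Table 1. The function names and the use of in-memory sets for the blacklist and the formal list are assumptions; how ElasticSearch stores these lists is not described in the text. The sketch also assumes that rejected words simply stay in the blacklist, which the text does not state.

```python
LOWER_THRESHOLD = 0.001  # 0.1%, the preferred lower threshold

# (company, N1, N2) from Table 1; each company has one associated word to be
# selected, so the company name stands in for that word here.
TABLE_1 = [
    ("Company A", 38508, 4295),
    ("Company B", 16711, 1834),
    ("Company C", 15672, 4025),
    ("Company D", 52519, 17),
    ("Company E", 49579, 34),
    ("Company F", 47683, 43),
]


def triage(rows, lower=LOWER_THRESHOLD):
    """Split candidates into those rejected outright and those left for sampling."""
    rejected, to_sample = [], []
    for company, n1, n2 in rows:
        ratio = n2 / n1
        (rejected if ratio < lower else to_sample).append(company)
    return rejected, to_sample


def promote(candidate, blacklist, formal_list):
    """Move a candidate that passed the sampling check to the formal list."""
    blacklist.discard(candidate)
    formal_list.add(candidate)


blacklist = {company for company, _, _ in TABLE_1}  # every candidate starts here
formal_list = set()
rejected, to_sample = triage(TABLE_1)
for candidate in to_sample:  # assumes each one passed the sampling check
    promote(candidate, blacklist, formal_list)
```

With these counts, the candidates of companies D to F are rejected by the ratio test alone, and those of companies A to C move to the formal list once the sampling check is assumed to pass.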
The invention further provides an electronic device, which comprises a memory and a processor, wherein the memory stores a computer program capable of running on the processor, and the processor, when executing the program, implements the steps of any one of the technical solutions of the above method for selecting enterprise associated words.
The invention also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of any one of the technical solutions of the above method for selecting enterprise associated words.
It should be understood that although this specification is organized by embodiments, an embodiment does not necessarily contain only a single technical solution. The description is arranged this way only for clarity; those skilled in the art should take the specification as a whole, and the technical solutions in the embodiments may be appropriately combined to form other embodiments that those skilled in the art can understand.
The detailed description above is only a specific description of feasible embodiments of the present invention and is not intended to limit its scope of protection; equivalent embodiments or modifications that do not depart from the technical spirit of the present invention shall fall within its scope of protection.

Claims (10)

1. A method for selecting enterprise associated words is characterized by comprising the following steps:
acquiring preliminarily screened news related to a certain enterprise and its news volume N1;
performing association processing on the news related to the enterprise by using an associated word to be selected, to obtain the news volume N2 associated with the associated word to be selected;
and judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected, wherein the association ratio of the associated word to be selected is N2/N1.
2. The method for selecting enterprise associated words according to claim 1, wherein the "judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected" specifically comprises:
if the association ratio of the associated word to be selected is lower than a predetermined lower threshold, judging that the associated word to be selected cannot be used as a formal enterprise associated word.
3. The method for selecting enterprise associated words according to claim 2, wherein:
the predetermined lower threshold is 0.1%.
4. The method for selecting enterprise associated words according to claim 2, wherein the "judging whether the associated word to be selected can be used as a formal enterprise associated word according to the association ratio of the associated word to be selected" further comprises:
if the association ratio of the associated word to be selected is greater than or equal to the predetermined lower threshold, checking the news obtained by the association processing by sampling, and if more than a predetermined proportion of the sampled news is related to the enterprise, judging that the associated word to be selected can be used as a formal enterprise associated word.
5. The method for selecting enterprise associated words according to claim 1, wherein the "performing association processing on the news related to the enterprise by using an associated word to be selected" specifically comprises:
taking the associated word to be selected as an associated word of the enterprise, calculating its TF-IDF value in each piece of news related to the enterprise, and selecting the news whose TF-IDF value is greater than a set threshold as the news obtained by the association processing.
6. The method for selecting enterprise associated words according to claim 1, wherein:
ElasticSearch recalls all news containing the enterprise associated words to obtain the preliminarily screened news related to the enterprise, wherein the enterprise associated words include formal enterprise associated words and associated words to be selected;
ElasticSearch performs association processing on the news related to the enterprise by using the associated word to be selected, to obtain an association log;
and the association ratio of the associated word to be selected is calculated from the association log and used to judge whether the associated word to be selected can be used as a formal enterprise associated word, wherein the association log includes the news volume N1 before association with the associated word to be selected and the news volume N2 after association, and the association ratio is N2/N1.
7. The method for selecting enterprise associated words according to claim 6, wherein the method further comprises:
taking the enterprise associated words entered at the front end as associated words to be selected, and adding them to a blacklist of the ElasticSearch thesaurus;
and after an associated word to be selected is judged to be usable as a formal enterprise associated word, removing it from the blacklist and adding it to the list of formal enterprise associated words.
8. The method for selecting enterprise associated words according to claim 1, wherein:
the associated word to be selected is a product name, brand name, stock abbreviation or enterprise abbreviation of the enterprise.
9. An electronic device comprising a memory and a processor, wherein the memory stores a computer program operable on the processor, and the processor, when executing the program, implements the steps of the method for selecting enterprise associated words according to any one of claims 1 to 8.
10. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method for selecting enterprise associated words according to any one of claims 1 to 8.
CN202010547677.2A 2020-06-16 2020-06-16 Method and device for selecting enterprise associated words and storage medium Pending CN111737553A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010547677.2A CN111737553A (en) 2020-06-16 2020-06-16 Method and device for selecting enterprise associated words and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010547677.2A CN111737553A (en) 2020-06-16 2020-06-16 Method and device for selecting enterprise associated words and storage medium

Publications (1)

Publication Number Publication Date
CN111737553A (en) 2020-10-02

Family

ID=72649341

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010547677.2A Pending CN111737553A (en) 2020-06-16 2020-06-16 Method and device for selecting enterprise associated words and storage medium

Country Status (1)

Country Link
CN (1) CN111737553A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116777686A (en) * 2023-04-19 2023-09-19 深圳昊通技术有限公司 Enterprise intellectual property classification early warning method, system and storage medium

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101067808A (en) * 2007-05-24 2007-11-07 上海大学 Text key word extracting method
CN102360358A (en) * 2011-09-28 2012-02-22 百度在线网络技术(北京)有限公司 Keyword recommendation method and system
US20120117082A1 (en) * 2010-11-05 2012-05-10 Koperda Frank R Method and system for document classification or search using discrete words
CN103218368A (en) * 2012-01-20 2013-07-24 深圳市腾讯计算机系统有限公司 Method and device for discovering hot words
CN103593350A (en) * 2012-08-14 2014-02-19 阿里巴巴集团控股有限公司 Method and device for recommending promotion keyword price parameters
CN103942189A (en) * 2014-03-19 2014-07-23 百度在线网络技术(北京)有限公司 Method and device for determining keywords of compositions
CN103971678A (en) * 2013-01-29 2014-08-06 腾讯科技(深圳)有限公司 Method and device for detecting keywords
CN105488027A (en) * 2015-11-30 2016-04-13 百度在线网络技术(北京)有限公司 Keyword pushing method and apparatus
CN106708880A (en) * 2015-11-16 2017-05-24 北京国双科技有限公司 Topic associated word obtaining method and apparatus
CN108073568A (en) * 2016-11-10 2018-05-25 腾讯科技(深圳)有限公司 keyword extracting method and device
CN108710664A (en) * 2018-05-14 2018-10-26 平安科技(深圳)有限公司 A kind of hot word analysis method, computer readable storage medium and terminal device
CN109634983A (en) * 2018-12-13 2019-04-16 百度在线网络技术(北京)有限公司 Recall determination method, apparatus, equipment and the medium of interest point information
CN109670176A (en) * 2018-12-19 2019-04-23 武汉瓯越网视有限公司 A kind of keyword abstraction method, device, electronic equipment and storage medium
CN109885753A (en) * 2019-01-16 2019-06-14 苏宁易购集团股份有限公司 A kind of method and device for expanding commercial articles searching and recalling
CN110489757A (en) * 2019-08-26 2019-11-22 北京邮电大学 A kind of keyword extracting method and device
CN110750682A (en) * 2018-07-06 2020-02-04 武汉斗鱼网络科技有限公司 Title hot word automatic metering method, storage medium, electronic equipment and system

Patent Citations (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101067808A (en) * 2007-05-24 2007-11-07 上海大学 Text key word extracting method
US20120117082A1 (en) * 2010-11-05 2012-05-10 Koperda Frank R Method and system for document classification or search using discrete words
US20130144874A1 (en) * 2010-11-05 2013-06-06 Nextgen Datacom, Inc. Method and system for document classification or search using discrete words
CN102360358A (en) * 2011-09-28 2012-02-22 百度在线网络技术(北京)有限公司 Keyword recommendation method and system
CN103218368A (en) * 2012-01-20 2013-07-24 深圳市腾讯计算机系统有限公司 Method and device for discovering hot words
CN103593350A (en) * 2012-08-14 2014-02-19 阿里巴巴集团控股有限公司 Method and device for recommending promotion keyword price parameters
CN103971678A (en) * 2013-01-29 2014-08-06 腾讯科技(深圳)有限公司 Method and device for detecting keywords
CN103942189A (en) * 2014-03-19 2014-07-23 百度在线网络技术(北京)有限公司 Method and device for determining keywords of compositions
CN106708880A (en) * 2015-11-16 2017-05-24 北京国双科技有限公司 Topic associated word obtaining method and apparatus
CN105488027A (en) * 2015-11-30 2016-04-13 百度在线网络技术(北京)有限公司 Keyword pushing method and apparatus
CN108073568A (en) * 2016-11-10 2018-05-25 腾讯科技(深圳)有限公司 keyword extracting method and device
CN108710664A (en) * 2018-05-14 2018-10-26 平安科技(深圳)有限公司 A kind of hot word analysis method, computer readable storage medium and terminal device
CN110750682A (en) * 2018-07-06 2020-02-04 武汉斗鱼网络科技有限公司 Title hot word automatic metering method, storage medium, electronic equipment and system
CN109634983A (en) * 2018-12-13 2019-04-16 百度在线网络技术(北京)有限公司 Recall determination method, apparatus, equipment and the medium of interest point information
CN109670176A (en) * 2018-12-19 2019-04-23 武汉瓯越网视有限公司 A kind of keyword abstraction method, device, electronic equipment and storage medium
CN109885753A (en) * 2019-01-16 2019-06-14 苏宁易购集团股份有限公司 A kind of method and device for expanding commercial articles searching and recalling
CN110489757A (en) * 2019-08-26 2019-11-22 北京邮电大学 A kind of keyword extracting method and device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BRIAN GAWALT: "Discovering word associations in news media via feature selection and sparse classification", 《PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON MULTIMEDIA INFORMATION RETRIEVAL》, pages 211 - 220 *
吴睿: "面向微博文本的热词分析技术研究", 《中国优秀硕士学位论文全文数据库 信息科技》, pages 138 - 590 *

Similar Documents

Publication Publication Date Title
US7895235B2 (en) Extracting semantic relations from query logs
US10489399B2 (en) Query language identification
US9519634B2 (en) Systems and methods for determining lexical associations among words in a corpus
US8983963B2 (en) Techniques for comparing and clustering documents
US10169449B2 (en) Method, apparatus, and server for acquiring recommended topic
US8051088B1 (en) Document analysis
US20080208840A1 (en) Diverse Topic Phrase Extraction
US8316026B2 (en) Method and system for keyword management
EP2228737A2 (en) Improving search effectiveness
CN106886512B (en) Article classification method and device
WO2014028860A2 (en) System and method for matching data using probabilistic modeling techniques
CN108363694B (en) Keyword extraction method and device
CN111091883B (en) Medical text processing method, device, storage medium and equipment
CN104361115A (en) Entry weight definition method and device based on co-clicking
CN113656575B (en) Training data generation method and device, electronic equipment and readable medium
CN111737553A (en) Method and device for selecting enterprise associated words and storage medium
Pojanapunya et al. The influence of the benchmark corpus on keyword analysis
CN111104422B (en) Training method, device, equipment and storage medium of data recommendation model
CN113743090A (en) Keyword extraction method and device
CN114175012A (en) System and method for ranking electronic documents based on query token density
CN117076599A (en) Knowledge graph-based data searching method and device and electronic equipment
JP2008282111A (en) Similar document retrieval method, program and device
CN115328945A (en) Data asset retrieval method, electronic device and computer-readable storage medium
CN112115237B (en) Construction method and device of tobacco science and technology literature data recommendation model
CN116932732A (en) Method, device, electronic equipment and storage medium for determining target keywords

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination