CN110609907A - A random walk-based reasoning method for medical domain knowledge - Google Patents
- Publication number
- CN110609907A
- Authority
- CN
- China
- Prior art keywords
- medical field
- entities
- medical
- knowledge
- named entity
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H70/00—ICT specially adapted for the handling or processing of medical references
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Public Health (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Epidemiology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Medical Informatics (AREA)
- Artificial Intelligence (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention relates to a random walk-based reasoning method for medical domain knowledge. It mainly comprises: (1) a named entity recognition method for the medical field based on contextual character bigrams and information entropy; (2) a method for extracting relations between medical entities based on predicate sentiment classification; (3) a random walk-based reasoning method over a medical knowledge graph. Using these methods, named entities in the medical field are identified and the relations between them are extracted, so that a medical knowledge graph is constructed automatically and reasoning over that graph is realized.
Description
Technical field
The invention relates to the fields of knowledge engineering and machine learning, and in particular to a random walk-based reasoning method for medical domain knowledge.
Background
Knowledge graph technology is one of the key technologies of knowledge engineering and artificial intelligence, and is currently an active research area. Machine learning techniques often suffer from two interpretability problems: the local relationships among features are hard to explain, and so is the global relationship between features and output. Knowledge graph technology, by contrast, represents the relations between knowledge entities as triples, which directly reflects the ontology and the associative logic between entities. Because of this interpretability, it has received growing attention from industry and has become one of the important foundations of artificial intelligence.
Knowledge graph technology mainly covers construction and reasoning. Construction mainly includes named entity recognition and relation extraction; reasoning mainly includes entity relation prediction and knowledge reasoning. Knowledge entities are identified from the facts in text data and the relations between them are extracted; a knowledge graph is built using the triple representation; the graph is completed by mining and predicting relations between entities that may exist; and knowledge rules are extracted and reasoned over on the basis of the relations already known in the graph.
Medicine is a knowledge-intensive field that relies heavily on medical and pharmaceutical background knowledge. Representing this background knowledge as a knowledge graph provides important support for assistive intelligent applications in the medical field. However, named entities, inter-entity relations, and knowledge logic in the medical field have distinct domain characteristics and differ considerably from those of the general domain. Targeted knowledge graph construction and reasoning techniques are therefore needed to support such applications.
Summary of the invention
The purpose of the invention is to solve the problem of automatically constructing and reasoning over medical knowledge graphs.
To this end, the invention proposes a random walk-based reasoning method for medical domain knowledge, which comprises three parts:
(1) a named entity recognition method for the medical field based on contextual character bigrams and information entropy;
(2) a method for extracting relations between medical entities based on predicate sentiment classification;
(3) a random walk-based reasoning method over a medical knowledge graph.
The details are as follows:
Method (1) is used to identify medical named entities, covering concepts such as drugs, diseases, symptoms, populations, and ingredients. Method (2) is used to extract positive and negative relations between medical named entities, such as "applicable" and "contraindicated". The named entities and the relations between them are used to construct a medical knowledge graph automatically, and method (3) is used to reason over it. Together these methods realize automatic construction of a medical knowledge graph and knowledge reasoning in the medical field.
(1) Named entity recognition for the medical field based on contextual character bigrams and information entropy.
Collect a general corpus and a medical corpus, and remove punctuation and stop words from both. From the character contexts in each corpus, build two character transition probability matrices, in which each element is the transition frequency of a character pair in context. Let $\mathrm{Mat}_{medical}$ be the context character transition probability matrix of the medical corpus and $\mathrm{Mat}_{normal}$ that of the general corpus, and let $\{c_i, c_{i+1}\}$ be a pair of consecutive characters in a corpus. Computing the transition probability of $\{c_i, c_{i+1}\}$ in the medical corpus and in the general corpus respectively gives $\mathrm{Mat}_{medical}(c_i, c_{i+1})$ and $\mathrm{Mat}_{normal}(c_i, c_{i+1})$.
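The matrix construction above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function name `transition_probabilities`, the sparse dictionary representation, and the row normalization (count of a pair divided by the count of its first character) are all assumptions, since the text only says each element is a transition frequency.

```python
from collections import Counter

def transition_probabilities(sentences):
    """Estimate P(next_char | char) from consecutive character pairs.

    Assumption: counts are normalized per first character; the source text
    does not spell out the normalization.
    """
    pair_counts = Counter()   # how often the pair (c_i, c_{i+1}) occurs
    first_counts = Counter()  # how often c_i occurs as the first of a pair
    for sentence in sentences:
        for a, b in zip(sentence, sentence[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1
    # Sparse matrix: only pairs that were actually observed are stored.
    return {pair: n / first_counts[pair[0]] for pair, n in pair_counts.items()}

# Toy medical corpus (punctuation and stop words assumed already removed).
medical_corpus = ["胃炎引起胃痛", "胃炎常见"]
mat_medical = transition_probabilities(medical_corpus)
```

The same function would be applied to the general corpus to obtain the second matrix; a dictionary keyed by character pair avoids allocating a dense matrix over the whole character set.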
Based on the context character transition probability matrices of the medical corpus and the general corpus, information entropy is used to compute how strongly each character context is characteristic of the medical field. Because character transition probabilities in the general corpus are relatively stable, a character context whose transition probability in the medical corpus deviates significantly from that in the general corpus is judged to belong to a medical named entity.
The information entropy of the character transition probability is computed according to the following formula. The information entropy $\mathrm{Entropy}(c_i, c_{i+1})$ is used to mark whether $\{c_i, c_{i+1}\}$ belongs to a medical named entity: if $\mathrm{Entropy}(c_i, c_{i+1}) > t$, where $t$ ($t = 1$) is the threshold, then $\{c_i, c_{i+1}\}$ is a character context within the same named entity, and consecutive character contexts are combined to form a medical named entity.
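The entropy formula itself does not appear in this text, so the following is only a sketch under a stated assumption: the score is taken to be a relative-entropy-style term, $p_{med} \log_2 (p_{med} / p_{norm})$, which is large when a character pair is much more likely in the medical corpus than in the general one. The function names, the smoothing constant `eps`, and the toy probabilities are hypothetical; only the threshold $t = 1$ and the merging of consecutive pairs come from the description.

```python
import math

def entropy_score(pair, mat_medical, mat_normal, eps=1e-6):
    """Assumed score: p_med * log2(p_med / p_norm); not the patent's formula."""
    p_med = mat_medical.get(pair, 0.0)
    if p_med == 0.0:
        return 0.0
    p_norm = mat_normal.get(pair, eps)  # eps smooths pairs unseen in the general corpus
    return p_med * math.log2(p_med / p_norm)

def extract_entities(text, mat_medical, mat_normal, t=1.0):
    """Merge runs of consecutive character pairs whose score exceeds t."""
    entities, current = [], ""
    for a, b in zip(text, text[1:]):
        if entropy_score((a, b), mat_medical, mat_normal) > t:
            # Extend the running entity, or start a new one with both characters.
            current = current + b if current else a + b
        else:
            if current:
                entities.append(current)
            current = ""
    if current:
        entities.append(current)
    return entities

# Hypothetical transition probabilities for illustration only.
mat_medical = {("胃", "炎"): 0.9, ("炎", "引"): 0.1, ("引", "起"): 0.5,
               ("起", "胃"): 0.2, ("胃", "痛"): 0.8}
mat_normal = {("胃", "炎"): 0.01, ("炎", "引"): 0.1, ("引", "起"): 0.5,
              ("起", "胃"): 0.2, ("胃", "痛"): 0.02}
entities = extract_entities("胃炎引起胃痛", mat_medical, mat_normal)
```

With these toy numbers, the pairs inside 胃炎 (gastritis) and 胃痛 (stomach pain) score well above the threshold, while the pairs that are equally likely in both corpora score zero and act as entity boundaries.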
(2) Relation extraction between medical entities based on predicate sentiment classification.
The medical corpus is split at punctuation marks into a set of short sentences, and some of these short sentences are labeled with a sentiment: positive, negative, or neutral. A Viterbi-based conditional random field method is used to perform Chinese word segmentation on all labeled short sentences in the medical corpus, and a word embedding method is used to vectorize all words. The word vectors of each short sentence are combined by weighted averaging to obtain the text vector of that sentence, and a support vector machine is trained on the labeled text vectors to obtain a text sentiment classification model. This model is then used to classify the sentiment of all short sentences in the medical corpus, and the short sentences with clearly positive or negative sentiment are extracted.
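The splitting and weighted-averaging steps can be illustrated as follows. This is a sketch under assumptions: the punctuation set, the function names, and the toy vectors and weights are hypothetical, and word segmentation (the CRF step) and SVM training are omitted because they would require external libraries that the text does not name.

```python
import re

def split_short_sentences(text):
    """Split text into short sentences at Chinese or ASCII punctuation."""
    parts = re.split(r"[，。；！？,.;!?…]+", text)
    return [p.strip() for p in parts if p.strip()]

def sentence_vector(words, word_vectors, weights=None):
    """Weighted average of the vectors of the words that have a vector.

    `weights` maps word -> weight (for example a TF-IDF value); words without
    an explicit weight default to 1.0. The weighting scheme is an assumption.
    """
    weights = weights or {}
    known = [w for w in words if w in word_vectors]
    if not known:
        raise ValueError("no word in the sentence has a vector")
    dim = len(word_vectors[known[0]])
    total = sum(weights.get(w, 1.0) for w in known)
    return [
        sum(weights.get(w, 1.0) * word_vectors[w][d] for w in known) / total
        for d in range(dim)
    ]

short_sentences = split_short_sentences("四环素孕妇禁用。阿司匹林适用于发热，饭后服用")
toy_vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0]}   # hypothetical 2-d embeddings
vec = sentence_vector(["a", "b", "unknown"], toy_vectors, {"a": 3.0, "b": 1.0})
```

The resulting sentence vectors, paired with the positive, negative, or neutral labels, would form the training set for the support vector machine described above.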
Chinese word segmentation is applied to the short sentences with positive or negative sentiment, the words are tagged with their parts of speech, and the predicates (verbs) are extracted. If a short sentence contains two or more medical named entities, and these entities lie on opposite sides of the predicate or the predicate is the first or last word of the sentence, the entities on either side of the predicate are extracted and a relation is established between them. The relation is classified as positive or negative according to the positive or negative sentiment of the short sentence.
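The extraction rule can be sketched as a small function over part-of-speech-tagged tokens. The input format (a list of `(word, pos)` pairs with verb tags starting with `"v"`), the function name, and the handling of a sentence-initial or sentence-final predicate (relating the first entity to each of the others) are assumptions; the text does not specify how entities are paired in that case.

```python
def extract_relations(tagged, entities, sentiment):
    """Return (head, tail, sign) triples from one tagged short sentence.

    tagged    : list of (word, pos) pairs; pos tags starting with "v" are verbs
    entities  : set of known medical named entities
    sentiment : "positive" or "negative" label of the short sentence
    """
    sign = 1 if sentiment == "positive" else -1
    ents = [(i, w) for i, (w, _) in enumerate(tagged) if w in entities]
    if len(ents) < 2:            # the rule needs at least two entities
        return []
    triples = []
    last = len(tagged) - 1
    for k, (_, pos) in enumerate(tagged):
        if not pos.startswith("v"):
            continue
        left = [w for i, w in ents if i < k]
        right = [w for i, w in ents if i > k]
        if left and right:
            # Entities on opposite sides of the predicate.
            triples += [(l, r, sign) for l in left for r in right]
        elif k in (0, last):
            # Predicate is the first or last word: assumed pairing.
            head, *rest = [w for _, w in ents]
            triples += [(head, r, sign) for r in rest]
    return triples

tagged = [("tetracycline", "n"), ("contraindicated", "v"), ("pregnant women", "n")]
relations = extract_relations(tagged, {"tetracycline", "pregnant women"}, "negative")
```

Here the negative sentiment of the sentence yields a negative edge between tetracycline and pregnant women, matching the contraindication example used later in the description.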
(3) Random walk-based reasoning over the medical knowledge graph.
Using the medical named entity recognition method and the relation extraction method above, a medical knowledge graph $KG(V, E, P)$ is constructed with the triple representation, where $V$ denotes the vertices of the graph (the medical entities), $E$ denotes the edges between pairs of vertices (the relations between pairs of entities), and $P$ denotes the positive or negative attribute of each edge.
As shown in Figure 1 (relations between knowledge graph concepts), the concepts of medical entities include diseases, symptoms, drugs, populations, departments, and body parts. The vertices of the knowledge graph accordingly include disease entities such as cold and gastritis; symptom entities such as cough and stomach pain; drug entities such as aspirin and cephalosporin; population entities such as infants and pregnant women; and body part entities such as head and chest. An edge represents the relation between two entities; for example, cold-cough and gastritis-stomach pain indicate that cold and gastritis cause cough and stomach pain respectively. Relations between medical entities are either positive or negative. For example, the edge cold-cough is positive because a cold causes a cough, whereas the edge tetracycline-pregnant women is negative because tetracycline is contraindicated for pregnant women. Positive relations include "applicable to" and "causes"; negative relations include "use with caution" and "contraindicated". For example, given the short sentence "Gastritis is inflammation of the gastric mucosa... it usually presents as upper abdominal pain, nausea, vomiting... complications include bleeding, gastric ulcer...", the disease vertex extracted from it is {gastritis}, the symptom vertices extracted are {upper abdominal pain, nausea, vomiting... bleeding, gastric ulcer...}, and the relations and weights are {(gastritis, upper abdominal pain, 1.0), (gastritis, nausea, 1.0), (gastritis, vomiting, 1.0)... (gastritis, bleeding, 1.0), (gastritis, gastric ulcer, 1.0)...}.
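The triples in this example map naturally onto a signed adjacency structure. The sketch below is illustrative only: the function name `build_graph` and the nested-dictionary layout are assumptions, and the third element of each triple is treated as the signed edge attribute (positive for "causes" or "applicable", negative for "contraindicated").

```python
from collections import defaultdict

def build_graph(triples):
    """Build {head: {tail: signed_attribute}} from (head, tail, p) triples."""
    out_edges = defaultdict(dict)
    for head, tail, p in triples:
        out_edges[head][tail] = p
        out_edges.setdefault(tail, {})  # make sure tail-only vertices exist
    return dict(out_edges)

triples = [
    ("gastritis", "upper abdominal pain", 1.0),
    ("gastritis", "nausea", 1.0),
    ("gastritis", "vomiting", 1.0),
    ("gastritis", "bleeding", 1.0),
    ("gastritis", "gastric ulcer", 1.0),
    ("tetracycline", "pregnant women", -1.0),  # negative (contraindication) edge
]
graph = build_graph(triples)
```

Storing every vertex as a key, including those with no outgoing edges, keeps later traversal code free of missing-key checks.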
Knowledge reasoning is performed on the medical knowledge graph using the random walk method. The reasoning process is converted into a traversal: starting from a limited set of clues (several entities), the graph is searched iteratively for reasoning results. Let $V = \{v_1, v_2, \ldots, v_n\}$ be a set of entities; the candidate results that can be inferred from this set are derived according to the following formula.
Here, $\mathrm{score}(v_i)$ is the score of entity $v_i$, $\mathrm{In}(v_i)$ is the in-degree of $v_i$, $\mathrm{Out}(v_i)$ is the out-degree of $v_i$, $p_{j,i}$ is the attribute value of the edge between $v_i$ and $v_j$ (1 for positive, -1 for negative), and $\alpha$ ($\alpha = 0.85$) is an empirical parameter. For inference, the known set of entities is initialized on the knowledge graph: the scores of the corresponding vertices are initialized to 1, and the scores of all other vertices to 0. The scores of all vertices are then computed by random walk iteration and sorted. The entities corresponding to the higher-scoring vertices are the candidate results that can be inferred from the known entities, and they are finally filtered according to the actual situation.
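The scoring formula is not reproduced in this text. Based on the symbols defined above, one plausible reading is a PageRank-style update with signed edges; the sketch below assumes

$$\mathrm{score}(v_i) = (1-\alpha)\, s_i + \alpha \sum_{v_j \to v_i} \frac{p_{j,i}}{|\mathrm{Out}(v_j)|}\, \mathrm{score}(v_j),$$

where $s_i$ is 1 for the known (seed) entities and 0 otherwise. The seed-dependent restart term $s_i$, the fixed iteration count, and all function names are assumptions, chosen so that the result depends on the starting entities; the patent's actual formula may differ.

```python
def random_walk_scores(graph, seeds, alpha=0.85, iterations=50):
    """Iterate the assumed signed, seed-restarted PageRank-style update.

    graph : {vertex: {neighbor: p}} with p = +1 (positive) or -1 (negative)
    seeds : set of known entities, initialized to score 1 (others 0)
    """
    nodes = list(graph)
    seed = {v: 1.0 if v in seeds else 0.0 for v in nodes}
    score = dict(seed)
    for _ in range(iterations):
        new = {v: (1 - alpha) * seed[v] for v in nodes}
        for j in nodes:
            out = graph[j]
            if not out:
                continue
            share = alpha * score[j] / len(out)  # split score over out-edges
            for i, p in out.items():
                new[i] += p * share              # negative edges subtract
        score = new
    return score

def rank_candidates(graph, seeds):
    """Sort non-seed vertices by score, highest first."""
    scores = random_walk_scores(graph, seeds)
    candidates = [v for v in graph if v not in seeds]
    return sorted(candidates, key=lambda v: scores[v], reverse=True)

graph = {
    "gastritis": {"nausea": 1, "vomiting": 1},
    "tetracycline": {"pregnant women": -1},
    "nausea": {}, "vomiting": {}, "pregnant women": {},
}
scores = random_walk_scores(graph, {"gastritis"})
```

Under this assumed update, vertices reached through positive edges from the seeds receive positive scores, while a vertex reached through a negative edge (such as pregnant women when tetracycline is the seed) receives a negative score and falls to the bottom of the ranking.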
Description of drawings
Figure 1 is a diagram of the relations between knowledge graph concepts.
Detailed description
The steps of the invention are as follows:
Step 1: Collect a general corpus and a medical corpus, and remove punctuation and stop words from both.
Step 2: From the character contexts $\{c_i, c_{i+1}\}$ in the medical corpus and the general corpus, build the character transition probability matrices $\mathrm{Mat}_{medical}(c_i, c_{i+1})$ and $\mathrm{Mat}_{normal}(c_i, c_{i+1})$ respectively.
Step 3: Based on the context character transition probability matrices of the two corpora, use information entropy to compute how strongly each character context is characteristic of the medical field.
Step 4: If the information entropy $\mathrm{Entropy}(c_i, c_{i+1}) > t$, where $t$ ($t = 1$) is the threshold, then $\{c_i, c_{i+1}\}$ is a character context within the same named entity; combine consecutive character contexts to form medical named entities.
Step 5: Split the medical corpus at punctuation marks into a set of short sentences, and label some of them with a sentiment: positive, negative, or neutral.
Step 6: Apply Chinese word segmentation and a word embedding method to vectorize all words. Combine the word vectors of each short sentence by weighted averaging to obtain its text vector, and train a support vector machine on the labeled text vectors to obtain a text sentiment classification model. Use this model to classify the sentiment of the short sentences in the medical corpus.
Step 7: Use part-of-speech tagging to extract the predicates in each sentence. If the sentence contains two or more medical named entities, and these entities lie on opposite sides of the predicate or the predicate is the first or last word, extract the entities on either side of the predicate and establish a relation between them. Classify the relation as positive or negative according to the positive or negative sentiment of the short sentence.
Step 8: From the recognized medical named entities and the relations between them, construct the medical knowledge graph using the triple representation.
Step 9: Perform knowledge reasoning on the medical knowledge graph using the random walk method. The reasoning process is converted into a traversal: starting from a limited set of clues (several entities), the graph is searched iteratively for candidate reasoning results.
Claims (4)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910876121.5A CN110609907A (en) | 2019-09-17 | 2019-09-17 | A random walk-based reasoning method for medical domain knowledge |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910876121.5A CN110609907A (en) | 2019-09-17 | 2019-09-17 | A random walk-based reasoning method for medical domain knowledge |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110609907A true CN110609907A (en) | 2019-12-24 |
Family
ID=68891506
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910876121.5A Pending CN110609907A (en) | 2019-09-17 | 2019-09-17 | A random walk-based reasoning method for medical domain knowledge |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110609907A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109190113A (en) * | 2018-08-10 | 2019-01-11 | 北京科技大学 | A kind of knowledge mapping construction method of theory of traditional Chinese medical science ancient books and records |
CN109325131A (en) * | 2018-09-27 | 2019-02-12 | 大连理工大学 | A Drug Recognition Method Based on Biomedical Knowledge Graph Reasoning |
CN109783698A (en) * | 2019-01-15 | 2019-05-21 | 辽宁大学 | Industrial production data entity recognition method based on Merkle-tree |
US20190171656A1 (en) * | 2017-05-10 | 2019-06-06 | Boe Technology Group Co., Ltd. | Traditional chinese medicine knowledge graph and establishment method therefor, and computer system |
CN110119451A (en) * | 2019-05-08 | 2019-08-13 | 北京颢云信息科技股份有限公司 | A kind of knowledge mapping construction method based on relation inference |
-
2019
- 2019-09-17 CN CN201910876121.5A patent/CN110609907A/en active Pending
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112463895A (en) * | 2020-12-01 | 2021-03-09 | 零氪科技(北京)有限公司 | Method and device for automatically discovering medicine components based on medicine name mining |
CN112463895B (en) * | 2020-12-01 | 2024-06-11 | 零氪科技(北京)有限公司 | Method and device for automatically discovering medicine components based on medicine name mining |
CN112967820A (en) * | 2021-04-12 | 2021-06-15 | 平安科技(深圳)有限公司 | Medicine property cognitive information extraction method, device, equipment and storage medium |
CN112967820B (en) * | 2021-04-12 | 2023-09-19 | 平安科技(深圳)有限公司 | Drug-nature cognition information extraction method, device, equipment and storage medium |
CN116187868A (en) * | 2023-04-27 | 2023-05-30 | 深圳市迪博企业风险管理技术有限公司 | Knowledge graph-based industrial chain development quality evaluation method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20191224 |