CN114639483B - Electronic medical record retrieval method and device based on graph neural network
- Publication number
- CN114639483B, CN202210291079.2A
- Authority
- CN
- China
- Prior art keywords
- medical
- patient
- entity
- vector representation
- entities
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses an electronic medical record retrieval method based on a graph neural network, comprising the following steps: obtaining a co-occurrence matrix of the medical entities in the electronic medical records; adding co-occurrence information between the medical entities and their ancestor medical entities to that matrix to obtain an enhanced medical entity co-occurrence matrix; extracting a vector representation of each medical entity with a GloVe model and deriving a vector representation of each patient; constructing an electronic medical record heterogeneous graph comprising medical entity nodes, patient nodes, the real link relationships between medical entities, and the real link relationships between patients and medical entities; inputting the heterogeneous graph into a graph neural network to obtain the patient node output vector representations and the medical entity node output vector representations, and from them the link relationship probability between a patient and a medical entity and the link relationship probability between medical entities; and training the graph neural network with the total loss function and updating its parameters to obtain the final graph neural network. The method prepares the ground for predicting the probability of association between a patient and a medical entity.
Description
Technical Field
The invention relates to the technical field of medical information data processing, and in particular to an electronic medical record retrieval method and device based on a graph neural network.
Background
Medical practice depends on a large amount of data: patient information must be acquired continuously for analysis and decision making. Electronic medical records are currently one of the main sources of this information. Using their rich content to support medical activities such as clinical decision support, clinical research and clinical trials is of great importance, and such work requires that electronic medical record data can be queried effectively. When medical personnel carry out query tasks without the support of information technicians, they must formulate queries relying only on their own knowledge. This makes the query process challenging: a large amount of browsing and exploration is needed to find the target information, which greatly reduces work efficiency and increases the workload of medical professionals. An automated approach is therefore needed to reduce the time cost for clinical staff.
Semantic association can effectively improve query performance in this process. In current real-world query tasks, the medical entities in the electronic medical record are associated with one another through medical ontology knowledge, and the target query entities are expanded in the query through these relationships. However, this approach depends heavily on a general medical ontology and tends to ignore the association information present in the electronic medical record itself. In addition, entities that appear in the electronic medical record but not in the medical ontology cannot be expanded, which limits the range of application of the approach.
The electronic medical record contains rich association information that can help to optimize the query task. Based on this idea, link relationships need to be established between electronic medical record data using different kinds of information, and the query task is then improved through the association relationships between the data. With the development of machine learning, and in particular with the practical effectiveness of deep learning confirmed in many fields, modeling electronic medical records with a neural network is an effective means. A graph neural network can represent complex topological structures, and an electronic medical record can be regarded as a complex heterogeneous graph structure, so a graph neural network can effectively represent the structure of the electronic medical record.
The graph neural network is a neural network structure developed from convolutional neural networks and graph representation learning. Unlike the data types handled by earlier neural networks, it can extract and represent the features of graph-structured data. It is an efficient and easily extended structure and has proven powerful for learning from graph data. In contrast to conventional deep learning methods, it can reflect entities and their associations through a constructed graph model. The graph neural network first initializes a description of each node, then obtains node states that incorporate neighbor node information and network topology through repeated node state updates, and finally outputs the nodes through a specific method to obtain results that can be used in subsequent tasks. This method is therefore well suited to modeling heterogeneous electronic medical records.
Because of its specialization and complexity, the medical field possesses a large body of medical ontology knowledge, such as ICD and SNOMED-CT. This knowledge can be used to establish relationships between different medical entities and to add association information that does not exist in the electronic medical record, thereby enriching the topological structure information in the network.
Disclosure of Invention
The invention discloses an electronic medical record retrieval method based on a graph neural network, which expands the range of relationships between medical entities and between medical entities and patients, so as to prepare the ground for predicting the association probability between patients and medical entities.
An electronic medical record retrieval method based on a graph neural network comprises the following steps:
(1) Obtaining a co-occurrence matrix of the medical entities in the electronic medical records, traversing the medical ontology ICD codes to obtain a plurality of ancestor medical entities corresponding to each medical entity, adding the co-occurrence information of the medical entities and the ancestor medical entities to the medical entity co-occurrence matrix to obtain an enhanced medical entity co-occurrence matrix, extracting a vector representation of each medical entity with a GloVe model based on the enhanced medical entity co-occurrence matrix, and taking an aggregate result of the plurality of medical entity vector representations associated with a patient as the patient vector representation;
(2) Constructing an electronic medical record heterogeneous graph, wherein the heterogeneous graph comprises medical entity nodes, patient nodes, the real link relationships between medical entities, and the real link relationships between patients and medical entities;
Each medical entity vector representation is used as the initial attribute of the corresponding medical entity node, and each patient vector representation is used as the initial attribute of the corresponding patient node; related medical entities are connected to obtain the real link relationships between medical entities, and each patient is connected with the associated medical entities to obtain the real link relationships between patients and medical entities;
(3) Inputting the electronic medical record heterogeneous graph into the GraphSAGE graph neural network to obtain the patient node output vector representation and the medical entity node output vector representation, respectively; based on the patient node output vector representation and the medical entity node output vector representation, obtaining the link relationship probability between the patient and the medical entity with an activation function; based on the medical entity node output vector representations, obtaining the link relationship probability between medical entities with an activation function;
(4) Constructing a total loss function, wherein the total loss function comprises a first loss function, a second loss function and a multi-task weighted loss function;
Constructing the first loss function from the cross entropy between the real link relationship between the patient and the medical entity and the link relationship probability between the patient and the medical entity;
Constructing the second loss function from the cross entropy between the real link relationship between medical entities and the link relationship probability between medical entities;
Constructing the multi-task weighted loss function from the loss value of the first loss function and the loss value of the second loss function;
(5) Training the GraphSAGE graph neural network with the total loss function and updating its parameters to obtain the final GraphSAGE graph neural network;
(6) In application, inputting the medical entity vector representation and the patient vector representation into the final GraphSAGE graph neural network to predict the association probability between the medical entity and the patient.
Obtaining the co-occurrence matrix of the medical entities in the electronic medical records comprises:
for each patient and each visit record, taking the product of the occurrence counts of two medical entities in that record as the co-occurrence information of those two medical entities; constructing a co-occurrence matrix for each visit record from the co-occurrence information of every two medical entities; adding the co-occurrence matrices of each patient's visit records to obtain the co-occurrence matrix of that patient's electronic medical record; and adding the co-occurrence matrices of the plurality of patients' electronic medical records to obtain the co-occurrence matrix of the medical entities in the electronic medical records.
Traversing the medical ontology ICD codes to obtain a plurality of ancestor medical entities corresponding to each medical entity comprises:
taking each medical entity as a leaf node, traversing the medical ontology ICD codes from bottom to top to obtain the ancestor nodes corresponding to the leaf node, extracting the medical entities corresponding to the ancestor nodes to obtain the ancestor medical entities, obtaining the co-occurrence information of each medical entity and its ancestor medical entities in the medical ontology ICD codes, and adding this co-occurrence information to the medical entity co-occurrence matrix to expand it.
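The bottom-up traversal and the expansion of the co-occurrence matrix can be sketched as follows. This is a minimal illustration, not the patented implementation: the `PARENT` map is a hypothetical stand-in for the ICD hierarchy, and the rule that each occurrence of an entity counts once toward every one of its ancestors is an assumption, since the text does not state how the entity-ancestor co-occurrence value is computed.

```python
from collections import Counter

# Hypothetical child -> parent map standing in for part of an ICD hierarchy.
PARENT = {
    "I10": "I10-I15",
    "I10-I15": "IX",
    "E11": "E10-E14",
    "E10-E14": "IV",
}

def ancestors(code, parent=PARENT):
    """Walk from a leaf code up to the root, collecting every ancestor."""
    chain = []
    while code in parent:
        code = parent[code]
        chain.append(code)
    return chain

def augment(cooc, entity_counts, parent=PARENT):
    """Return a copy of the co-occurrence matrix with entity-ancestor pairs added.

    Assumption: an entity seen `count` times co-occurs `count` times with
    each of its ancestors. Pairs are stored under a sorted key so the
    matrix stays symmetric.
    """
    enhanced = Counter(cooc)
    for entity, count in entity_counts.items():
        for anc in ancestors(entity, parent):
            enhanced[tuple(sorted((entity, anc)))] += count
    return dict(enhanced)

base = {("E11", "I10"): 3}
enhanced = augment(base, {"I10": 4, "E11": 2})
```

Storing the matrix sparsely as a dictionary keeps the original entity-entity entries untouched while the ancestor pairs are appended alongside them.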
Extracting each medical entity vector representation with the GloVe model based on the enhanced medical entity co-occurrence matrix comprises:
setting an initial vector representation of each medical entity, inputting the initial vector representations into the GloVe model, and obtaining the vector representation of each medical entity by training an objective function, wherein the objective function J is as follows:
where M_ij is the co-occurrence value of the i-th entity and the j-th entity in the enhanced medical entity co-occurrence matrix, |D| is the number of medical entities, e_j is the vector representation of the j-th medical entity, e_i is the vector representation of the i-th medical entity, b_i is the bias parameter of the i-th medical entity, and b_j is the bias parameter of the j-th medical entity.
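The quantities just defined match the standard GloVe weighted least-squares objective, J = Σ_{i,j} f(M_ij)(e_i^T e_j + b_i + b_j − log M_ij)^2. The sketch below evaluates that standard form under stated assumptions: the weighting function f and its hyperparameters `x_max` and `alpha` come from the original GloVe formulation and are not named in this text, and the sum is restricted to nonzero entries so that log M_ij is defined.

```python
import math

def glove_loss(M, e, b, x_max=100.0, alpha=0.75):
    """Standard GloVe loss over the nonzero entries of a sparse matrix M.

    M maps (i, j) to a co-occurrence value, e maps an entity index to its
    vector, and b maps an entity index to its bias. The weighting
    f(x) = min(1, (x / x_max) ** alpha) is an assumption taken from the
    original GloVe formulation.
    """
    loss = 0.0
    for (i, j), m_ij in M.items():
        if m_ij <= 0:
            continue  # log(0) is undefined, so empty cells are skipped
        weight = min(1.0, (m_ij / x_max) ** alpha)
        dot = sum(x * y for x, y in zip(e[i], e[j]))
        loss += weight * (dot + b[i] + b[j] - math.log(m_ij)) ** 2
    return loss

vectors = {0: [1.0, 0.0], 1: [0.0, 1.0]}
biases = {0: 0.0, 1: 0.0}
loss_value = glove_loss({(0, 1): 100.0}, vectors, biases)
```

In training, e and b would be updated by gradient descent to reduce this value; the sketch only shows how a single evaluation of the objective is assembled.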
The aggregate result of the plurality of medical entity vector representations associated with a patient is used as the patient vector representation, the aggregate result being a sum, an average, a maximum, or a minimum.
Inputting the electronic medical record heterogeneous graph into the GraphSAGE graph neural network to obtain the patient node output vector representation and the medical entity node output vector representation, respectively, comprises:
performing Mean aggregator aggregation on the current-layer neighbor node vector representation and the previous-layer output vector representation of the patient node to obtain the patient node output vector representation, and performing Mean aggregator aggregation on the current-layer neighbor node vector representation and the previous-layer output vector representation of the medical entity node to obtain the medical entity node output vector representation;
wherein the current-layer neighbor node vector representation $h_{N(v)}^{(l)}$ is:
$$h_{N(v)}^{(l)} = \sum_{r \in R} \mathrm{AGGREGATE}\left(\left\{ h_u^{(l-1)},\ \forall u \in N^{(r)}(v) \right\}\right)$$
where r is a real link relationship between medical entities or between a patient and a medical entity, R is the set of real link relationships, u is a neighbor node, v is the current node, $N^{(r)}(v)$ is the set of neighbor nodes of the current node v under the real link relationship r, $h_u^{(l-1)}$ is the neighbor node vector representation of the previous layer, l is the current layer, and AGGREGATE(·) is the aggregation operation that merges the neighbor information of the current node v;
the medical entity node output vector representation $h_d^{(l)}$ is:
$$h_d^{(l)} = \sigma\left(W_d \cdot \mathrm{MEAN}\left(\left\{h_d^{(l-1)}\right\} \cup \left\{h_{N(d)}^{(l)}\right\}\right)\right)$$
where $h_d^{(l-1)}$ is the vector representation of the medical entity node d in the previous layer, $W_d$ is the weight parameter of the medical entity node d, MEAN(·) is the averaging function, and σ(·) is the activation function;
the patient node output vector representation $h_p^{(l)}$ is:
$$h_p^{(l)} = \sigma\left(W_p \cdot \mathrm{MEAN}\left(\left\{h_p^{(l-1)}\right\} \cup \left\{h_{N(p)}^{(l)}\right\}\right)\right)$$
where $h_p^{(l-1)}$ is the vector representation of the patient node p in the previous layer, and $W_p$ is the weight parameter of the patient node p.
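One layer of the aggregation described above can be sketched in plain Python. This is an illustrative reading rather than the patented implementation: nodes are hypothetical `(type, id)` tuples, the per-relation AGGREGATE is assumed to be a mean, ReLU is assumed as the activation, and the weight matrices are toy identity matrices.

```python
def mean_vectors(vectors):
    """Element-wise mean of a list of equal-length vectors."""
    return [sum(col) / len(col) for col in zip(*vectors)]

def relu(x):
    return max(0.0, x)

def sage_layer(h_prev, neighbors_by_relation, weights, activation=relu):
    """One layer: sum per-relation neighbor aggregates, then MEAN with self.

    h_prev: {(node_type, node_id): vector} from the previous layer.
    neighbors_by_relation: {relation: {node: [neighbor nodes]}}.
    weights: {node_type: matrix as a list of rows}, one matrix per node
    type, mirroring separate parameters for entity and patient nodes.
    """
    h_next = {}
    for v, h_v in h_prev.items():
        h_nbr = [0.0] * len(h_v)
        for adjacency in neighbors_by_relation.values():
            nbrs = adjacency.get(v, [])
            if nbrs:
                agg = mean_vectors([h_prev[u] for u in nbrs])
                h_nbr = [a + b for a, b in zip(h_nbr, agg)]
        combined = mean_vectors([h_v, h_nbr])
        W = weights[v[0]]
        h_next[v] = [activation(sum(w * x for w, x in zip(row, combined)))
                     for row in W]
    return h_next

P1, D1, D2 = ("patient", "p1"), ("entity", "d1"), ("entity", "d2")
h0 = {P1: [1.0, 0.0], D1: [0.0, 1.0], D2: [2.0, 1.0]}
relations = {
    "patient-entity": {P1: [D1], D1: [P1]},
    "entity-entity": {D1: [D2], D2: [D1]},
}
identity = [[1.0, 0.0], [0.0, 1.0]]
h1 = sage_layer(h0, relations, {"patient": identity, "entity": identity})
```

Keeping one weight matrix per node type lets patient and medical entity nodes share the aggregation logic while being transformed by different parameters.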
The multi-task weighted loss function L is:
where the formula involves, for the m-th loss function, an index of its weight factor and its loss value.
An electronic medical record prediction device based on a graph neural network comprises a computer memory, a computer processor, and a computer program stored in the computer memory and executable on the computer processor, wherein the final GraphSAGE graph neural network model is stored in the computer memory;
the computer processor, when executing the computer program, performs the following step:
inputting the medical entity vector representation and the patient vector representation into the final GraphSAGE graph neural network to predict the association probability between the medical entity and the patient.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention expands the medical entity co-occurrence matrix of the electronic medical records by introducing the co-occurrence information between the medical entities and their ancestors in the ICD codes. This enriches the relationships between medical entities and between medical entities and patients, so that the trained graph neural network obtains the relevance between patients and medical entities more accurately.
(2) The invention establishes the link relationships between medical entities through the heterogeneous graph and learns the link relationships between medical entities and patients through training with the multi-task weighted loss function, so that the relevance between medical entities and patients can be determined accurately.
Drawings
Fig. 1 is a flowchart of an electronic medical record retrieval method based on a graph neural network according to an embodiment of the present invention.
Fig. 2 is a flowchart of optimizing the graph neural network model with the multi-task weighted loss function according to an embodiment of the present invention.
Detailed Description
The implementation of the knowledge-fused electronic medical record link prediction method based on a graph neural network is described clearly and completely below with reference to the accompanying drawings.
An electronic medical record prediction method based on a graph neural network, as shown in Fig. 1, comprises the following steps:
S1: For each patient and each visit record, the product of the occurrence counts of two medical entities in that record is taken as the co-occurrence information of those two medical entities; a co-occurrence matrix is constructed for each visit record from the co-occurrence information of every two medical entities; the co-occurrence matrices of each patient's visit records are added to obtain the co-occurrence matrix of that patient's electronic medical record; and the co-occurrence matrices of the plurality of patients' electronic medical records are added to obtain the co-occurrence matrix of the medical entities in the electronic medical records.
The co-occurrence information co-occurrence(c_i, c_j, p) of every two medical entities is:
co-occurrence(c_i, c_j, p) = count(c_i, p) × count(c_j, p)
where count(c_i, p) is the number of occurrences of the i-th medical entity of the p-th patient in each visit record, and count(c_j, p) is the number of occurrences of the j-th medical entity of the p-th patient in each visit record.
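The count-product rule and the two summation stages (over visits, then over patients) can be sketched as follows. The patient identifiers and ICD-style codes are made-up sample data, and storing each unordered pair once under a sorted key, without diagonal entries, is an assumption made for brevity.

```python
from collections import Counter
from itertools import combinations

def visit_cooccurrence(visit_entities):
    """Co-occurrence for one visit: count(c_i, p) * count(c_j, p) per pair."""
    counts = Counter(visit_entities)
    matrix = {}
    for ci, cj in combinations(sorted(counts), 2):
        matrix[(ci, cj)] = counts[ci] * counts[cj]
    return matrix

def corpus_cooccurrence(patients):
    """Sum the per-visit matrices over every visit of every patient."""
    total = Counter()
    for visits in patients.values():
        for visit in visits:
            total.update(visit_cooccurrence(visit))
    return dict(total)

# Made-up records: patient id -> list of visits, each a list of entity codes.
patients = {
    "p1": [["I10", "E11", "I10"], ["E11", "N18"]],
    "p2": [["I10", "E11"]],
}
M = corpus_cooccurrence(patients)
```

In the first visit of `p1`, `I10` appears twice and `E11` once, so that visit contributes 2 × 1 = 2 to the `("E11", "I10")` entry; the single co-occurrence from `p2` brings the total to 3.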
After the co-occurrence matrix of the medical entities in the electronic medical records is obtained, each medical entity is taken as a leaf node, and the medical ontology ICD codes are traversed from bottom to top to obtain the ancestor nodes corresponding to the leaf node. The medical entities corresponding to the ancestor nodes are extracted to obtain the ancestor medical entities, the co-occurrence information of each medical entity and its ancestor medical entities in the medical ontology ICD codes is obtained, and this co-occurrence information is added to the medical entity co-occurrence matrix, expanding it into the enhanced medical entity co-occurrence matrix. An initial vector representation is then set for each medical entity and input into the GloVe model, and the vector representation of each medical entity is obtained by training an objective function, wherein the objective function J is as follows:
where M_ij is the co-occurrence value of the i-th entity and the j-th entity in the enhanced medical entity co-occurrence matrix, |D| is the number of medical entities, e_j is the vector representation of the j-th medical entity, e_i is the vector representation of the i-th medical entity, b_i is the bias parameter of the i-th medical entity, and b_j is the bias parameter of the j-th medical entity.
The aggregate result of the plurality of medical entity vector representations associated with a patient is used as the patient vector representation, the aggregate result being a sum, an average, a maximum, or a minimum.
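The four aggregation options for the patient vector can be sketched as an element-wise reduction over the patient's medical entity vectors. The function name and the `how` parameter are hypothetical; only the four reduction choices come from the text.

```python
def aggregate_patient(entity_vectors, how="mean"):
    """Reduce a patient's entity vectors element-wise into one vector."""
    columns = list(zip(*entity_vectors))
    if how == "sum":
        return [sum(c) for c in columns]
    if how == "mean":
        return [sum(c) / len(c) for c in columns]
    if how == "max":
        return [max(c) for c in columns]
    if how == "min":
        return [min(c) for c in columns]
    raise ValueError(f"unknown aggregation: {how}")

# Two made-up entity vectors associated with one patient.
patient_entities = [[1.0, 4.0], [3.0, 2.0]]
patient_vector = aggregate_patient(patient_entities, how="mean")
```

Because every option reduces each dimension independently, the patient vector has the same dimensionality as the entity vectors, which lets both node types be fed to the same graph neural network.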
S2: Each medical entity vector representation is used as the initial input of the corresponding medical entity node, and each patient vector representation is used as the initial input of the corresponding patient node; related medical entities are connected to obtain the real link relationships between medical entities, and each patient is connected with the associated medical entities to obtain the real link relationships between patients and medical entities, thereby constructing the electronic medical record heterogeneous graph.
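A heterogeneous graph of this kind can be held in a small container such as the one below. This is a minimal sketch under assumptions: the class name `HeteroGraph`, the relation names, and the `(node_type, node_id)` tuple keys are all hypothetical, and edges are stored in both directions so that neighbors can be looked up from either end.

```python
from dataclasses import dataclass, field

@dataclass
class HeteroGraph:
    """Typed nodes with initial vectors, plus edges grouped by relation."""
    features: dict = field(default_factory=dict)  # (node_type, id) -> vector
    edges: dict = field(default_factory=dict)     # relation -> {(src, dst)}

    def add_node(self, node_type, node_id, vector):
        self.features[(node_type, node_id)] = list(vector)

    def add_edge(self, relation, src, dst):
        # Store both directions so either endpoint can find the other.
        self.edges.setdefault(relation, set()).update({(src, dst), (dst, src)})

    def neighbors(self, relation, node):
        return sorted(d for s, d in self.edges.get(relation, ()) if s == node)

graph = HeteroGraph()
graph.add_node("patient", "p1", [0.5, 0.5])
graph.add_node("entity", "I10", [1.0, 0.0])
graph.add_node("entity", "E11", [0.0, 1.0])
graph.add_edge("patient-entity", ("patient", "p1"), ("entity", "I10"))
graph.add_edge("entity-entity", ("entity", "I10"), ("entity", "E11"))
```

Grouping edges by relation keeps the patient-entity links and the entity-entity links separate, which is what allows a per-relation aggregation in the subsequent step.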
S3: The electronic medical record heterogeneous graph is input into the GraphSAGE graph neural network. Using the real link relationships between medical entities and the real link relationships between patients and medical entities, the current-layer neighbor node vector representation $h_{N(v)}^{(l)}$ is obtained by summing the results of the GraphSAGE aggregator over the relationships:
$$h_{N(v)}^{(l)} = \sum_{r \in R} \mathrm{AGGREGATE}\left(\left\{ h_u^{(l-1)},\ \forall u \in N^{(r)}(v) \right\}\right)$$
where r is a real link relationship between medical entities or between a patient and a medical entity, R is the set of real link relationships, u is a neighbor node, v is the current node, $N^{(r)}(v)$ is the set of neighbor nodes of the current node v under the real link relationship r, $h_u^{(l-1)}$ is the neighbor node vector representation of the previous layer, l is the current layer, and AGGREGATE(·) is the aggregation operation that merges the neighbor information of the current node v.
Mean aggregator aggregation is performed on the current-layer neighbor node vector representation and the previous-layer output vector representation of the patient node to obtain the patient node output vector representation, and on the current-layer neighbor node vector representation and the previous-layer output vector representation of the medical entity node to obtain the medical entity node output vector representation;
the medical entity node output vector representation $h_d^{(l)}$ is:
$$h_d^{(l)} = \sigma\left(W_d \cdot \mathrm{MEAN}\left(\left\{h_d^{(l-1)}\right\} \cup \left\{h_{N(d)}^{(l)}\right\}\right)\right)$$
where $h_d^{(l-1)}$ is the vector representation of the medical entity node d in the previous layer, $W_d$ is the weight parameter of the medical entity node d, MEAN(·) is the averaging function, and σ(·) is the activation function;
the patient node output vector representation $h_p^{(l)}$ is:
$$h_p^{(l)} = \sigma\left(W_p \cdot \mathrm{MEAN}\left(\left\{h_p^{(l-1)}\right\} \cup \left\{h_{N(p)}^{(l)}\right\}\right)\right)$$
where $h_p^{(l-1)}$ is the vector representation of the patient node p in the previous layer, and $W_p$ is the weight parameter of the patient node p.
Based on the patient node output vector representation and the medical entity node output vector representation, the link relationship probability $\hat{y}_{pd}$ between the medical entity and the patient is obtained with an activation function:
$$\hat{y}_{pd} = \delta\left(z_p^{\top} z_d\right)$$
where $z_d$ is the medical entity node output vector representation, $z_p$ is the patient node output vector representation, and δ(·) is the activation function.
Based on the medical entity node output vector representations, the link relationship probability $\hat{y}_{dd'}$ between medical entities is obtained with an activation function:
$$\hat{y}_{dd'} = \delta\left(z_d^{\top} z_{d'}\right)$$
where $z_{d'}$ is the output vector representation of another medical entity node.
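A link probability of this kind can be sketched as an activation applied to a score computed from two output vectors. Two assumptions are made here because the text does not fix them: the score is taken to be the dot product of the two vectors, and the activation δ is taken to be the logistic sigmoid.

```python
import math

def sigmoid(x):
    """Logistic sigmoid, assumed here as the activation function."""
    return 1.0 / (1.0 + math.exp(-x))

def link_probability(z_a, z_b):
    """Probability of a link between two nodes from their output vectors.

    Works for patient-entity pairs and entity-entity pairs alike, since
    both node types share the same output dimensionality.
    """
    score = sum(a * b for a, b in zip(z_a, z_b))
    return sigmoid(score)

z_p = [1.0, 2.0]   # made-up patient output vector
z_d = [3.0, 4.0]   # made-up medical entity output vector
prob_pd = link_probability(z_p, z_d)
```

With a dot-product score the probability is symmetric in its two arguments, so the entity-entity case needs no separate function.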
S4: Constructing the total loss function: as shown in Fig. 2, the node information of the heterogeneous graph G is computed with the graph neural network, and the first loss function L_1 is constructed from the cross entropy between the real link relationship between the patient and the medical entity (1 if a link relationship exists, 0 otherwise) and the link relationship probability between the patient and the medical entity, to train the patient-medical entity link prediction task;
The second loss function L_2 is constructed from the cross entropy between the real link relationship between medical entities and the link relationship probability between medical entities, to train the medical entity-medical entity link prediction task;
A multi-task weighted loss function that learns the weight factors η is constructed from the loss value of the first loss function and the loss value of the second loss function;
Training with the multi-task weighted loss function combines the two loss functions so that they are optimized simultaneously. The multi-task weighted loss function L is:
where the formula involves, for the m-th loss function, an index of its weight factor η_m and its loss value. If the weight factors η have converged, training ends and the weight factors η are obtained; otherwise, the node information of the heterogeneous graph G continues to be computed with the graph neural network.
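The combination of the two cross-entropy losses can be sketched as follows. The exact weighting formula is not recoverable from this text, so the sketch swaps in a common learnable scheme as an assumption: each task m has a parameter s_m and contributes exp(−s_m)·L_m + s_m, in the style of uncertainty-based loss weighting. The names `binary_cross_entropy`, `multitask_loss` and `s` are hypothetical.

```python
import math

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Mean cross entropy between 0/1 link labels and predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(y_true)

def multitask_loss(losses, s):
    """Assumed weighting: sum over tasks of exp(-s_m) * L_m + s_m.

    A larger s_m down-weights task m, while the additive s_m term
    discourages driving every weight toward zero.
    """
    return sum(math.exp(-s_m) * l_m + s_m for l_m, s_m in zip(losses, s))

# Made-up labels and predictions for the two link prediction tasks.
loss_1 = binary_cross_entropy([1, 0], [0.9, 0.2])   # patient-entity links
loss_2 = binary_cross_entropy([1, 1], [0.6, 0.7])   # entity-entity links
total_loss = multitask_loss([loss_1, loss_2], [0.0, 0.0])
```

With both parameters at zero the total reduces to the plain sum of the two losses; during training the parameters would be learned jointly with the network weights.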
The GraphSAGE graph neural network is trained with the total loss function, and its parameters are updated to obtain the final GraphSAGE graph neural network;
S5: In application, the medical entity vector representation and the patient vector representation are input into the final GraphSAGE graph neural network to predict the association probability between the medical entity and the patient.
With this method, the range of relationships between medical entities and patients is enlarged, so that the degree of association between medical entities and patients can be predicted accurately.
Claims (4)
1. The electronic medical record retrieval method based on the graph neural network is characterized by comprising the following steps of:
(1) Obtaining a co-occurrence matrix of medical entities in the electronic medical record, traversing medical ontology ICD codes to obtain a plurality of ancestor medical entities corresponding to the medical entities, adding co-occurrence information of the medical entities and the ancestor medical entities into the medical entity co-occurrence matrix to obtain an enhanced medical entity co-occurrence matrix, extracting each medical entity vector representation by adopting a GloVe model based on the enhanced medical entity co-occurrence matrix, and taking an aggregate result of the plurality of medical entity vector representations associated with a patient as a patient vector representation;
(2) Constructing an electronic medical record heterogeneous graph, wherein the heterogeneous graph comprises medical entity nodes, patient nodes, the real link relationships between medical entities, and the real link relationships between patients and medical entities;
using each medical entity vector representation as the initial attribute of the corresponding medical entity node, using each patient vector representation as the initial attribute of the corresponding patient node, connecting related medical entities to obtain the real link relationships between medical entities, and connecting each patient with the associated medical entities to obtain the real link relationships between patients and medical entities;
(3) Inputting the electronic medical record heterogeneous graph into the GraphSAGE graph neural network to obtain the patient node output vector representation and the medical entity node output vector representation, respectively; obtaining the link relationship probability between the patient and the medical entity with an activation function based on the patient node output vector representation and the medical entity node output vector representation; obtaining the link relationship probability between medical entities with an activation function based on the medical entity node output vector representations;
(4) Constructing a total loss function, wherein the total loss function comprises a first loss function, a second loss function and a multi-task weighted loss function;
constructing the first loss function from the cross entropy between the true link relationship between the patient and the medical entity and the probability of the link relationship between the patient and the medical entity;
constructing the second loss function from the cross entropy between the true link relationship between the medical entities and the probability of the link relationship between the medical entities;
constructing the multi-task weighted loss function from the loss value of the first loss function and the loss value of the second loss function;
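The two cross-entropy losses of step (4) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the names `binary_cross_entropy` and `weighted_total` are hypothetical, the labels and probabilities are toy values, and a plain weighted sum with fixed weights stands in for the multi-task weighted loss, whose exact form is not reproduced in this text.

```python
import math

def binary_cross_entropy(labels, probs, eps=1e-12):
    """Mean cross entropy between true links (0 or 1) and predicted link probabilities."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)  # keep log() away from 0 and 1
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def weighted_total(losses, weights):
    """Assumption: a fixed weighted sum, used here in place of the claimed weighting."""
    return sum(w * loss for w, loss in zip(weights, losses))

# Toy values only: 1 marks a true link, 0 marks the absence of a link.
patient_entity_loss = binary_cross_entropy([1, 0], [0.9, 0.2])
entity_entity_loss = binary_cross_entropy([1, 1, 0], [0.8, 0.6, 0.3])
total = weighted_total([patient_entity_loss, entity_entity_loss], [0.5, 0.5])
```

Both tasks share the same loss form, so the only task-specific inputs are the true links and predicted probabilities of each relationship type.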
(5) Training the GraphSAGE graph neural network with the total loss function and updating its parameters to obtain a final GraphSAGE graph neural network;
(6) In application, inputting the medical entity vector representation and the patient vector representation into the final GraphSAGE graph neural network to predict the association probability between the medical entity and the patient;
wherein obtaining the co-occurrence matrix of medical entities in the electronic medical record comprises:
taking, for each visit record of each patient, the product of the occurrence frequencies of every two medical entities as the co-occurrence information of those two medical entities, constructing a co-occurrence matrix for each visit record from the co-occurrence information of every two medical entities, adding the co-occurrence matrices of the multiple visit records of each patient to obtain the co-occurrence matrix of that patient's electronic medical record, and adding the co-occurrence matrices of the electronic medical records of a plurality of patients to obtain the co-occurrence matrix of medical entities in the electronic medical record;
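The construction just described can be sketched as follows, assuming each visit record is a list of entity codes and the symmetric matrix is stored sparsely as a `Counter` keyed by sorted entity pairs. The function names and the toy data are hypothetical, and leaving out an entity's co-occurrence with itself is an assumption of this sketch.

```python
from collections import Counter
from itertools import combinations

def visit_cooccurrence(visit):
    """Co-occurrence of one visit: product of the two entities' frequencies."""
    counts = Counter(visit)
    pairs = Counter()
    for a, b in combinations(sorted(counts), 2):
        pairs[(a, b)] = counts[a] * counts[b]
    return pairs

def corpus_cooccurrence(patients):
    """Add visit-level matrices per patient, then add the patient-level matrices."""
    total = Counter()
    for visits in patients:
        patient_total = Counter()
        for visit in visits:
            patient_total.update(visit_cooccurrence(visit))
        total.update(patient_total)
    return total

# Hypothetical toy data: two patients; the codes serve as labels only.
patients = [
    [["I10", "I10", "E11"], ["E11", "N18"]],
    [["I10", "E11"]],
]
matrix = corpus_cooccurrence(patients)
```

In the first visit "I10" occurs twice and "E11" once, so that visit contributes 2 to the pair; the second patient's visit contributes 1 more.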
traversing the medical ontology ICD codes to obtain a plurality of ancestor medical entities corresponding to the medical entities comprises:
taking each medical entity as a leaf node, traversing the medical ontology ICD codes from bottom to top to obtain a plurality of ancestor nodes corresponding to the leaf node, extracting the medical entities corresponding to the ancestor nodes to obtain the ancestor medical entities, obtaining the co-occurrence information of each medical entity and its ancestor medical entities in the medical ontology ICD codes, and adding this co-occurrence information to the medical entity co-occurrence matrix to expand the medical entity co-occurrence matrix;
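A minimal sketch of the bottom-up traversal and matrix expansion, assuming the ICD hierarchy is available as a child-to-parent mapping. The `parent` dictionary, the function names and the fixed `weight` added per entity and ancestor pair are illustrative assumptions; the claim does not state how the co-occurrence value for such a pair is computed.

```python
from collections import Counter

def ancestors(code, parent):
    """Walk the hierarchy bottom-up from a leaf code and collect its ancestors."""
    chain = []
    while code in parent:
        code = parent[code]
        chain.append(code)
    return chain

def enhance(cooccurrence, entities, parent, weight=1):
    """Return a copy of the matrix expanded with entity/ancestor co-occurrences."""
    enhanced = Counter(cooccurrence)
    for entity in entities:
        for ancestor in ancestors(entity, parent):
            enhanced[tuple(sorted((entity, ancestor)))] += weight
    return enhanced

# Illustrative fragment of a code hierarchy (leaf -> block -> chapter).
parent = {
    "I10": "I10-I15", "I10-I15": "IX",
    "E11": "E10-E14", "E10-E14": "IV",
}
base = Counter({("E11", "I10"): 3})
enhanced = enhance(base, ["I10", "E11"], parent)
```

The original entries are preserved, and each leaf gains one entry per ancestor, so entities that share an ancestor become indirectly related in the enhanced matrix.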
extracting each medical entity vector representation using a GloVe model based on the enhanced medical entity co-occurrence matrix comprises:
setting an initial vector representation of each medical entity, inputting the initial vector representation into the GloVe model, and obtaining the vector representation of each medical entity by training an objective function, wherein the objective function J is as follows:
wherein M_ij is the co-occurrence product of the i-th entity and the j-th entity in the enhanced medical entity co-occurrence matrix, |d| is the number of medical entities, e_j is the vector representation of the j-th medical entity, e_i is the vector representation of the i-th medical entity, b_i is the bias parameter of the i-th medical entity, and b_j is the bias parameter of the j-th medical entity;
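As a point of reference, the standard GloVe weighted least-squares objective, written with the symbols defined above, takes the following form. The weighting function f is an assumption carried over from the published GloVe model; it is not among the symbols listed in the claim, so the claimed objective may differ in that term.

```latex
J = \sum_{i=1}^{|d|} \sum_{j=1}^{|d|} f\left(M_{ij}\right)
    \left( e_i^{\top} e_j + b_i + b_j - \log M_{ij} \right)^{2}
```

Minimizing J pushes the inner product of two entity vectors, plus their biases, toward the logarithm of their co-occurrence value.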
inputting the electronic medical record heterogeneous graph into the GraphSAGE graph neural network to obtain the patient node output vector representation and the medical entity node output vector representation respectively comprises:
performing Mean aggregator aggregation on the current-layer neighbor node vector representation and the previous-layer output vector representation of the patient node to obtain the patient node output vector representation, and performing Mean aggregator aggregation on the current-layer neighbor node vector representation and the previous-layer output vector representation of the medical entity node to obtain the medical entity node output vector representation;
wherein the current-layer neighbor node vector representation is:
wherein r is a true link relationship between medical entities or a true link relationship between a patient and medical entities, R is the set of true link relationships, u is a neighbor node, v is the current node, N^(r)(v) is the set of neighbor nodes of the current node v under the true link relationship r, the neighbor node vector representations are taken from the previous layer, l is the current layer, and AGGREGATE(·) is the aggregation operation that merges the neighbor information of the current node v;
the medical entity node output vector representation is:
wherein the vector representation of the medical entity node d is taken from the previous layer, W_d is the weight parameter of the medical entity node d, MEAN(·) is an averaging function, and σ(·) is an activation function;
the patient node output vector representation is:
wherein the vector representation of the patient node p is taken from the previous layer, and W_p is the weight parameter of the patient node p.
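The mean aggregation described in claim 1 can be sketched in plain Python as below. This follows the general GraphSAGE mean-aggregator pattern and is not the exact claimed formulas: taking the sigmoid as σ, scoring a link with a dot product, the function names and the toy graph are all assumptions of this sketch.

```python
import math

def mean(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(xs) / len(vectors) for xs in zip(*vectors)]

def matvec(w, x):
    """Multiply matrix w (list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in w]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def aggregate_neighbors(h_prev, neighbors, v):
    """Mean of previous-layer vectors over v's neighbors under every relation r in R."""
    vecs = [h_prev[u] for rel in neighbors.values() for u in rel.get(v, [])]
    return mean(vecs) if vecs else [0.0] * len(h_prev[v])

def sage_mean_layer(h_prev, neighbors, weights, node_type):
    """One layer: sigma(W_type * MEAN(own previous vector, aggregated neighbors))."""
    h_next = {}
    for v, h_v in h_prev.items():
        pooled = mean([h_v, aggregate_neighbors(h_prev, neighbors, v)])
        h_next[v] = [sigmoid(z) for z in matvec(weights[node_type[v]], pooled)]
    return h_next

def link_probability(h_a, h_b):
    """Assumption: link score is the sigmoid of the dot product of two output vectors."""
    return sigmoid(sum(a * b for a, b in zip(h_a, h_b)))

# Hypothetical toy graph: one patient node p1 and two medical entity nodes d1, d2.
h0 = {"p1": [1.0, 0.0], "d1": [0.0, 1.0], "d2": [1.0, 1.0]}
neighbors = {
    "patient-entity": {"p1": ["d1", "d2"], "d1": ["p1"], "d2": ["p1"]},
    "entity-entity": {"d1": ["d2"], "d2": ["d1"]},
}
identity = [[1.0, 0.0], [0.0, 1.0]]
weights = {"patient": identity, "entity": identity}  # stands in for W_p and W_d
node_type = {"p1": "patient", "d1": "entity", "d2": "entity"}
h1 = sage_mean_layer(h0, neighbors, weights, node_type)
p_link = link_probability(h1["p1"], h1["d1"])
```

Keeping one weight matrix per node type mirrors the separate W_p and W_d parameters named in the claim; in training these would be learned rather than fixed to the identity.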
2. The electronic medical record retrieval method based on a graph neural network according to claim 1, wherein the aggregate result of the plurality of medical entity vector representations associated with the patient is taken as the patient vector representation, and the aggregation operation includes summing, averaging, taking the maximum, or taking the minimum.
3. The electronic medical record retrieval method based on a graph neural network according to claim 1, wherein the multi-task weighted loss function L is:
Wherein, Is an index of the mth loss function weighting factor eta m,The loss value of the mth loss function.
4. An electronic medical record retrieval device based on a graph neural network, comprising a computer memory, a computer processor and a computer program stored in the computer memory and executable on the computer processor, wherein the computer memory stores the final GraphSAGE graph neural network model as claimed in any one of claims 1 to 3;
the computer processor, when executing the computer program, performs the following step:
inputting the medical entity vector representation and the patient vector representation into the final GraphSAGE graph neural network to predict the association probability between the medical entity and the patient.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210291079.2A CN114639483B (en) | 2022-03-23 | 2022-03-23 | Electronic medical record retrieval method and device based on graphic neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210291079.2A CN114639483B (en) | 2022-03-23 | 2022-03-23 | Electronic medical record retrieval method and device based on graphic neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114639483A CN114639483A (en) | 2022-06-17 |
CN114639483B CN114639483B (en) | 2024-10-18 |
Family
ID=81949527
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210291079.2A Active CN114639483B (en) | 2022-03-23 | 2022-03-23 | Electronic medical record retrieval method and device based on graphic neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114639483B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114943314B (en) * | 2022-07-26 | 2023-03-24 | 牛津大学(苏州)科技有限公司 | ICD diagnosis code-based object partitioning method, storage medium and electronic medical record system |
CN115083616B (en) * | 2022-08-16 | 2022-11-08 | 之江实验室 | Chronic nephropathy subtype mining system based on self-supervision graph clustering |
CN116564535B (en) * | 2023-05-11 | 2024-02-20 | 之江实验室 | Central disease prediction method and device based on local graph information exchange under privacy protection |
CN116821375B (en) * | 2023-08-29 | 2023-12-22 | 之江实验室 | Cross-institution medical knowledge graph representation learning method and system |
CN116936108B (en) * | 2023-09-19 | 2024-01-02 | 之江实验室 | Unbalanced data-oriented disease prediction system |
CN118299064B (en) * | 2024-06-04 | 2024-10-29 | 湖南工商大学 | Rare disease-based graph model training method, application method and related equipment |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112507720A (en) * | 2020-11-12 | 2021-03-16 | 西安交通大学 | Graph convolution network root identification method based on causal semantic relation transfer |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11481418B2 (en) * | 2020-01-02 | 2022-10-25 | International Business Machines Corporation | Natural question generation via reinforcement learning based graph-to-sequence model |
Also Published As
Publication number | Publication date |
---|---|
CN114639483A (en) | 2022-06-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114639483B (en) | Electronic medical record retrieval method and device based on graphic neural network | |
Felix et al. | A review on methods and software for fuzzy cognitive maps | |
Srinivasu et al. | From blackbox to explainable AI in healthcare: existing tools and case studies | |
Ganji et al. | A fuzzy classification system based on Ant Colony Optimization for diabetes disease diagnosis | |
Azevedo et al. | Hybrid approaches to optimization and machine learning methods: a systematic literature review | |
Poornima et al. | A survey of predictive analytics using big data with data mining | |
CN111696661B (en) | Patient grouping model construction method, patient grouping method and related equipment | |
Rahman et al. | Discretization of continuous attributes through low frequency numerical values and attribute interdependency | |
EP3832487A1 (en) | Systems and methods driven by link-specific numeric information for predicting associations based on predicate types | |
Biswas et al. | Hybrid expert system using case based reasoning and neural network for classification | |
Shan et al. | The data-driven fuzzy cognitive map model and its application to prediction of time series | |
WO2024067373A1 (en) | Data processing method and related apparatus | |
Sandhya et al. | Tailored feedforward artificial neural network based link prediction | |
Parouha et al. | An innovative hybrid algorithm for bound-unconstrained optimization problems and applications | |
CN110299194B (en) | Similar case recommendation method based on comprehensive feature representation and improved wide-depth model | |
Amiriebrahimabadi et al. | A comprehensive survey of feature selection techniques based on whale optimization algorithm | |
CN114463596B (en) | Small sample image recognition method, device and equipment for hypergraph neural network | |
Ullah et al. | Heart disease classification using various heuristic algorithms | |
Yao et al. | A Novel Tropical Geometry-Based Interpretable Machine Learning Method: Pilot Application to Delivery of Advanced Heart Failure Therapies | |
Silva et al. | Continuous learning of the structure of bayesian networks: a mapping study | |
CN112070200B (en) | Harmonic group optimization method and application thereof | |
WO2021059527A1 (en) | Learning device, learning method, and recording medium | |
Rakhmetulayeva et al. | Building Disease Prediction Model Using Machine Learning Algorithms on Electronic Health Records' Logs. | |
Fister et al. | Continuous optimizers for automatic design and evaluation of classification pipelines | |
CN115238075A (en) | Text emotion classification method based on hypergraph pooling |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |