CN117153391A - Health management method based on lung nodule prediction probability and related equipment thereof
- Publication number
- CN117153391A, CN202311083935.6A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2415—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/12—Use of codes for handling textual entities
- G06F40/126—Character encoding
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/0002—Inspection of images, e.g. flaw detection
- G06T7/0012—Biomedical image inspection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/74—Image or video pattern matching; Proximity measures in feature spaces
- G06V10/75—Organisation of the matching processes, e.g. simultaneous or sequential comparisons of image or video features; Coarse-fine approaches, e.g. multi-scale approaches; using context analysis; Selection of dictionaries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/766—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using regression, e.g. by projecting features on hyperplanes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H10/00—ICT specially adapted for the handling or processing of patient-related medical or healthcare data
- G16H10/60—ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H30/00—ICT specially adapted for the handling or processing of medical images
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10072—Tomographic images
- G06T2207/10081—Computed x-ray tomography [CT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30061—Lung
- G06T2207/30064—Lung nodule
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30004—Biomedical image processing
- G06T2207/30096—Tumor; Lesion
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The application belongs to the field of artificial intelligence, is applied to the field of digital healthcare, and relates to a health management method based on lung nodule prediction probability and related equipment thereof. The method comprises: extracting target basic information and target lung nodule characteristic information from acquired CT image report data to be detected; obtaining target lung nodule risk factors from the target basic information and the target lung nodule characteristic information based on a constructed lung nodule probability prediction model; inputting the target lung nodule risk factors into the lung nodule probability prediction model to obtain a lung nodule prediction probability; determining a lung nodule risk level of the target patient based on the lung nodule prediction probability; and matching a corresponding health management scheme for the target patient according to the risk level and sending the scheme to the target patient. In addition, the application relates to blockchain technology, and the CT image report data can be stored in a blockchain. The application enables patients to understand their own health condition intuitively, realizes personalized patient management, and improves treatment efficiency.
Description
Technical Field
The application relates to the fields of artificial intelligence technology and digital healthcare, in particular to a health management method based on lung nodule prediction probability and related equipment thereof.
Background
Hospital informatization is an ongoing trend, and the basic infrastructure of hospitals today already supports medical information data. However, electronic medical records contain a large number of unstructured descriptions and lack data management capability and the ability to associate data with knowledge, so big data and artificial intelligence techniques cannot be applied to them directly. Free text must first be converted into normalized, standardized and structured data through systematic data analysis and mining before it can be analyzed and used.
CT (Computed Tomography) image text reports in traditional medical settings contain many specialized descriptions. Patients who lack medical knowledge and do not understand medical terminology have difficulty reading and interpreting an image report, and a report of an abnormality tends to cause anxiety. Because patients cannot obtain a convenient self-assessment service, they cannot correctly judge their current physical condition. Some patients with mild findings may worry and seek excessive treatment, while some high-risk patients may not take their findings seriously and delay treatment, resulting in a mismatch of medical resources.
In addition, within the short time of an outpatient visit, the doctor may not be able to explain the condition fully to the patient, and the patient may be misled by incorrect medical information on the Internet. This multiplies the patient's stress, wastes substantial manpower, material and financial resources, and prevents the patient from obtaining the greatest possible benefit from medical care services.
Disclosure of Invention
The embodiment of the application aims to provide a health management method, apparatus, computer device and storage medium based on lung nodule prediction probability, so as to solve the technical problems in the prior art that patients have difficulty reading and understanding an image report, cannot correctly judge their current physical condition, are easily misled, and cannot obtain the greatest possible benefit from medical care services.
In order to solve the above technical problems, the embodiment of the application provides a health management method based on lung nodule prediction probability, which adopts the following technical scheme:
acquiring CT image report data to be detected of a target patient, and extracting target basic information and target lung nodule characteristic information from the CT image report data to be detected;
obtaining a target lung nodule risk factor from the target basic information and the target lung nodule characteristic information based on the constructed lung nodule probability prediction model;
inputting the target lung nodule risk factor into the lung nodule probability prediction model to obtain a lung nodule prediction probability;
determining a lung nodule risk level for the target patient based on the lung nodule prediction probability;
and matching a corresponding health management scheme for the target patient according to the risk level, and sending the health management scheme to the target patient.
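The last two steps above (risk level determination and scheme matching) can be sketched as follows. This is only an illustrative sketch: the probability cut-offs, the three level names and the scheme texts are hypothetical placeholders, since the source does not specify them.

```python
# Hypothetical cut-offs and scheme texts: the source does not specify them.
RISK_THRESHOLDS = [(0.65, "high"), (0.05, "medium")]  # checked from highest to lowest
HEALTH_SCHEMES = {
    "low": "Routine annual follow-up.",
    "medium": "Follow-up CT at the interval advised by a clinician.",
    "high": "Prompt referral to a specialist.",
}


def risk_level(probability: float) -> str:
    """Map a lung nodule prediction probability to a risk level."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    for threshold, level in RISK_THRESHOLDS:
        if probability >= threshold:
            return level
    return "low"


def match_scheme(probability: float) -> str:
    """Return the health management scheme matched to the risk level."""
    return HEALTH_SCHEMES[risk_level(probability)]
```

A table-driven mapping like this keeps the thresholds in one place, so they could be revised without touching the matching logic.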
Further, before the step of acquiring the CT image report data to be detected of the target patient, the method further includes:
acquiring historical CT image text report data, and extracting basic information and historical lung nodule characteristic information of a patient from the historical CT image text report data;
performing entity relation labeling on the historical lung nodule characteristic information to obtain entity relation labeling data;
training based on the entity relationship labeling data to obtain an entity relationship joint extraction model;
acquiring the latest CT image report data of the patient according to the basic information, and inputting the latest CT image report data into the entity relationship joint extraction model to extract entity relationships so as to obtain lung entity relationships;
and obtaining a lung nodule probability prediction model by using logistic regression based on the basic information and the lung entity relationship.
Further, the step of training to obtain the entity relationship joint extraction model based on the entity relationship labeling data comprises the following steps:
dividing the entity relation annotation data into a training set and a testing set according to a preset proportion;
inputting the training set of annotation data into a pre-constructed initial model, and outputting a predicted entity relation triplet;
calculating a loss function according to the predicted entity relationship triplet and the entity relationship in the training set;
adjusting model parameters of the initial model based on the loss function until the model converges, and outputting a model to be verified;
inputting the test set into the model to be verified for verification to obtain a verification result;
and outputting a final entity relationship joint extraction model when the verification result meets the preset condition.
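The data split and the verification check in the steps above can be sketched as follows. The 80/20 ratio, the fixed random seed, and the use of a triplet-level F1 score as the "preset condition" are assumptions made for illustration; the source names neither the ratio nor the metric.

```python
import random


def split_dataset(samples, train_ratio=0.8, seed=0):
    """Shuffle the annotated samples and split them by a preset ratio."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must lie strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def triplet_f1(predicted, gold):
    """F1 score over exactly matching (head, relation, tail) triplets."""
    predicted, gold = set(predicted), set(gold)
    correct = len(predicted & gold)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def passes_verification(predicted, gold, min_f1=0.8):
    """Hypothetical preset condition: F1 on the test set reaches min_f1."""
    return triplet_f1(predicted, gold) >= min_f1
```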
Further, the initial model includes an input layer, an encoding layer, a head entity recognition layer and a relation and tail entity joint recognition layer, and the step of inputting the annotation data training set into a pre-constructed initial model and outputting a predicted entity relationship triplet includes:
transmitting the training set to the encoding layer through the input layer to perform feature extraction, and obtaining word coding vectors containing context semantic information;
inputting the word coding vectors into the head entity recognition layer for recognition, and outputting all head entities;
fusing each head entity with the corresponding word coding vector to obtain an entity coding vector of each head entity;
inputting the entity coding vector into the relation and tail entity joint recognition layer for recognition to obtain all specific relations and tail entities of each head entity;
and obtaining a predicted entity relationship triplet based on the head entity, the specific relationship and the tail entity.
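The final assembly step above can be illustrated with a short sketch. The input layout (a mapping from each head entity to its relations and tail entities) and the relation names are hypothetical; the source does not describe the data structures used.

```python
def build_triplets(head_entities, relation_tails):
    """Combine head entities with their recognized relations and tail entities.

    relation_tails maps head entity -> {relation: [tail entity, ...]}.
    """
    triplets = []
    for head in head_entities:
        for relation, tails in relation_tails.get(head, {}).items():
            for tail in tails:
                triplets.append((head, relation, tail))
    return triplets
```

A head entity with no recognized relation simply contributes no triplet.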
Further, the step of inputting the word coding vector into the head entity recognition layer for recognition and outputting all head entities includes:
calculating probabilities of the word coding vectors serving as a starting position and an ending position respectively through the head entity recognition layer to obtain a first starting probability and a first ending probability;
and obtaining all head entities by utilizing a nearest matching principle according to the first starting probability and the first ending probability.
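One way to read the nearest matching principle is sketched below: each position whose start probability exceeds a threshold is paired with the closest position at or after it whose end probability exceeds the threshold. The 0.5 threshold, the whitespace-joined token output and the example tokens are assumptions for illustration only.

```python
def decode_head_entities(tokens, start_probs, end_probs, threshold=0.5):
    """Pair each likely start position with the nearest likely end position."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    entities = []
    for start in starts:
        # ends is in ascending order, so the first candidate is the nearest one.
        candidates = [end for end in ends if end >= start]
        if candidates:
            end = candidates[0]
            entities.append((start, end, " ".join(tokens[start:end + 1])))
    return entities
```

A start position with no qualifying end position is dropped rather than guessed.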
Further, the step of fusing each head entity with the corresponding word coding vector to obtain an entity coding vector of each head entity includes:
acquiring the word coding vector corresponding to each word in each head entity to obtain an entity word coding vector;
and calculating the average vector of all the entity word coding vectors, and fusing the head entity and the average vector to obtain the entity coding vector.
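The averaging and fusion steps can be sketched as follows. The source does not state how the fusion is performed; element-wise addition of the average vector to every word coding vector is assumed here as one common choice, and plain Python lists stand in for the encoder outputs.

```python
def mean_vector(vectors):
    """Element-wise average of equally sized vectors."""
    count = len(vectors)
    dim = len(vectors[0])
    return [sum(vec[d] for vec in vectors) / count for d in range(dim)]


def fuse_head_entity(word_vectors, start, end):
    """Add the averaged head entity vector to each word coding vector.

    start and end are inclusive token indices of the head entity.
    Addition is an assumed fusion operator, not one named by the source.
    """
    entity_vec = mean_vector(word_vectors[start:end + 1])
    return [[w + e for w, e in zip(vec, entity_vec)] for vec in word_vectors]
```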
Further, the step of obtaining the lung nodule probability prediction model by using logistic regression based on the basic information and the pulmonary entity relationship includes:
acquiring basic influence factors based on the basic information, and extracting pulmonary abnormality influence factors from the pulmonary entity relationship;
taking the basic influence factors and the pulmonary abnormality influence factors as lung nodule risk factors, and analyzing the lung nodule risk factors by using logistic regression to obtain a regression coefficient for each lung nodule risk factor;
and obtaining a lung nodule probability prediction model according to the regression coefficients.
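A minimal sketch of these steps is given below, assuming the regression coefficients are estimated by batch gradient descent on the log-loss. The fitting procedure, learning rate, epoch count and the two example features (nodule diameter in centimetres and a 0/1 spiculation flag) are illustrative assumptions, and the toy data is made up.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_probability(theta, x):
    """theta[0] is the intercept; theta[1:] are the regression coefficients."""
    z = theta[0] + sum(t * xi for t, xi in zip(theta[1:], x))
    return sigmoid(z)


def fit_logistic_regression(X, y, lr=0.1, epochs=2000):
    """Estimate coefficients by batch gradient descent on the log-loss."""
    theta = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for xi, yi in zip(X, y):
            err = predict_probability(theta, xi) - yi
            grad[0] += err
            for j, value in enumerate(xi):
                grad[j + 1] += err * value
        theta = [t - lr * g / n for t, g in zip(theta, grad)]
    return theta


# Made-up toy data: [nodule diameter in cm, spiculation flag], label 1 = malignant.
X = [[0.4, 0], [0.5, 0], [0.6, 0], [1.5, 1], [2.0, 1], [2.5, 0]]
y = [0, 0, 0, 1, 1, 1]
theta = fit_logistic_regression(X, y)
```

Once fitted, `predict_probability(theta, x)` plays the role of the lung nodule probability prediction model for a new risk-factor vector `x`.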
In order to solve the above technical problems, the embodiment of the present application further provides a health management device based on lung nodule prediction probability, which adopts the following technical scheme:
the extraction module is used for acquiring CT image report data to be detected of a target patient and extracting target basic information and target lung nodule characteristic information from the CT image report data to be detected;
the acquisition module is used for acquiring a target lung nodule risk factor from the target basic information and the target lung nodule characteristic information based on the constructed lung nodule probability prediction model;
the probability prediction module is used for inputting the target lung nodule risk factor into the lung nodule probability prediction model to obtain a lung nodule prediction probability;
the grade determination module is used for determining a lung nodule risk level of the target patient based on the lung nodule prediction probability;
and the matching module is used for matching the corresponding health management scheme for the target patient according to the risk level and sending the health management scheme to the target patient.
In order to solve the above technical problems, the embodiment of the present application further provides a computer device, which adopts the following technical schemes:
the computer device includes a memory and a processor, the memory storing computer readable instructions which, when executed by the processor, implement the steps of the health management method based on lung nodule prediction probability as described above.
In order to solve the above technical problems, an embodiment of the present application further provides a computer readable storage medium, which adopts the following technical schemes:
the computer readable storage medium has stored thereon computer readable instructions which when executed by a processor implement the steps of the method of health management based on lung nodule prediction probabilities as described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
according to the application, the lung nodule probability of a patient is predicted by the constructed lung nodule probability prediction model, and the patient's lung nodule risk level is assessed from that probability, so that the patient can understand his or her own health condition more intuitively and the patient's anxiety is reduced. A corresponding health management scheme is matched for the patient according to the risk level and the basic information, which realizes personalized patient management and improves treatment efficiency, while ensuring that the patient obtains the greatest possible benefit from medical care services, reducing overtreatment and unnecessary surgery, and avoiding the waste of substantial manpower, material and financial resources.
Drawings
In order to illustrate the solution of the present application more clearly, the drawings required for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a method of health management based on lung nodule prediction probabilities in accordance with the present application;
FIG. 3 is a flow chart of another embodiment of a method of health management based on a predicted probability of a lung nodule according to the present application;
FIG. 4 is a flow chart of one embodiment of step S303 of FIG. 3;
FIG. 5 is a schematic structural view of one embodiment of a health management device based on lung nodule prediction probabilities in accordance with the present application;
FIG. 6 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description of the application herein is for the purpose of describing particular embodiments only and is not intended to limit the application. The terms "comprising" and "having" and any variations thereof in the description, the claims and the above description of the drawings are intended to cover a non-exclusive inclusion. The terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish between different objects and not to describe a particular order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor do they refer to separate or alternative embodiments that are mutually exclusive of other embodiments. It is explicitly and implicitly understood by those skilled in the art that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
The application provides a health management method based on lung nodule prediction probability, which relates to artificial intelligence and can be applied to the system architecture 100 shown in FIG. 1. The system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages and the like. Various communication client applications, such as web browsers, shopping applications, search applications, instant messaging tools, email clients and social platform software, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the health management method based on the lung nodule prediction probability provided by the embodiment of the present application is generally executed by a server/terminal device, and accordingly, the health management apparatus based on the lung nodule prediction probability is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flowchart of one embodiment of a method of health management based on lung nodule prediction probabilities according to the present application is shown, comprising the steps of:
Step S201, acquiring CT image report data to be detected of a target patient, and extracting target basic information and target lung nodule characteristic information from the CT image report data to be detected.
Detecting and treating potential early-stage lung cancer lesions in time plays an important role in reducing the morbidity and mortality of lung cancer patients. Lung nodules are one of the most important signs of lung cancer, and chest CT examination is a routine method for assessing lung nodules that can show their location, size, morphology, density, margins, internal features and the like.
In this embodiment, CT image report data to be detected of the target patient is acquired. The CT image report data includes basic information of the patient and lung nodule characteristic information. The basic information includes, but is not limited to, age, sex, smoking status (whether the patient smokes, amount smoked, smoking history, time since quitting, etc.), current tumor history and past tumor history; the lung nodule characteristic information includes, but is not limited to, nodule location in the lung, nodule type, nodule diameter, spiculation (burr sign), and the like.
It should be emphasized that, to further ensure the privacy and security of the CT image report data, the CT image report data may also be stored in a node of a blockchain.
A blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another by cryptographic methods, each data block containing a batch of network transaction information that is used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Step S202, obtaining target lung nodule risk factors from target basic information and target lung nodule characteristic information based on the constructed lung nodule probability prediction model.
The lung nodule probability prediction model is obtained by variable analysis with a logistic regression model, in which the variables are the lung nodule risk factors: age, smoking history, malignant tumor history (history of extrathoracic malignant tumor before the lung nodule was found), nodule diameter, burr sign, nodule position and the like. The logistic regression model takes $z = \theta_0 + \theta_1 x_1 + \dots + \theta_n x_n$ as the weight function, where $\theta_0$ and $\theta_i$ are model parameters and $x_i$ is a lung nodule risk factor, and takes the logistic function $g(z) = \frac{1}{1 + e^{-z}}$ as the probability function. Combining the weight function and the probability function gives the lung nodule probability prediction model:
$$P = \frac{1}{1 + e^{-(\theta_0 + \theta_1 x_1 + \dots + \theta_n x_n)}}$$
The output of the lung nodule probability prediction model lies between 0 and 1. Smoking history, malignant tumor history, burr sign and the like are qualitative indexes, for which "yes" is recorded as 1 and "no" is recorded as 0.
In this embodiment, the information corresponding to the lung nodule risk factors of the lung nodule probability prediction model is extracted from the target basic information and the target lung nodule characteristic information, and serves as the target lung nodule risk factors.
Step S203, inputting the target lung nodule risk factor into a lung nodule probability prediction model to obtain the prediction probability of the lung nodule.
In this embodiment, the target lung nodule risk factor is input into the lung nodule probability prediction model, and the lung nodule prediction probability can be calculated.
Step S204, determining the lung nodule risk level of the target patient based on the prediction probability.
The lung nodule risk levels can be set according to actual conditions: the higher the prediction probability, the higher the lung nodule risk level and the greater the likelihood of lung cancer.
For example, the lung nodule risk is divided into 5 risk levels: 0 < P ≤ 0.2 is extremely low risk; 0.2 < P ≤ 0.4 is low risk; 0.4 < P ≤ 0.6 is medium risk; 0.6 < P ≤ 0.8 is high risk; and 0.8 < P ≤ 1 is extremely high risk.
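The five-level example above can be sketched as a small lookup. This is only an illustration of that example banding; the function name `risk_level` and the label strings are assumptions introduced here, and the bands themselves may be set differently in practice.

```python
def risk_level(p: float) -> str:
    """Map a predicted probability P to one of the five example risk levels."""
    if not 0.0 < p <= 1.0:
        raise ValueError("P must lie in the interval (0, 1]")
    # Upper bounds are inclusive, matching 0 < P <= 0.2, 0.2 < P <= 0.4, ...
    bands = [
        (0.2, "extremely low"),
        (0.4, "low"),
        (0.6, "medium"),
        (0.8, "high"),
        (1.0, "extremely high"),
    ]
    for upper, label in bands:
        if p <= upper:
            return label
    raise AssertionError("unreachable: p <= 1.0 was checked above")
```

A probability exactly on a boundary, such as P = 0.2, falls into the lower band because each upper bound is inclusive.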
Step S205, matching the corresponding health management scheme for the target patient according to the risk level, and sending the health management scheme to the target patient.
In this embodiment, according to the risk level, the basic information and the lung nodule characteristic information of the target patient are combined to match the corresponding health management scheme for the target patient, so that the personalized management scheme can be formulated for patients with different risk levels, the patient can be helped to control the development of lung nodules, relevant diseases can be prevented and treated, the life quality can be improved, and the medical cost and the social cost of the patient can be reduced.
The health management scheme specifically comprises the following contents:
1) Recommended follow-up frequency: the follow-up frequency is set according to the risk level and the specific condition of the lung nodule patient. Generally, high-risk patients require more frequent follow-up, while the number of follow-ups for low-risk patients may be suitably reduced; that is, the higher the risk level, the higher the follow-up frequency.
2) Recommended lifestyle: patients with lung nodules need to maintain a healthy lifestyle, including stopping smoking, adjusting diet, exercising appropriately, and the like. In addition, damage to the lungs from air pollution, respiratory tract infections and the like should be avoided.
3) Health education and psychological support: relevant health education is provided to help lung nodule patients take a scientific view of their condition and to improve their health awareness and health literacy, and professional psychological support and care is provided to prevent negative emotions such as anxiety and depression caused by the illness.
4) Recommended examination scheme: a personalized treatment scheme is formulated according to the properties of the lung nodule and the specific condition of the patient, and changes in the patient's condition are closely monitored. In addition to periodic chest CT examinations, lung nodule patients also need lung function examinations, blood examinations, urine examinations and the like to assess the effect of the lung nodule on the body. The treatment scheme is adjusted dynamically according to changes in the lung nodule and the progression of the patient's disease.
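One way the four parts of the scheme could be assembled from a risk level is sketched below. The names `HealthPlan`, `match_plan` and `RISK_ORDER` are hypothetical, and `follow_ups_per_year` is a placeholder number that is not taken from the text; only the rule that a higher risk level means more frequent follow-up comes from the description.

```python
from dataclasses import dataclass
from typing import List

# Risk levels ordered from lowest to highest, as in the five-level example.
RISK_ORDER = ["extremely low", "low", "medium", "high", "extremely high"]


@dataclass
class HealthPlan:
    risk_level: str
    follow_ups_per_year: int  # placeholder number, not a value from the text
    lifestyle: List[str]
    education_and_support: List[str]
    examinations: List[str]


def match_plan(risk_level: str, smoker: bool) -> HealthPlan:
    """Build a plan whose follow-up frequency rises with the risk level."""
    rank = RISK_ORDER.index(risk_level)  # raises ValueError for an unknown level
    lifestyle = [
        "adjust diet",
        "exercise appropriately",
        "avoid air pollution and respiratory infections",
    ]
    if smoker:
        # Basic information (smoking status) personalizes the advice.
        lifestyle.insert(0, "stop smoking")
    return HealthPlan(
        risk_level=risk_level,
        follow_ups_per_year=rank + 1,  # placeholder: only the ordering matters
        lifestyle=lifestyle,
        education_and_support=["health education", "psychological support"],
        examinations=["chest CT", "lung function", "blood", "urine"],
    )
```

The sketch combines the risk level with one item of basic information (smoking status); a fuller matcher would presumably also use the lung nodule characteristic information.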
According to the application, the patient's lung nodule probability is predicted by the constructed lung nodule probability prediction model, and the patient's lung nodule risk level is estimated from that probability, so that the patient can understand his or her health condition more intuitively and the patient's anxiety is reduced. A corresponding health management scheme is matched to the patient according to the risk level and the basic information, which realizes individualized management of the patient and improves treatment efficiency, while ensuring that the patient obtains medical care services to the greatest extent, reducing overtreatment and unnecessary surgery, and avoiding the waste of large amounts of manpower, material resources and financial resources.
In some optional implementations, the method further includes constructing a lung nodule probability prediction model before the step of acquiring the CT image report data to be measured of the target patient, where the method for constructing the lung nodule probability prediction model includes the following steps:
Step S301, acquiring historical CT image text report data, and extracting the patient's basic information and historical lung nodule characteristic information from the historical CT image text report data.
In this embodiment, the historical CT image text report data may be text report data generated by a doctor annotating lung nodules in CT images after detecting the clinical lung CT scan images, where the annotation content may include, but is not limited to, lung position, nodule type, nodule diameter size, burr features, etc.
The historical CT image text report data is taken as a sample set, and the basic information and historical lung nodule characteristic information of the patient in each historical CT image text report are extracted from the sample set.
Step S302, labeling entity relationships in the historical lung nodule characteristic information to obtain entity relationship labeling data.
Entities are defined and a labeling specification is drawn up according to the relevant medical information, so that the subjects and objects between which relationships are established are clearly identified. By way of example, the entities include lung position, nodule type, nodule diameter, nodule burr and the like.
The entity definition is as follows:
In this embodiment, labeling labels are determined according to the subject-object relationships between entities. The labeling labels include entity labels and relationship labels, and the historical lung nodule characteristic information is labeled with them to obtain the entity relationship labeling data. The entity relationship labeling data consists of (head entity, relationship, tail entity) triples, in which the head entity is the subject and the tail entity is the object.
Step S303, training based on entity relationship labeling data to obtain an entity relationship joint extraction model.
In this embodiment, the entity relationship joint extraction model is a CasRel-based entity relationship extraction model. The model first identifies all possible subjects; then, for each given relationship category, it identifies the objects associated with each subject. A relationship is regarded as a function f_r(sub) -> obj, in which the head entity sub is the independent variable and the tail entity obj is the dependent variable: the head entities are extracted first, and the corresponding tail entities are then extracted for each relationship category.
By way of example, for the report sentence "the right middle lung shows a lobed nodule shadow, long diameter about 15mm, with burr shadows visible around it", the entities are right middle lung, lobed nodule shadow, 15mm and burr shadow, and the triples of the entity relationship labeling data are (right middle lung, nodule type, lobed nodule shadow), (right middle lung, nodule diameter, 15mm) and (right middle lung, burr feature, burr shadow).
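The triples in this example can be held in a simple data structure. The sketch below is illustrative only: `Triple` and `tails_by_relation` are hypothetical names, and the strings are the translated entities from the example sentence.

```python
from typing import Dict, List, NamedTuple


class Triple(NamedTuple):
    head: str      # head entity (subject)
    relation: str  # relationship category
    tail: str      # tail entity (object)


annotated = [
    Triple("right middle lung", "nodule type", "lobed nodule shadow"),
    Triple("right middle lung", "nodule diameter", "15mm"),
    Triple("right middle lung", "burr feature", "burr shadow"),
]


def tails_by_relation(triples: List[Triple], head: str) -> Dict[str, List[str]]:
    """View each relationship r as a function f_r(head) -> tail entities."""
    result: Dict[str, List[str]] = {}
    for t in triples:
        if t.head == head:
            result.setdefault(t.relation, []).append(t.tail)
    return result
```

Calling `tails_by_relation(annotated, "right middle lung")` groups the three tail entities under their relationship categories, which mirrors the "subject first, then objects per relationship" order of the extraction model.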
Training the pre-constructed entity relationship joint extraction model by using entity relationship labeling data to obtain a trained entity relationship joint extraction model, wherein the model can be used for identifying entities in CT image report data.
Step S304, the latest CT image report data of the patient is obtained according to the basic information, and the latest CT image report data is input into the entity relationship joint extraction model to carry out entity relationship extraction, so as to obtain the lung entity relationship.
In this embodiment, the latest CT image report data of the patient is obtained according to the personal information in the basic information, and the latest CT image report data can represent the change condition and the latest disease development of the lung nodule of the patient.
The latest CT image report data is input into the entity relationship joint extraction model for entity relationship extraction, thereby obtaining the lung entity relationship.
Step S305, obtaining a lung nodule probability prediction model by logistic regression based on the basic information and the lung entity relationship.
Risk factors that give rise to lung nodules are screened out from the basic information and the lung entity relationship as the lung nodule risk factors, and the lung nodule probability prediction model is obtained with a logistic regression model.
Specifically, basic influencing factors are obtained from the basic information, and lung abnormality influencing factors are extracted from the lung entity relationship; the basic influencing factors and the lung abnormality influencing factors are taken as the lung nodule risk factors, and the lung nodule risk factors are analyzed by logistic regression to obtain the regression coefficient of each lung nodule risk factor; the lung nodule probability prediction model is then obtained from the regression coefficients.
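How regression coefficients might be estimated from labeled samples can be sketched with plain gradient descent on the logistic loss. This is an assumption for illustration: the text does not say which fitting procedure is used, and `fit_logistic`, the learning rate and the epoch count are hypothetical choices.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows: Sequence[Sequence[float]], labels: Sequence[int],
                 lr: float = 0.1, epochs: int = 2000) -> List[float]:
    """Return [theta_0, theta_1, ..., theta_n] fitted by batch gradient descent.

    rows   : one list of risk-factor values x_1..x_n per sample
    labels : 1 or 0 outcome per sample
    """
    theta = [0.0] * (len(rows[0]) + 1)  # theta[0] is the intercept
    for _ in range(epochs):
        grad = [0.0] * len(theta)
        for x, y in zip(rows, labels):
            p = sigmoid(theta[0] + sum(t * v for t, v in zip(theta[1:], x)))
            err = p - y  # derivative of the logistic loss w.r.t. the weight function
            grad[0] += err
            for j, v in enumerate(x):
                grad[j + 1] += err * v
        theta = [t - lr * g / len(rows) for t, g in zip(theta, grad)]
    return theta
```

In practice a statistics package would normally be used so that significance tests on each risk factor are also available; the loop above only shows where the coefficients come from.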
The basic influencing factors are age, smoking history and malignant tumor history; the lung abnormality influencing factors are nodule diameter, burr sign and nodule position. These influencing factors are taken as the lung nodule risk factors and substituted into the logistic regression model for analysis, and the resulting lung nodule probability prediction model is:
$$P = \frac{1}{1 + e^{-x}}$$
where x = -6.8272 + (0.0391 × age) + (0.7917 × smoking history) + (1.3388 × malignant tumor history) + (0.1274 × nodule diameter) + (1.0407 × burr sign) + (0.7838 × nodule position).
Substituting the specific values of the lung nodule risk factors into the lung nodule probability prediction model gives the lung nodule probability. The lung nodule risk factors are all obtained by questioning the patient or by routine examination and involve no invasive examination or operation, which keeps the patient's discomfort to a minimum; the model obtained by logistic regression is based on clinical big data methods and is highly reliable.
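The substitution step can be sketched as follows, using the coefficients stated above. Several points are assumptions rather than statements from the text: the standard logistic function is used as the probability function, qualitative factors (including nodule position) are coded 1 or 0, the nodule diameter is given in millimetres, and the dictionary keys are hypothetical names.

```python
import math

# Intercept and regression coefficients as stated in the text.
INTERCEPT = -6.8272
COEFFS = {
    "age": 0.0391,
    "smoking_history": 0.7917,      # 1 = yes, 0 = no
    "malignancy_history": 1.3388,   # 1 = yes, 0 = no
    "nodule_diameter": 0.1274,      # assumed to be in millimetres
    "burr_sign": 1.0407,            # 1 = yes, 0 = no
    "nodule_position": 0.7838,      # assumed 1/0 coding; not specified in the text
}


def lung_nodule_probability(factors: dict) -> float:
    """Compute P from the linear predictor x and the logistic function."""
    x = INTERCEPT + sum(COEFFS[name] * float(factors[name]) for name in COEFFS)
    return 1.0 / (1.0 + math.exp(-x))
```

A missing risk factor raises `KeyError` rather than silently defaulting to zero, which seems the safer behaviour for a value that feeds a risk level.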
According to the application, entity relationships are extracted with the trained entity relationship joint extraction model, so that vectors that better represent the entities are obtained and the accuracy of entity relationship extraction is improved. Meanwhile, the lung nodule probability prediction model provided by the application is simple and easy to use: all of the indexes it uses can be obtained by routine examination, and the model can provide doctors with effective intermediate reference information for further diagnosis and treatment, giving it a high reference value.
In some optional implementations of this embodiment, the step of training to obtain the entity relationship joint extraction model based on the entity relationship labeling data includes:
Step S401, dividing the entity relationship labeling data into a training set and a test set according to a preset proportion.
Specifically, the data may be divided into a training set and a test set at a ratio of 7:3 or 8:2. The specific ratio can be chosen according to actual conditions; usually the training set contains more data than the test set.
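A minimal sketch of such a split is shown below. Shuffling before the split and the fixed random seed are assumptions added here for reproducibility; `split_dataset` is a hypothetical name.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(samples: Sequence, train_ratio: float = 0.7,
                  seed: int = 0) -> Tuple[List, List]:
    """Shuffle the labeled samples and split them into (training set, test set)."""
    if not 0.0 < train_ratio < 1.0:
        raise ValueError("train_ratio must be strictly between 0 and 1")
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)  # local generator; global state untouched
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]
```

With `train_ratio=0.7` this corresponds to the 7:3 division, and with `train_ratio=0.8` to the 8:2 division.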
Step S402, inputting the annotation data training set into a pre-constructed initial model, and outputting a predicted entity relationship triplet.
In this embodiment, the initial model includes an input layer, an encoding layer, a head entity identification layer, and a relationship and tail entity joint identification layer. Wherein the input layer is used for converting input data into a format which can be processed inside the model.
The training set is passed through the input layer to the coding layer for feature extraction, giving word coding vectors that contain contextual semantic information. Using pre-trained BERT as the coding layer allows more sentence features to be learned: features are extracted from the entity labeling data in the training set, the semantic information of each word in its context is captured, and word coding vectors containing that semantic information are obtained.
The word coding vectors are combined into an input sequence $h_n$, which is input to the head entity identification layer to identify and output all head entities (subjects). Specifically, the head entity identification layer calculates the probability that each word coding vector is a start position and an end position, giving a first start probability and a first end probability; all head entities are then obtained from the first start probability and the first end probability using the nearest matching principle.
It should be noted that, the word encoding vector is a generalized word vector, which may be a vector corresponding to a chinese character or a vector corresponding to an english word, and in this embodiment, a word encoding vector is a token.
In this embodiment, the head entity identification layer decodes the encoded input sequence $h_n$ and obtains the start position and end position of each entity, so as to identify all possible head entities. The head entity identification layer uses the following calculation formulas:
$$p_i^{start\_s} = \sigma(\omega_{start} x_i + b_{start})$$
$$p_i^{end\_s} = \sigma(\omega_{end} x_i + b_{end})$$
where $p_i^{start\_s}$ and $p_i^{end\_s}$ respectively denote the probability that the i-th word coding vector in the input sequence $h_n$ is identified as the start position and the end position of a head entity; if the probability exceeds a preset threshold, the word coding vector is marked 1, otherwise it is marked 0; $\sigma$ is the sigmoid activation function; $\omega_{start}$ and $\omega_{end}$ are trainable weights; $b_{start}$ and $b_{end}$ are bias values; and $x_i = h_n[i]$ is the word coding of the i-th token in the input sequence $h_n$.
The nearest matching principle is: obtain all positions whose start mark is 1 and all positions whose end mark is 1, and take each start marked 1 together with the nearest following end marked 1 as one complete head entity. This matching method ensures the integrity of the entity and improves the accuracy of identifying lung nodule head entities in CT image report data.
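The thresholding and nearest matching steps can be sketched as below. The threshold of 0.5 is an assumed value for the "preset threshold", the function name `decode_spans` is hypothetical, and returning spans as inclusive `(start, end)` index pairs is a design choice of this sketch.

```python
from typing import List, Sequence, Tuple


def decode_spans(start_probs: Sequence[float], end_probs: Sequence[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Pair each start marked 1 with the nearest end marked 1 at or after it."""
    # A position is marked 1 when its probability exceeds the threshold.
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]
    spans = []
    for s in starts:
        following = [e for e in ends if e >= s]  # ends is already ascending
        if following:
            spans.append((s, following[0]))  # nearest end to the right
    return spans
```

A start with no end marked 1 at or after it is dropped, so an incomplete entity is never emitted.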
After the head entities are identified, the relationships and tail entities are identified jointly. Each layer of the relationship and tail entity joint identification layer has the same structure as the head entity identification layer, and decoding of the obj entity (tail entity) is similar to decoding of the sub entity (head entity). The difference between the relationship and tail entity joint identification layer and the head entity identification layer is that its input includes not only the word coding vectors containing contextual semantic information from the coding layer but also the feature information of the head entity extracted by the head entity identification layer, so that each relationship is modeled as a function that maps a head entity to tail entities.
Before input to the relationship and tail entity joint identification layer, each head entity is fused with its corresponding word coding vectors to obtain the entity coding vector of that head entity. Specifically, the word coding vector corresponding to each word in the head entity is obtained, giving the entity word coding vectors; the average vector of all the entity word coding vectors is calculated, and the head entity and the average vector are fused to obtain the entity coding vector. This improves the accuracy of identifying lung nodule tail entities under a specific relationship in CT image report data.
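The averaging step can be sketched with plain lists of floats. The text does not spell out how the average vector is fused with the token sequence; the element-wise addition in `fuse` follows the usual CasRel formulation and should be read as an assumption, as should both function names.

```python
from typing import List, Tuple

Vector = List[float]


def entity_vector(token_vectors: List[Vector], span: Tuple[int, int]) -> Vector:
    """Average the word coding vectors between start and end (inclusive)."""
    start, end = span
    words = token_vectors[start:end + 1]
    dim = len(words[0])
    return [sum(v[d] for v in words) / len(words) for d in range(dim)]


def fuse(token_vectors: List[Vector], span: Tuple[int, int]) -> List[Vector]:
    """Add the head entity's average vector to every token vector (assumed fusion)."""
    v_sub = entity_vector(token_vectors, span)
    return [[x + s for x, s in zip(vec, v_sub)] for vec in token_vectors]
```

The fused sequence is what the relationship and tail entity joint identification layer would then score for start and end positions under each relationship.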
The entity coding vectors are input to the relationship and tail entity joint identification layer for identification, giving all the specific relationships and tail entities of each head entity.
In this embodiment, the relationship and tail entity joint identification layer uses the following calculation formulas:
$$p_i^{start\_o} = \sigma(\omega_{start}^{r}(x_i + v_{sub}^{k}) + b_{start}^{r})$$
$$p_i^{end\_o} = \sigma(\omega_{end}^{r}(x_i + v_{sub}^{k}) + b_{end}^{r})$$
where $p_i^{start\_o}$ and $p_i^{end\_o}$ respectively denote the probability that the i-th word coding vector in the input sequence $h_n$ is identified as the start position and the end position of a tail entity; if the probability exceeds a preset threshold, the word coding vector is marked 1, otherwise it is marked 0; r denotes a specific relationship; and $v_{sub}^{k}$ denotes the average vector of all word coding vectors between the start position and the end position of the k-th head entity, i.e. the entity coding vector.
It should be noted that if the binary judgment of start_o and end_o yields 0 at every position, there is no corresponding obj entity under that specific relationship; that is, no triple with that relationship exists when sub is the head entity.
According to the embodiment, the entity relation extraction model based on CasRel is adopted, so that the problem of entity overlapping of EPO (Entity Pair Overlap: multiple relations exist between the same pair of entities)/SEO (Single Entity Overlap: association relations exist between a plurality of entities and the same entity) in the entity relation can be solved.
Step S403, calculating a loss function according to the predicted entity relationship triplet and the entity relationship in the training set.
In this embodiment, the loss function is the sum of two parts, the head entity identification layer decoding loss L(S) and the relationship and tail entity joint identification layer decoding loss L(O), each of which uses a binary cross-entropy loss function.
The entity relationship labeling data in the training set contains all the actual entity relationship triples, so the head entity identification layer decoding loss L(S) and the relationship and tail entity joint identification layer decoding loss L(O) can be calculated from the predicted entity relationship triples and the actual entity relationship triples.
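The loss described here can be sketched in plain Python. The averaging over positions, the clipping constant `eps`, and the argument layout of `joint_loss` are assumptions of this sketch; a real training loop would compute the same quantity with a deep learning framework so that gradients are available.

```python
import math
from typing import Sequence, Tuple


def bce(probs: Sequence[float], labels: Sequence[int], eps: float = 1e-12) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 marks."""
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(probs)


Tagged = Tuple[Sequence[float], Sequence[int]]  # (probabilities, 0/1 labels)


def joint_loss(head_start: Tagged, head_end: Tagged,
               tail_start: Tagged, tail_end: Tagged) -> float:
    """Total loss = L(S) for head entities + L(O) for relation-specific tails."""
    loss_s = bce(*head_start) + bce(*head_end)
    loss_o = bce(*tail_start) + bce(*tail_end)
    return loss_s + loss_o
```

For several head entities and relationship categories, L(O) would be accumulated over every (head entity, relationship) pair before being added to L(S).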
Step S404, adjusting the model parameters of the initial model based on the loss function until the model converges, and outputting the model to be verified.
The model parameters are adjusted according to the loss function and iterative training continues. Once the model has been trained to the point where its performance is at its best and the loss function value no longer decreases, the model has converged, and the model to be verified is output.
Step S405, inputting the test set into a model to be verified for verification, and obtaining a verification result.
In order to verify the extraction effect of the model accurately and to analyze each aspect of its performance comprehensively, this embodiment evaluates the experimental results with three evaluation indexes commonly used in machine learning: precision P (Precision), recall R (Recall) and F value (F-Score).
Specifically, the test set is input into the model to be verified, the extraction result is output, and the precision, recall and F value are calculated from the extraction result and the actual entity relationship triples.
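The three indexes can be computed over sets of triples as sketched below. Two points are assumptions: a predicted triple counts as correct only if head entity, relationship and tail entity all match exactly, and the F value is taken to be the balanced F1 score. The function name `precision_recall_f1` is hypothetical.

```python
from typing import Iterable, Tuple

TripleT = Tuple[str, str, str]  # (head entity, relationship, tail entity)


def precision_recall_f1(predicted: Iterable[TripleT],
                        actual: Iterable[TripleT]) -> Tuple[float, float, float]:
    """Exact-match precision, recall and F1 between predicted and actual triples."""
    pred, gold = set(predicted), set(actual)
    correct = len(pred & gold)
    precision = correct / len(pred) if pred else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```

The returned values can then be compared against the preset thresholds to decide whether the verification result meets the preset condition.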
Step S406, when the verification result meets the preset condition, outputting a final entity relationship joint extraction model.
When the verification result meets the preset condition, that is, when the precision, recall and F value all reach their preset thresholds, the entity relationship extraction effect of the model is optimal and the model can effectively extract entity relationships from CT image text report data, which also verifies the validity of the entity relationship joint extraction method. When the verification result does not meet the preset condition, a training set can be acquired again and training repeated until the verification result meets the preset condition.
The application is operational with numerous general purpose or special purpose computer system environments or configurations. For example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by computer readable instructions stored in a computer readable storage medium that, when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk, a Read-Only Memory (ROM), or a random access Memory (Random Access Memory, RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 5, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a health management device based on a lung nodule prediction probability, where the embodiment of the device corresponds to the embodiment of the method shown in fig. 2, and the device is particularly applicable to various electronic devices.
As shown in fig. 5, the health management device 500 based on the lung nodule prediction probability according to the present embodiment includes: an extraction module 501, an acquisition module 502, a probability prediction module 503, a level judgment module 504, and a matching module 505. Wherein:
the extraction module 501 is used for acquiring CT image report data to be detected of a target patient, and extracting target basic information and target lung nodule characteristic information from the CT image report data to be detected;
the obtaining module 502 is configured to obtain a target lung nodule risk factor from the target basic information and the target lung nodule feature information based on the constructed lung nodule probability prediction model;
the probability prediction module 503 is configured to input the target lung nodule risk factor into the lung nodule probability prediction model to obtain a lung nodule prediction probability;
the level judgment module 504 is used for determining a lung nodule risk level of the target patient based on the prediction probability;
The matching module 505 is configured to match a corresponding health management scheme for the target patient according to the risk level, and send the health management scheme to the target patient.
It should be emphasized that, to further ensure the privacy and security of the CT image report data, the CT image report data may also be stored in a node of a blockchain.
With the above health management device based on the lung nodule prediction probability, the patient's lung nodule probability is predicted by the constructed lung nodule probability prediction model, and the patient's lung nodule risk level is estimated from that probability, so that the patient can understand his or her health condition more intuitively and the patient's anxiety is reduced. A corresponding health management scheme is matched to the patient according to the risk level and the basic information, which realizes individualized management of the patient and improves treatment efficiency, while ensuring that the patient obtains medical care services to the greatest extent, reducing overtreatment and unnecessary surgery, and avoiding the waste of large amounts of manpower, material resources and financial resources.
In some alternative implementations, the health management device 500 based on the lung nodule prediction probability further includes an acquisition module, a labeling module, a training module, an entity extraction module, and a construction module, wherein:
The acquisition module is used for acquiring historical CT image text report data, and extracting basic information and historical lung nodule characteristic information of a patient from the historical CT image text report data;
the labeling module is used for labeling the entity relationship of the historical lung nodule characteristic information to obtain entity relationship labeling data;
the training module is used for training based on the entity relationship labeling data to obtain an entity relationship joint extraction model;
the entity extraction module is used for acquiring the latest CT image report data of the patient according to the basic information, inputting the latest CT image report data into the entity relation joint extraction model for entity relation extraction, and obtaining a lung entity relation;
the construction module is used for obtaining a lung nodule probability prediction model by logistic regression based on the basic information and the lung entity relationship.
Entity relationships are extracted with the trained entity relationship joint extraction model, so that vectors that better represent the entities are obtained and the accuracy of entity relationship extraction is improved. Meanwhile, the lung nodule probability prediction model provided by the application is simple and easy to use: all of the indexes it uses can be obtained by routine examination, and the model can provide doctors with effective intermediate reference information for further diagnosis and treatment, giving it a high reference value.
In this embodiment, the training module includes a dividing sub-module, an entity prediction sub-module, a loss sub-module, an adjustment sub-module, and a verification sub-module, where:
the dividing sub-module is used for dividing the entity relation annotation data into a training set and a testing set according to a preset proportion;
the entity prediction sub-module is used for inputting the marked data training set into a pre-constructed initial model and outputting a predicted entity relation triplet;
the loss submodule is used for calculating a loss function according to the predicted entity relation triplet and the entity relation in the training set;
the adjusting submodule is used for adjusting model parameters of the initial model based on the loss function until the model converges and outputting a model to be verified;
the verification sub-module is used for inputting the test set into the model to be verified for verification, and obtaining a verification result; and outputting a final entity relationship joint extraction model when the verification result meets the preset condition.
The embodiment can effectively extract the entity relationship in the CT image text report data, and simultaneously proves the effectiveness of the entity relationship joint extraction method.
In some optional implementations of the present embodiment, the entity prediction submodule includes a feature extraction unit, a head entity identification unit, a fusion unit, a relationship and tail entity identification unit, and a combination unit, where:
The feature extraction unit is used for transmitting the training set to the coding layer through the input layer to perform feature extraction, so as to obtain word coding vectors containing context semantic information;
the head entity recognition unit is used for inputting the word coding vector into the head entity recognition layer for recognition and outputting all head entities;
the fusion unit is used for fusing each head entity with the corresponding word coding vector to obtain the entity coding vector of each head entity;
the relation and tail entity identification unit is used for inputting the entity coding vector into the relation and tail entity joint identification layer for identification to obtain all specific relations and tail entities of each head entity;
the combining unit is used for obtaining a predicted entity relationship triplet based on the head entity, the specific relationship and the tail entity.
The embodiment can solve the EPO/SEO entity overlapping problem in the entity relationship.
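The combination step can be illustrated as follows. Because tail entities are identified separately for each head entity and each specific relation, one head entity may take part in several triplets, and the same entity pair may carry more than one relation. These two cases are commonly called single entity overlap (SEO) and entity pair overlap (EPO). The sketch below is hypothetical: the nested dictionary layout, the relation names, and the sample entities are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Triplet = Tuple[str, str, str]


def combine_triplets(tails_by_head: Dict[str, Dict[str, List[str]]]) -> List[Triplet]:
    """Expand {head: {relation: [tail, ...]}} into (head, relation, tail) triplets."""
    triplets: List[Triplet] = []
    for head, tails_by_relation in tails_by_head.items():
        for relation, tails in tails_by_relation.items():
            for tail in tails:
                triplets.append((head, relation, tail))
    return triplets


# Hypothetical output of the relation and tail entity joint identification layer.
example = {
    "nodule": {
        "located_in": ["right upper lobe"],
        "has_size": ["8 mm"],
        "has_density": ["ground-glass"],
    },
}

for triplet in combine_triplets(example):
    print(triplet)
```

Here the head entity "nodule" is shared by three triplets, which a scheme that assigns only one label per word could not represent.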
In this embodiment, the head entity recognition unit is further configured to: calculate, through the head entity recognition layer, the probability of each word coding vector serving as a starting position and as an ending position, to obtain a first starting probability and a first ending probability; and obtain all head entities by utilizing a nearest matching principle according to the first starting probability and the first ending probability.
The embodiment can improve the accuracy of the identification of the pulmonary nodule head entity in the CT image report data.
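The nearest matching principle can be sketched as below: every position whose starting probability exceeds a threshold is paired with the closest position at or after it whose ending probability exceeds the threshold. The function name, the 0.5 threshold, and the sample probabilities are assumptions for illustration; the text does not specify them.

```python
from typing import List, Sequence, Tuple


def decode_spans(start_probs: Sequence[float], end_probs: Sequence[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index pairs chosen by nearest matching."""
    starts = [i for i, p in enumerate(start_probs) if p > threshold]
    ends = [i for i, p in enumerate(end_probs) if p > threshold]  # ascending order
    spans: List[Tuple[int, int]] = []
    for start in starts:
        # The first candidate end at or after the start is the nearest one.
        nearest_end = next((end for end in ends if end >= start), None)
        if nearest_end is not None:
            spans.append((start, nearest_end))
    return spans


# Hypothetical probabilities for a six-word report fragment.
words = ["right", "upper", "lobe", "ground", "glass", "nodule"]
start_probs = [0.9, 0.1, 0.2, 0.8, 0.1, 0.1]
end_probs = [0.1, 0.2, 0.9, 0.1, 0.1, 0.7]

for start, end in decode_spans(start_probs, end_probs):
    print(" ".join(words[start:end + 1]))
```

A start position with no candidate end after it is simply dropped, which is one possible way to handle unmatched starts.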
In this embodiment, the fusion unit is further configured to: acquire the word coding vector corresponding to each word in each head entity to obtain entity word coding vectors; calculate the average vector of all the entity word coding vectors; and fuse the head entity with the average vector to obtain the entity coding vector.
The embodiment can improve the accuracy of the identification of the pulmonary nodule tail entity under the specific relation in the CT image report data.
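A possible reading of the fusion step is sketched below. The text does not say which operation performs the fusion, so element-wise addition of the average head entity vector to every word coding vector is an assumption here, as are the function names and the toy two-dimensional vectors.

```python
from typing import List, Tuple

Vector = List[float]


def mean_vector(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equally sized vectors, dimension by dimension."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    return [sum(vec[d] for vec in vectors) / len(vectors) for d in range(dim)]


def fuse_head_entity(word_vectors: List[Vector], span: Tuple[int, int]) -> List[Vector]:
    """Add the average vector of the head entity span (inclusive indices)
    to every word coding vector. Addition is an assumed fusion operation."""
    start, end = span
    average = mean_vector(word_vectors[start:end + 1])
    return [[x + a for x, a in zip(vec, average)] for vec in word_vectors]


# Toy word coding vectors; the head entity covers words 0 and 1.
word_vectors = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
fused = fuse_head_entity(word_vectors, (0, 1))
print(fused)
```

With this choice, every position carries information about the current head entity, so the downstream layer can identify tails conditioned on that head.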
In some alternative implementations, the build module includes an extraction sub-module, an analysis sub-module, and a build sub-module, wherein:
the extraction submodule is used for acquiring basic influence factors based on the basic information and extracting lung abnormal influence factors from the lung entity relationship;
the analysis submodule is used for taking the basic influence factors and the lung abnormal influence factors as lung nodule risk factors, and analyzing the lung nodule risk factors by utilizing logistic regression to obtain the regression coefficient of each lung nodule risk factor;
and the construction submodule is used for obtaining a lung nodule probability prediction model according to the regression coefficients.
The lung nodule risk factors in the embodiment are all obtained through inquiry or routine examination of the patient and do not involve invasive examination or operation, which minimizes the pain caused to the patient; the model obtained by logistic regression is built on clinical big data methods and has high reliability.
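Once the regression coefficients are known, a prediction model of this kind reduces to the standard logistic function applied to a weighted sum of the risk factors. The sketch below shows that calculation only; the factor names, coefficient values, and intercept are invented for illustration and are not taken from the application.

```python
import math
from typing import Dict


def predict_probability(intercept: float, coefficients: Dict[str, float],
                        factors: Dict[str, float]) -> float:
    """Return 1 / (1 + exp(-z)) with z = intercept + sum(coefficient * factor).

    Raises KeyError if a factor named in coefficients is missing from factors.
    """
    z = intercept + sum(coefficients[name] * factors[name] for name in coefficients)
    # Numerically stable evaluation of the logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    exp_z = math.exp(z)
    return exp_z / (1.0 + exp_z)


# Hypothetical coefficients and one hypothetical patient.
coefficients = {"age_over_60": 0.8, "smoker": 1.1, "nodule_diameter_mm": 0.15}
patient = {"age_over_60": 1.0, "smoker": 0.0, "nodule_diameter_mm": 8.0}

probability = predict_probability(-3.0, coefficients, patient)
print(f"predicted probability: {probability:.3f}")
```

The resulting probability is what a later step would compare against risk-level cut-offs; those cut-offs are not given in this section.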
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 6, fig. 6 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 6 comprises a memory 61, a processor 62, and a network interface 63 communicatively connected to each other via a system bus. It is noted that only the computer device 6 having components 61-63 is shown in the figure, but it should be understood that not all of the illustrated components are required to be implemented and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device herein is a device capable of automatically performing numerical calculations and/or information processing in accordance with predetermined or stored instructions, the hardware of which includes, but is not limited to, microprocessors, application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), digital signal processors (DSPs), embedded devices, etc.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 61 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 61 may be an internal storage unit of the computer device 6, such as a hard disk or a memory of the computer device 6. In other embodiments, the memory 61 may also be an external storage device of the computer device 6, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device 6. Of course, the memory 61 may also comprise both an internal storage unit of the computer device 6 and an external storage device. In this embodiment, the memory 61 is typically used to store an operating system and various application software installed on the computer device 6, such as computer readable instructions of a health management method based on the lung nodule prediction probability. Further, the memory 61 may be used to temporarily store various types of data that have been output or are to be output.
The processor 62 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 62 is typically used to control the overall operation of the computer device 6. In this embodiment, the processor 62 is configured to execute computer readable instructions stored in the memory 61 or process data, such as computer readable instructions for executing the health management method based on the predicted probability of lung nodules.
The network interface 63 may comprise a wireless network interface or a wired network interface, which network interface 63 is typically used for establishing a communication connection between the computer device 6 and other electronic devices.
According to the method, the steps of the health management method based on the lung nodule prediction probability in the embodiment are realized when the processor executes the computer readable instructions stored in the memory, the lung nodule probability of a patient is predicted through the constructed lung nodule probability prediction model, and the lung nodule risk level of the patient is estimated through the lung nodule probability, so that the patient can more intuitively know the health condition of the patient, and the anxiety degree of the patient is reduced; according to the risk grade and the basic information, corresponding health management schemes are matched for the patient, the individual management of the patient is realized, the treatment efficiency is improved, and meanwhile, the patient is ensured to obtain medical care services to the maximum extent.
The present application also provides another embodiment, namely, a computer readable storage medium, where computer readable instructions are stored, where the computer readable instructions are executable by at least one processor, so that the at least one processor performs the steps of the health management method based on lung nodule prediction probability, predicts the lung nodule probability of a patient through a constructed lung nodule probability prediction model, and evaluates the lung nodule risk level of the patient by using the lung nodule probability, so that the patient can more intuitively understand the health condition of the patient, and reduce the anxiety level of the patient; according to the risk grade and the basic information, corresponding health management schemes are matched for the patient, the individual management of the patient is realized, the treatment efficiency is improved, and meanwhile, the patient is ensured to obtain medical care services to the maximum extent.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
It is apparent that the above-described embodiments are only some, not all, embodiments of the present application. The preferred embodiments of the present application are shown in the drawings, which do not limit the scope of the patent claims. This application may be embodied in many different forms and is not limited to the embodiments described herein; rather, these embodiments are provided so that the present disclosure will be understood thoroughly and completely. Although the application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described in the foregoing description, or equivalents may be substituted for elements thereof. All equivalent structures made using the content of the specification and the drawings of the application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of the application.
Claims (10)
1. A method of health management based on lung nodule prediction probability, comprising the steps of:
acquiring CT image report data to be detected of a target patient, and extracting target basic information and target lung nodule characteristic information from the CT image report data to be detected;
obtaining a target lung nodule risk factor from the target basic information and the target lung nodule characteristic information based on the constructed lung nodule probability prediction model;
Inputting the target lung nodule risk factor into the lung nodule probability prediction model to obtain lung nodule prediction probability;
determining a lung nodule risk level for the target patient based on the predictive probability;
and matching a corresponding health management scheme for the target patient according to the risk level, and sending the health management scheme to the target patient.
2. The method of claim 1, further comprising, prior to the step of acquiring the CT image report data to be measured for the target patient:
acquiring historical CT image text report data, and extracting basic information and historical lung nodule characteristic information of a patient from the historical CT image text report data;
performing entity relation labeling on the historical lung nodule characteristic information to obtain entity relation labeling data;
training based on the entity relationship labeling data to obtain an entity relationship joint extraction model;
acquiring the latest CT image report data of the patient according to the basic information, and inputting the latest CT image report data into the entity relationship joint extraction model to extract entity relationships so as to obtain lung entity relationships;
and obtaining a lung nodule probability prediction model by utilizing logistic regression based on the basic information and the lung entity relationship.
3. The method for health management based on lung nodule prediction probability according to claim 2, wherein the step of training to obtain the entity relationship joint extraction model based on the entity relationship labeling data comprises:
dividing the entity relation annotation data into a training set and a testing set according to a preset proportion;
inputting the marked data training set into a pre-constructed initial model, and outputting a predicted entity relation triplet;
calculating a loss function according to the predicted entity relationship triplet and the entity relationship in the training set;
adjusting model parameters of the initial model based on the loss function until the model converges, and outputting a model to be verified;
inputting the test set into the model to be verified for verification to obtain a verification result;
and outputting a final entity relationship joint extraction model when the verification result meets the preset condition.
4. The method for health management based on lung nodule prediction probability of claim 3, wherein the initial model comprises an input layer, a coding layer, a head entity recognition layer and a relationship and tail entity joint recognition layer, the step of inputting the labeled data training set into the pre-constructed initial model and outputting a predicted entity relationship triplet comprises:
Transmitting the training set to the coding layer through the input layer to perform feature extraction, and obtaining word coding vectors containing context semantic information;
inputting the word coding vector into the head entity recognition layer for recognition, and outputting all head entities;
fusing each head entity with the corresponding word coding vector to obtain an entity coding vector of each head entity;
inputting the entity coding vector into the relation and tail entity joint identification layer for identification to obtain all specific relations and tail entities of each head entity;
and obtaining a predicted entity relationship triplet based on the head entity, the specific relationship and the tail entity.
5. The method of claim 4, wherein the step of inputting the word encoding vector into the head entity recognition layer for recognition and outputting all head entities comprises:
calculating probabilities of the word coding vectors serving as a starting position and an ending position respectively through the head entity recognition layer to obtain a first starting probability and a first ending probability;
and obtaining all head entities by utilizing a nearest matching principle according to the first starting probability and the first ending probability.
6. The method of claim 4, wherein the step of fusing each of the head entities with the corresponding word encoding vector to obtain an entity encoding vector for each of the head entities comprises:
acquiring the word coding vector corresponding to each word in each head entity to obtain an entity word coding vector;
and calculating the average vector of all the entity word coding vectors, and fusing the head entity and the average vector to obtain the entity coding vector.
7. The method of claim 2, wherein the step of obtaining a lung nodule probability prediction model using logistic regression based on the basic information and the pulmonary entity relationship comprises:
acquiring basic influence factors based on the basic information, and extracting lung abnormal influence factors from the lung entity relationship;
taking the basic influence factors and the pulmonary abnormality influence factors as pulmonary nodule risk factors, and analyzing the pulmonary nodule risk factors by using logistic regression to obtain the regression coefficient of each pulmonary nodule risk factor;
and obtaining a lung nodule probability prediction model according to the regression coefficients.
8. A health management device based on lung nodule prediction probability, comprising:
the extraction module is used for acquiring CT image report data to be detected of a target patient and extracting target basic information and target lung nodule characteristic information from the CT image report data to be detected;
the acquisition module is used for acquiring a target lung nodule risk factor from the target basic information and the target lung nodule characteristic information based on the constructed lung nodule probability prediction model;
the probability prediction module is used for inputting the target lung nodule risk factor into the lung nodule probability prediction model to obtain lung nodule prediction probability;
a grade judgment module for determining a lung nodule risk grade of the target patient based on the predictive probability;
and the matching module is used for matching the corresponding health management scheme for the target patient according to the risk level and sending the health management scheme to the target patient.
9. A computer device comprising a memory having stored therein computer readable instructions which when executed implement the steps of the lung nodule prediction probability based health management method of any of claims 1 to 7.
10. A computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the lung nodule prediction probability based health management method of any of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311083935.6A CN117153391A (en) | 2023-08-25 | 2023-08-25 | Health management method based on lung nodule prediction probability and related equipment thereof |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117153391A (en) | 2023-12-01 |
Family
ID=88886022
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311083935.6A Pending CN117153391A (en) | 2023-08-25 | 2023-08-25 | Health management method based on lung nodule prediction probability and related equipment thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117153391A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN119296795A (en) * | 2024-12-12 | 2025-01-10 | 宁波大学 | Pulmonary nodule risk prediction system and method for patients with chronic obstructive pulmonary disease |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112949786B (en) | Data classification identification method, device, equipment and readable storage medium | |
US10593423B2 (en) | Classifying medically relevant phrases from a patient's electronic medical records into relevant categories | |
US10553308B2 (en) | Identifying medically relevant phrases from a patient's electronic medical records | |
US11031103B2 (en) | Personalized questionnaire for health risk assessment | |
CN108899064A (en) | Electronic health record generation method, device, computer equipment and storage medium | |
CN113707299B (en) | Auxiliary diagnosis method and device based on inquiry session and computer equipment | |
US10984024B2 (en) | Automatic processing of ambiguously labeled data | |
WO2021120688A1 (en) | Medical misdiagnosis detection method and apparatus, electronic device and storage medium | |
US10936962B1 (en) | Methods and systems for confirming an advisory interaction with an artificial intelligence platform | |
CN110867231A (en) | Disease prediction method, device, computer equipment and medium based on text classification | |
US20240053307A1 (en) | Identifying Repetitive Portions of Clinical Notes and Generating Summaries Pertinent to Treatment of a Patient Based on the Identified Repetitive Portions | |
US11837343B2 (en) | Identifying repetitive portions of clinical notes and generating summaries pertinent to treatment of a patient based on the identified repetitive portions | |
US20240029714A1 (en) | Speech signal processing and summarization using artificial intelligence | |
CN117153391A (en) | Health management method based on lung nodule prediction probability and related equipment thereof | |
CN115858886A (en) | Data processing method, device, equipment and readable storage medium | |
US12087442B2 (en) | Methods and systems for confirming an advisory interaction with an artificial intelligence platform | |
US12073930B1 (en) | Apparatus and a method for generating a user report | |
CN118824484A (en) | Doctor diagnosis and treatment ability evaluation method, system, storage medium and electronic device | |
CN117350291A (en) | Electronic medical record named entity identification method, device, equipment and storage medium | |
CN116796840A (en) | Medical entity information extraction method, device, computer equipment and storage medium | |
Wu et al. | Developing EMR-based algorithms to Identify hospital adverse events for health system performance evaluation and improvement: Study protocol | |
CN111079420B (en) | Text recognition method and device, computer readable medium and electronic equipment | |
CN113764063A (en) | Physical examination report processing method, device, equipment and storage medium | |
CN114882993B (en) | Method, device, medium and electronic equipment for generating explanation element of problem | |
CN117457130A (en) | Text processing method, device, equipment and storage medium thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||