CN109102896A - A kind of method of generating classification model, data classification method and device - Google Patents
- Publication number
- CN109102896A, CN201810712862.5A
- Authority
- CN
- China
- Prior art keywords
- data
- sign data
- index value
- sign
- health
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Computation (AREA)
- Databases & Information Systems (AREA)
- Biomedical Technology (AREA)
- Primary Health Care (AREA)
- General Health & Medical Sciences (AREA)
- Epidemiology (AREA)
- Pathology (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The embodiments of the present application disclose a classification model generation method, a data classification method, and corresponding devices. The method comprises: obtaining original sign data, where each piece of original sign data includes at least one index value; finding, in the original sign data, the sign data with missing index values; filling in the missing index values to generate filled sign data; and using the original sign data without missing index values together with the filled sign data as training data, training an initial classification model on the training data and the data classification label of each piece of training data to generate a sign data classification model. The generated model can classify any sign data, and the classification results can assist doctors in diagnosis. The application thus mines the internal relationships in a large volume of original sign data to build a sign data classification model, improving the utilization of the original sign data.
Description
Technical field
This application relates to the field of data processing, and in particular to a classification model generation method and device and a data classification method and device.
Background
China has a large population and therefore a large number of patients. Patient visits produce a large number of medical records, which contain a large amount of medical data, such as the sign data produced by a patient's medical examinations. In the prior art, medical records are usually kept by the patient or consulted by doctors, but this large body of medical data is not effectively mined or used, so the utilization of medical data is low.
Summary of the invention
In view of this, the embodiments of the present application provide a classification model generation method and device and a data classification method and device, which further analyze and use sign data and improve the utilization of medical data.
To solve the above problem, the technical solutions provided by the embodiments of the present application are as follows:
A classification model generation method, the method comprising:
Obtaining original sign data, where each piece of original sign data includes at least one index value;
Finding, in the original sign data, the sign data with missing index values;
Filling in the missing index values in the sign data with missing index values to generate filled sign data;
Using the sign data without missing index values in the original sign data and the filled sign data as training data, and training an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In one possible implementation, the method further comprises:
Before filling in the missing index values in the sign data with missing index values, normalizing the index values in each piece of original sign data.
In one possible implementation, filling in the missing index values in the sign data with missing index values to generate filled sign data comprises:
For any piece of sign data with missing index values, determining the index item corresponding to each missing index value;
Generating multiple data filling results for that index item by applying several data filling algorithms to the index values of that index item in the other original sign data;
Calculating the average of the multiple data filling results and filling in the missing index value of that index item with the average, to generate the filled sign data.
In one possible implementation, the data filling algorithms include any number of the maximum likelihood estimation method, the mean filling method, and the approximate completion method.
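The averaging of several filling results can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: records are plain dictionaries of already normalized values, `mean_fill` and `nearest_fill` are hypothetical helper names, a nearest-neighbour fill stands in for an approximate-completion style method, and maximum likelihood estimation is left out for brevity.

```python
from statistics import mean

def mean_fill(record, item, others):
    """Mean filling: average of the item over the other records that have it."""
    return mean(r[item] for r in others if item in r)

def nearest_fill(record, item, others):
    """Nearest-neighbour filling (an assumed stand-in): copy the value of the
    most similar record, compared on the index items both records share."""
    def distance(other):
        shared = [k for k in record if k in other and k != item]
        if not shared:
            return float("inf")
        return mean((record[k] - other[k]) ** 2 for k in shared)
    donors = [r for r in others if item in r]
    return min(donors, key=distance)[item]

def fill_missing(record, item, others, fillers=(mean_fill, nearest_fill)):
    """Run every filling algorithm, then fill the item with the average result."""
    results = [f(record, item, others) for f in fillers]
    filled = dict(record)
    filled[item] = mean(results)
    return filled

# Hypothetical normalized data: index item "b" is missing from `record`.
others = [{"a": 0.2, "b": 0.4}, {"a": 0.8, "b": 0.6}]
record = {"a": 0.25}
filled = fill_missing(record, "b", others)
print(filled)
```

Passing the filling functions as a tuple keeps the averaging step independent of which algorithms are used, so a further estimator could be added without changing `fill_missing`.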
In one possible implementation, the initial classification model is a naive Bayes model or a decision tree model.
A data classification method, the method comprising:
Obtaining sign data to be classified, where the sign data to be classified includes at least one index value;
If the sign data to be classified has missing index values, filling in the missing index values, inputting the filled sign data into a sign data classification model, and obtaining the classification result of the sign data to be classified;
If the sign data to be classified has no missing index values, inputting the sign data to be classified into the sign data classification model and obtaining the classification result of the sign data to be classified;
The sign data classification model is generated by the classification model generation method described above.
In one possible implementation, the method further comprises:
Normalizing the index values in the sign data to be classified.
In one possible implementation, filling in the missing index values when the sign data to be classified has missing index values comprises:
If the sign data to be classified has missing index values, determining the index item corresponding to each missing index value in the sign data to be classified;
Generating multiple data filling results for that index item by applying several data filling algorithms to the index values of that index item in the original sign data;
Calculating the average of the multiple data filling results and filling in the missing index value of that index item in the sign data to be classified with the average, to obtain the filled sign data to be classified.
A classification model generation device, the device comprising:
An acquiring unit, configured to obtain original sign data, where each piece of original sign data includes at least one index value;
A searching unit, configured to find, in the original sign data, the sign data with missing index values;
A filling unit, configured to fill in the missing index values in the sign data with missing index values, to generate filled sign data;
A generating unit, configured to use the sign data without missing index values in the original sign data and the filled sign data as training data, and to train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In one possible implementation, the device further comprises:
A normalization unit, configured to normalize the index values in each piece of original sign data before the missing index values in the sign data with missing index values are filled in.
In one possible implementation, the filling unit specifically comprises:
A determining subunit, configured to determine, for any piece of sign data with missing index values, the index item corresponding to each missing index value;
A generating subunit, configured to generate multiple data filling results for that index item by applying several data filling algorithms to the index values of that index item in the other original sign data;
A filling subunit, configured to calculate the average of the multiple data filling results and fill in the missing index value of that index item with the average, to generate the filled sign data.
In one possible implementation, the data filling algorithms include any number of the maximum likelihood estimation method, the mean filling method, and the approximate completion method.
In one possible implementation, the initial classification model is a naive Bayes model or a decision tree model.
A data classification device, the device comprising:
An acquiring unit, configured to obtain sign data to be classified, where the sign data to be classified includes at least one index value;
A first obtaining unit, configured to, if the sign data to be classified has missing index values, fill in the missing index values, input the filled sign data into a sign data classification model, and obtain the classification result of the sign data to be classified;
A second obtaining unit, configured to, if the sign data to be classified has no missing index values, input the sign data to be classified into the sign data classification model and obtain the classification result of the sign data to be classified;
The sign data classification model is generated by the classification model generation device described above.
In one possible implementation, the device further comprises:
A normalization unit, configured to normalize the index values in the sign data to be classified.
In one possible implementation, the first obtaining unit specifically comprises:
A determining subunit, configured to determine the index item corresponding to each missing index value in the sign data to be classified;
A generating subunit, configured to generate multiple data filling results for that index item by applying several data filling algorithms to the index values of that index item in the original sign data;
A filling subunit, configured to calculate the average of the multiple data filling results and fill in the missing index value of that index item in the sign data to be classified with the average, to obtain the filled sign data to be classified.
A computer-readable storage medium storing instructions which, when run on a terminal device, cause the terminal device to execute the classification model generation method or the data classification method.
A computer program product which, when run on a terminal device, causes the terminal device to execute the classification model generation method or the data classification method.
It can be seen that the embodiments of the present application have the following beneficial effects:
After obtaining the original sign data, the embodiments of the present application fill in the missing index values so that every piece of data is complete, and use the result as training data. The initial classification model is trained on the training data and its classification labels to generate a sign data classification model. The generated model can classify any sign data, and the classification results can assist doctors in diagnosis. The embodiments thus mine the internal relationships in a large volume of original sign data to build a sign data classification model, improving the utilization of the original sign data.
Brief description of the drawings
Fig. 1 is a flowchart of a classification model generation method provided by an embodiment of the present application;
Fig. 2 is a flowchart of a data filling method provided by an embodiment of the present application;
Fig. 3 is a flowchart of classification model training provided by an embodiment of the present application;
Fig. 4 is a flowchart of a data classification method provided by an embodiment of the present application;
Fig. 5 is a flowchart of another data filling method provided by an embodiment of the present application;
Fig. 6 is a flowchart of data classification provided by an embodiment of the present application;
Fig. 7 is a structural diagram of a classification model generation device provided by an embodiment of the present application;
Fig. 8 is a structural diagram of a data classification device provided by an embodiment of the present application.
Detailed description of the embodiments
To make the above objects, features, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings and specific implementations.
To make the technical solution provided by the present application easier to understand, its research background is first briefly described below.
With the continuous development of the computer field, data mining has attracted great attention in many domains. Data mining refers to using algorithms to search large amounts of data for the information and knowledge hidden in them, and converting what is found into useful information that guides subsequent work.
In the existing medical field, however, the large amount of medical data produced by patient visits is not effectively mined to assist doctors in medical diagnosis. As a result, a large amount of medical data sits idle and is wasted.
On this basis, the present application proposes a classification model generation method and device and a data classification method and device. A large volume of original sign data is used as training data to mine its internal relationships and build a sign data classification model. The model is then used to classify the sign data to be classified, and the classification results are provided to doctors to assist medical diagnosis, improving the utilization of the original sign data.
The classification model generation method provided by the embodiments of the present application is described below with reference to the drawings.
Referring to Fig. 1, which shows a flowchart of a classification model generation method provided by an embodiment of the present application, the method includes:
S101: Obtain original sign data, where each piece of original sign data includes at least one index value.
In this embodiment, classifying sign data first requires generating a classification model through training, and generating the classification model first requires obtaining original sign data. Original sign data refers to the sign data produced when a patient undergoes a medical examination. It may include at least one index value, such as a blood pressure value, a blood glucose value, body temperature, or heart rate.
In practical applications, a large amount of original sign data may be obtained to ensure the accuracy of the trained classification model. A medical examination can be divided into several items, such as a blood routine test, a urine routine test, and biochemical analysis, and a patient may choose to have one item or several items checked. The original sign data may therefore include one class of sign data corresponding to a single examination item, or several classes of sign data corresponding to multiple examination items. For example, the original sign data may include the sign data produced by a urine routine test, a blood routine test, or biochemical analysis alone, or the sign data produced by both a urine routine test and a blood routine test.
For ease of understanding, take original sign data that includes the index values of a blood routine test as an example. Several pieces of original sign data, such as Data1, Data2, Data3, and Data4, may be obtained. Each piece of sign data may include blood routine index values such as mean hemoglobin amount, mean hemoglobin concentration, platelet distribution width, and red blood cell distribution width, as shown in Table 1. Note that mean hemoglobin amount, mean hemoglobin concentration, platelet distribution width, red blood cell distribution width, and so on are index items, and the specific numerical value of an index item is an index value in the original sign data.
Table 1: Original sign data
It can be understood that each piece of original sign data produced by a medical examination may include numeric index values, for example a mean hemoglobin amount of 30.2 pg or a platelet distribution width of 11.3 fl. It may also include non-numeric index values, for example the urine protein and urine glucose results in a urine routine test, which characterize the patient's signs as negative or positive.
The embodiments of the present application do not limit which index values the original sign data includes; they can be selected according to the actual situation.
S102: Find, in the original sign data, the sign data with missing index values.
In this embodiment, after the original sign data is obtained, each piece of original sign data needs to be checked to find the sign data with missing index values, so that S103 can be executed on it. To find the sign data with missing index values, first collect the index items corresponding to the index values in all of the original sign data, which determines the full set of index items. If a piece of original sign data lacks any index item in that set, it is sign data with missing index values. For example, suppose original sign data A has index values for index items 1, 2, and 3, original sign data B has index values for index items 2, 3, and 4, and original sign data C has index values for index items 3, 4, and 5. Then A, B, and C are all sign data with missing index values: A lacks the index values of index items 4 and 5, B lacks those of index items 1 and 5, and C lacks those of index items 1 and 2.
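The lookup in S102 can be sketched as a set difference against the union of all index items. This is a minimal sketch: the dictionary layout, the function name `find_missing`, and the numeric values are assumptions made for illustration; only the index item numbers for A, B, and C come from the example above.

```python
def find_missing(records):
    """Return {record name: sorted list of index items that record lacks}."""
    # The full set of index items is the union over all records.
    all_items = set()
    for values in records.values():
        all_items.update(values)
    missing = {}
    for name, values in records.items():
        absent = sorted(all_items - set(values))
        if absent:
            missing[name] = absent
    return missing

# Index items 1-5 as in the example; the values themselves are hypothetical.
records = {
    "A": {1: 0.1, 2: 0.2, 3: 0.3},
    "B": {2: 0.2, 3: 0.3, 4: 0.4},
    "C": {3: 0.3, 4: 0.4, 5: 0.5},
}
missing = find_missing(records)
print(missing)  # {'A': [4, 5], 'B': [1, 5], 'C': [1, 2]}
```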
S103: Fill in the missing index values in the sign data with missing index values to generate filled sign data.
In practical applications, training on sign data with missing index values would make the resulting sign classification model inaccurate. To avoid this, when S102 finds sign data with missing index values, that data can be filled in: the missing index values are supplemented to obtain a complete piece of original sign data, and S104 is then executed. The specific way of filling in sign data with missing index values is described in detail in a subsequent embodiment.
S104: Use the sign data without missing index values in the original sign data and the filled sign data as training data, and train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In this embodiment, S103 fills in the sign data with missing index values to obtain filled sign data. The sign data without missing index values and the filled sign data are then used together as training data, and the initial classification model is trained on the training data and the data classification label corresponding to each piece of training data to obtain the sign data classification model.
In a specific application, each piece of obtained original sign data can be classified in advance and assigned a data classification label according to the classification result, so that when the original sign data is used as training data, training is performed on the training data and its corresponding data classification labels to generate the sign data classification model.
The data classification label characterizes the patient constitution corresponding to each piece of sign data. Different patients may have different constitutions, and the sign data produced by a medical examination may differ between constitutions. In a specific implementation, data classification labels can be identified by different characters; for example, label 1 identifies constitution 1, label 2 identifies constitution 2, label 3 identifies constitution 3, and so on.
As the above embodiment shows, after obtaining the original sign data, the embodiments of the present application fill in the missing index values so that every piece of data is complete, and use the result as training data. The initial classification model is trained on the training data and its classification labels to generate a sign data classification model. The generated model can classify any sign data, and the classification results can assist doctors in diagnosis. The embodiments thus mine the internal relationships in a large volume of original sign data to build a sign data classification model, improving the utilization of the original sign data.
In one possible implementation of the embodiments of the present application, the initial classification model may be a naive Bayes model or a decision tree model. The following describes, for each in turn, how the naive Bayes model or decision tree model is trained according to the training data and the data classification label corresponding to each piece of training data.
1. Naive Bayes model
In this embodiment, naive Bayes theory computes the probability that one event occurs from the probability of an event that has already occurred. Its mathematical expression is given in formula (1):
$$P(Y \mid X) = \frac{P(X \mid Y)\,P(Y)}{P(X)} \tag{1}$$
where P(Y) is the prior probability of event Y, and P(Y | X) is the posterior probability of event Y, that is, the probability that event Y occurs given that event X has occurred.
On this basis, in the practical application of the present application, Y denotes the data classification label and X denotes the training data. Suppose there are 4 data classification labels, y1, y2, y3, and y4, and 5 pieces of original sign data are obtained, each including 4 index values, with classification labels y1, y2, y3, y3, and y4 respectively. The 1st piece of sign data is [x1 x2 x3 x4] with data classification label y1; the 2nd is [x5 x6 x7 x8] with label y2; the 3rd is [x9 x10 x11 x12] with label y3; the 4th is [x13 x14 x15 x16] with label y3; and the 5th is [x17 x18 x19 x20] with label y4.
For ease of understanding, let X1 be the 1st piece of sign data, X2 the 2nd, X3 the 3rd, X4 the 4th, and X5 the 5th. The purpose of training in this embodiment is to calculate P(y1 | X1), P(y2 | X2), P(y3 | X3), P(y3 | X4), and P(y4 | X5). The calculation is given in formula (2):
$$P(y_k \mid X_i) = \frac{P(X_i \mid y_k)\,P(y_k)}{P(X_i)} \tag{2}$$
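Assuming that the index values in Xi are mutually independent, formula (2) can be written as:
$$P(y_k \mid X_i) = \frac{P(x_a \mid y_k)\,P(x_b \mid y_k)\,P(x_c \mid y_k)\,P(x_d \mid y_k)\,P(y_k)}{P(X_i)} \tag{3}$$
where the specific values of xa, xb, xc, and xd depend on Xi. For example, when Xi = X1 they are x1, x2, x3, and x4 respectively, and formula (3) becomes:
$$P(y_k \mid X_1) = \frac{P(x_1 \mid y_k)\,P(x_2 \mid y_k)\,P(x_3 \mid y_k)\,P(x_4 \mid y_k)\,P(y_k)}{P(X_1)}$$
Since the denominator depends only on the input data and is constant across labels, it can be dropped, and formula (3) becomes:
$$P(y_k \mid X_i) \propto P(y_k)\,P(x_a \mid y_k)\,P(x_b \mid y_k)\,P(x_c \mid y_k)\,P(x_d \mid y_k) \tag{4}$$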
In practical applications, the probability can be calculated for every possible value of the data classification label Y, and the result with the highest probability is selected as the output. That is, for data X1, the probabilities for y1, y2, y3, and y4 are calculated, and the classification label with the highest probability is taken as the data classification label of X1:
$$\hat{y} = \arg\max_{y_k} \; P(y_k)\,P(x_1 \mid y_k)\,P(x_2 \mid y_k)\,P(x_3 \mid y_k)\,P(x_4 \mid y_k)$$
Taking P(y1 | X1) as an example, formula (4) becomes:
$$P(y_1 \mid X_1) \propto P(y_1)\,P(x_1 \mid y_1)\,P(x_2 \mid y_1)\,P(x_3 \mid y_1)\,P(x_4 \mid y_1)$$
As the above formula shows, obtaining P(y1 | X1) requires knowing P(Y = y1) and P(X1 | y1). How to obtain these probability values is described below.
(1) If the prior probability P(Y = y1) is not available, it is obtained as P(Y = y_k) = m_k / m, where m_k is the number of pieces of sign data whose data classification label is y_k, and m is the number of data classification labels over all obtained sign data, that is, the number of pieces of sign data obtained.
(2) When obtaining P(X1 | y1), the type of the sign data X1 must be distinguished. When the sign data takes discrete values, P(X1 | y1) is obtained using the following formula:
where xj is an index value in sign data Xi, m_k is the number of pieces of sign data whose data classification label is y_k, n is the number of index values included in each piece of sign data, and δ is a preset positive integer.
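When the sign data takes continuous values, P(X1 | y1) is obtained using the following formula:
$$P(x_j \mid y_k) = \frac{1}{\sqrt{2\pi\sigma_k^2}} \exp\!\left(-\frac{(x_j - \mu_k)^2}{2\sigma_k^2}\right)$$
where μ_k and σ_k² are respectively the mean and variance of all Xi for which Y = y_k.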
P(y_k | Xi) is obtained through the above calculation, which completes the training of the naive Bayes model and generates the sign data classification model.
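The training and classification steps above can be sketched for continuous index values as follows. This is a minimal sketch under stated assumptions rather than the patent's implementation: the function names, the per-index mean and variance estimates, the small variance floor, the use of log probabilities, and the sample values are all choices made for this illustration.

```python
import math
from collections import defaultdict

def train(samples, labels):
    """Estimate the prior P(y_k) = m_k / m and, for every label,
    the mean and variance of each index value."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    m = len(samples)
    model = {}
    for y, rows in grouped.items():
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        # Assumed floor: keeps the variance positive when a label has one sample.
        variances = [max(sum((v - mu) ** 2 for v in c) / len(c), 1e-6)
                     for c, mu in zip(cols, means)]
        model[y] = (len(rows) / m, means, variances)
    return model

def classify(model, x):
    """Return the label maximising log P(y_k) + sum_j log P(x_j | y_k)."""
    def score(y):
        prior, means, variances = model[y]
        log_lik = sum(-0.5 * math.log(2 * math.pi * var) - (v - mu) ** 2 / (2 * var)
                      for v, mu, var in zip(x, means, variances))
        return math.log(prior) + log_lik
    return max(model, key=score)

# Hypothetical normalized training data with two labels.
samples = [[0.1, 0.2], [0.2, 0.1], [0.8, 0.9], [0.9, 0.8]]
labels = ["y1", "y1", "y2", "y2"]
model = train(samples, labels)
print(classify(model, [0.15, 0.15]))  # y1
print(classify(model, [0.85, 0.85]))  # y2
```

Working with log probabilities avoids numerical underflow when many index values are multiplied together; it does not change which label attains the maximum.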
2. Decision tree model
A decision tree, also called a classification tree, is a common classification method. Its basic principle is to take a large number of training samples as input, where each training sample has attribute values and a predetermined class. Through learning, the decision tree yields a classifier that can correctly classify newly input data.
For ease of understanding, take a binary decision tree as an example, and assume the obtained original sign data is as shown in Table 2. Four pieces of original sign data are obtained in total, each including 4 index values. The table only illustrates how the decision tree model is trained and places no limitation on the original sign data obtained.
Table 2: Decision tree training data
As Table 2 shows, each piece of sign data obtained has four index values and an already determined data classification label. The training process can then be as follows:
(1) Judge whether the mean hemoglobin amount is within threshold range A. If so, determine that the TCM constitution classification corresponding to this piece of sign data is y1; if not, go to (2);
(2) Judge whether the mean hemoglobin concentration is within threshold range B. If so, determine that the TCM constitution classification corresponding to this piece of sign data is y2; if not, go to (3);
(3) Judge whether the platelet distribution width is within threshold range C. If so, determine that the TCM constitution classification corresponding to this piece of sign data is y3; if not, go to (4);
(4) Judge whether the red blood cell distribution width is within threshold range D. If so, determine that the TCM constitution classification corresponding to this piece of sign data is y4; if not, the data can be marked as "other" to distinguish it from the four classifications above.
The specific settings of A, B, C, and D can be chosen with reference to the index values in the obtained original sign data. After the above learning and training, the sign data classification model is generated.
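The four judgments above form a cascade of threshold tests, which can be sketched as follows. This is a minimal illustration: the index item keys, the function name `classify_record`, and the numeric ranges chosen for A, B, C, and D are hypothetical, since the source does not give concrete thresholds.

```python
def classify_record(record, rules, default="other"):
    """Walk the rules in order and return the label of the first threshold
    range that contains the record's value for that index item."""
    for item, (low, high), label in rules:
        if low <= record[item] <= high:
            return label
    return default

# Hypothetical threshold ranges A-D on normalized index values.
rules = [
    ("mean_hemoglobin_amount", (0.0, 0.25), "y1"),        # range A
    ("mean_hemoglobin_concentration", (0.0, 0.25), "y2"),  # range B
    ("platelet_distribution_width", (0.0, 0.25), "y3"),    # range C
    ("red_cell_distribution_width", (0.0, 0.25), "y4"),    # range D
]

record = {
    "mean_hemoglobin_amount": 0.6,
    "mean_hemoglobin_concentration": 0.1,
    "platelet_distribution_width": 0.9,
    "red_cell_distribution_width": 0.5,
}
print(classify_record(record, rules))  # y2
```

Keeping the rules in an ordered list makes it straightforward to change which index item serves as the first judgment condition.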
It should be noted that the above training process uses the mean hemoglobin amount as the first judgment condition. The mean hemoglobin concentration or the platelet distribution width could equally be used as the first judgment condition, or the mean hemoglobin concentration and the platelet distribution width could be used together as the first judgment condition. This embodiment places no limitation on this.
Through the above embodiment, the training data can be used to train either of the two initial classification models, so that a sign data classification model can be generated quickly and then used to classify the sign data to be classified.
In the original sign data shown in Tables 1 and 2, the index values contained in each piece of original sign data have different dimensions (units). For example, the average hemoglobin amount is measured in pg and the mean hemoglobin concentration in g/L, so the two index values fall in different orders of magnitude, which hinders the subsequent training of the initial classification model. Therefore, in the embodiments of the present application, after the original sign data is obtained, the index values in the original sign data are first normalized, so that data of different dimensions are converted to a common scale and the inconvenience caused by differing dimensions is eliminated. Meanwhile, because the sign data with missing index values needs to be filled, and in order to make the filled data more accurate, in some possible implementations the index values in each piece of original sign data are normalized before data filling is performed on the missing index values.
In this example, normalization is performed on the index values corresponding to the same index item in the original sign data. The index item may be, for example, the average hemoglobin amount, mean hemoglobin concentration, platelet distribution width or erythrocyte distribution width.
In a specific implementation, the index values in the original sign data may be normalized using the 0-1 standardization method. 0-1 standardization, also known as deviation standardization, applies a linear transformation to the original data so that the result falls in the interval [0, 1]. The transfer function is:
$$x^{*} = \frac{x - \min}{\max - \min}$$
where x is an index value corresponding to a given index item, x* is the normalized value, max is the maximum value of that index item over all the original sign data, and min is the minimum value of that index item over all the original sign data.
Taking the mean hemoglobin concentration as an example, max is 345 and min is 320. After conversion by the above transfer function, the mean hemoglobin concentration is normalized to 0.76 in Data1, 0 in Data2, 0.6 in Data3 and 1 in Data4.
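The worked example can be reproduced with a short sketch of 0-1 standardization. The raw concentrations below are not copied from Table 1; they are back-computed from the stated normalized results (0.76, 0, 0.6, 1) with max = 345 and min = 320, so they should be read as an assumption.

```python
def min_max_normalize(values):
    """Linearly rescale values to [0, 1] using (x - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column is mapped to 0 to avoid division by zero.
        return [0.0 for _ in values]
    return [(x - lo) / (hi - lo) for x in values]


# Mean hemoglobin concentration (g/L) for Data1..Data4, back-computed values.
mchc = [339, 320, 335, 345]
normalized = min_max_normalize(mchc)
```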
It should be noted that other standardization methods may also be used for normalization, such as min-max standardization; the embodiments of the present application place no limitation on the specific manner of normalization.
In addition, normalization may also be performed when the index value corresponding to an index item is of a non-numeric type. In implementation, a numeric value may first be assigned to the non-numeric value, and the assigned index value is then normalized. If the index value corresponding to a certain index item has only two possible results, for example negative or positive, positive may be set to 1 and negative to 0, and no subsequent normalization is needed.
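A minimal sketch of assigning values to non-numeric index values follows. The binary mapping is the one described above; the ordinal helper and its level names are purely illustrative assumptions, since the text does not say how multi-valued results are assigned.

```python
BINARY_CODES = {"negative": 0, "positive": 1}


def encode_binary(values):
    """Map a two-valued result directly to 0/1; no further normalization is needed."""
    return [BINARY_CODES[v] for v in values]


def encode_ordinal(values, levels):
    """Assign 0..k-1 by position in `levels` (assumed ordering), then rescale to [0, 1]."""
    top = len(levels) - 1
    codes = [levels.index(v) for v in values]
    return [c / top for c in codes] if top > 0 else [0.0 for _ in codes]
```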
With this implementation, a normalization algorithm can be used to normalize the index values in the original sign data so that every index value lies in the interval [0, 1], which facilitates subsequent classification and improves processing speed.
As described in the above embodiment, data filling needs to be performed on the missing index values in the sign data with missing index values. The filling algorithm provided by the embodiments of the present application is described below with reference to the accompanying drawings.
Referring to Fig. 2, which shows a flowchart of a data filling method provided by an embodiment of the present application, the method may include:
S201: For any piece of sign data with a missing index value, determine the index item corresponding to the index value missing from that piece of sign data.
In this embodiment, the sign data with missing index values can be found through S102. For any piece of sign data with a missing index value, the index item corresponding to the missing index value needs to be determined; the index item may be the name of the examination item corresponding to the index value. For example, assuming that in Table 1 Data2 lacks the index value corresponding to erythrocyte distribution width, it can be determined that the index item corresponding to the missing index value in Data2 is erythrocyte distribution width; if Data3 lacks the index value corresponding to platelet distribution width, it can be determined that the index item corresponding to the missing index value in Data3 is platelet distribution width.
S202: According to the index values of that index item in the other original sign data, generate multiple data filling results for the index item using multiple data filling algorithms.
After S201 has determined the index item of the missing index value in each piece of sign data with a missing index value, data filling is performed using the index values of that index item in the other original sign data that do not lack it. For example, in Table 1, the index item corresponding to the missing index value in Data2 is erythrocyte distribution width, and Data1, Data3 and Data4 do not lack the index value of this index item, so the erythrocyte distribution width values in Data1, Data3 and Data4 can be used for data filling. Likewise, the index item corresponding to the missing index value in Data3 is platelet distribution width, and Data1, Data2 and Data4 do not lack the index value of this index item, so the platelet distribution width values in Data1, Data2 and Data4 can be used for filling. In a specific implementation, to ensure the accuracy of the filling result, multiple data filling algorithms may be used; each data filling algorithm generates one data filling result for the index item, so that multiple data filling results are obtained.
It should be noted that when the missing index value is non-numeric, after the index item corresponding to the missing index value is determined through S201, numeric values may be assigned to the index values of that index item in the other original sign data, and data filling is then performed according to the assigned index values.
In an optional implementation, the data filling algorithms may include any plurality of the maximum likelihood estimation method, the average value filling method and the approximate filling method. That is, when data filling is performed, any two of these filling algorithms may be selected, generating two data filling results; alternatively, all three filling algorithms may be selected, generating three data filling results.
The maximum likelihood estimation method is a statistical method built on the maximum likelihood principle; it provides a way to estimate model parameters from given observation data. In this embodiment, when the missingness type is missing at random, and assuming the model is correct for the complete sample, the unknown parameters can be estimated by maximum likelihood from the marginal distribution of the observed data. In this case, the calculation method used for maximum likelihood parameter estimation is the expectation-maximization (EM) algorithm, an iterative algorithm for computing maximum likelihood estimates or posterior distributions from incomplete data. Each iteration alternately executes two steps:
(1) given the observed data and the parameter estimate obtained in the previous iteration, compute the conditional expectation of the log-likelihood function of the complete data;
(2) determine the parameter values by maximizing this expected log-likelihood function, and use them in the next iteration.
The EM algorithm iterates between these two steps until convergence, that is, the iteration ends when the change in the parameters between two successive iterations is less than a preset threshold.
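The two alternating steps can be illustrated with a deliberately simple model. The sketch below assumes a single index item that follows a normal distribution with some entries missing (marked `None`); the choice of model, the starting point and the tolerance are assumptions made for illustration and are not specified by the text.

```python
def em_gaussian(values, tol=1e-8, max_iter=200):
    """EM estimate of (mean, variance) for a normal model with missing entries (None).

    The estimated mean can serve as the maximum likelihood fill value.
    """
    observed = [v for v in values if v is not None]
    n = len(values)
    n_missing = n - len(observed)
    mean, var = observed[0], 1.0  # arbitrary starting point
    for _ in range(max_iter):
        # E-step: expected sufficient statistics of the complete data.
        # Each missing entry contributes E[x] = mean and E[x^2] = mean^2 + var.
        sum_x = sum(observed) + n_missing * mean
        sum_x2 = sum(v * v for v in observed) + n_missing * (mean * mean + var)
        # M-step: parameters maximizing the expected complete-data log-likelihood.
        new_mean = sum_x / n
        new_var = sum_x2 / n - new_mean * new_mean
        converged = abs(new_mean - mean) < tol and abs(new_var - var) < tol
        mean, var = new_mean, new_var
        if converged:
            break  # parameter change fell below the preset threshold
    return mean, var
```

For this one-dimensional model the iteration converges to the mean and variance of the observed entries, so the example mainly shows the structure of the E-step and M-step rather than a gain over direct estimation.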
The average value filling method may first divide the attributes of the obtained original sign data into numeric attributes and non-numeric attributes. When the missing index value is a numeric attribute, it is filled with the average of the index values of that index item in the other original sign data. When the missing index value is a non-numeric attribute, then according to the mode principle in statistics, the value that occurs most frequently among the index values of that index item in the other original sign data is found and used to fill the missing index value.
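A minimal sketch of the average value filling method, assuming missing entries are marked `None` and that at least one value of the index item is present in the other records:

```python
from collections import Counter
from statistics import mean


def average_fill_value(column):
    """Return the fill value for one index item (missing entries are None)."""
    present = [v for v in column if v is not None]
    if all(isinstance(v, (int, float)) for v in present):
        # Numeric attribute: average of the values in the other records.
        return mean(present)
    # Non-numeric attribute: most frequent value (mode principle).
    return Counter(present).most_common(1)[0][0]
```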
The approximate filling method searches the other original sign data for the piece of sign data most similar to the sign data with the missing index value, and then fills the missing index value with the index value of that index item in the most similar sign data found. Here, two pieces of sign data may be regarded as most similar when their data classification labels are the same, or when the differences between their index values for the other index items are within a preset threshold range.
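The approximate filling method can be sketched as a nearest-record lookup. The similarity measure used here (sum of absolute differences over the other index items, on normalized values) is an assumption; the text only requires that the most similar record be found.

```python
def approximate_fill(target, others, missing_item):
    """Fill target[missing_item] from the most similar record that has that item."""
    shared = [k for k in target if k != missing_item and target[k] is not None]
    candidates = [r for r in others if r.get(missing_item) is not None]

    def distance(record):
        # Assumed similarity: total absolute difference on the other index items.
        return sum(abs(record[k] - target[k]) for k in shared)

    best = min(candidates, key=distance)
    return best[missing_item]
```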
S203: Calculate the average of the multiple data filling results, fill the missing index value of the index item in the sign data with the missing index value with that average, and generate the filled sign data.
Multiple data filling results are obtained through S202. To improve the accuracy of the finally filled index value, the multiple data filling results obtained may be averaged, and the average is used as the missing index value in the sign data with the missing index value, thereby generating the filled sign data.
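Step S203 amounts to averaging the candidate values and writing the result back into the record. In this sketch the record layout (a dictionary keyed by index item) and the field names are assumptions.

```python
def fill_with_average(record, missing_item, filling_results):
    """Return a copy of the record whose missing item is the mean of the filling results."""
    filled = dict(record)
    filled[missing_item] = sum(filling_results) / len(filling_results)
    return filled
```

For example, if three filling algorithms returned 0.5, 0.75 and 0.25 for the missing item, the filled value would be their mean.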
The filling algorithm provided by this embodiment can quickly and accurately generate the required filling data, so that sign data with missing index values is converted into complete sign data, which in turn provides complete training samples for subsequent training and improves training accuracy.
To facilitate understanding of the training process of the sign data classification model in the present application, refer to Fig. 3, which shows a flowchart of sign data classification model training provided by an embodiment of the present application. As shown in Fig. 3, in the training process, the original sign data is first obtained; the original sign data is then checked for duplicates and duplicate sign data is removed to reduce redundancy; the deduplicated sign data is then normalized to obtain normalized sign data; sign data with missing index values is searched for in the normalized sign data, and multiple data filling algorithms are used to fill that sign data, generating the filled sign data; finally, the sign data without missing index values in the original sign data and the filled sign data are used as training data to train the initial classification model and generate the sign data classification model.
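The preprocessing stages of Fig. 3 (duplicate removal, normalization, and locating records with missing index values) can be sketched as below. Filling and model training are omitted. The record layout, the use of `None` for a missing value, and the assumption that every index item has at least one present value are all illustrative.

```python
def preprocess(records):
    """Deduplicate, min-max normalize per index item, and split by completeness."""
    # 1. Duplicate check: keep the first occurrence of each identical record.
    seen, unique = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key not in seen:
            seen.add(key)
            unique.append(r)

    # 2. Per-item bounds, ignoring missing values.
    items = list(unique[0].keys())
    bounds = {}
    for k in items:
        present = [r[k] for r in unique if r[k] is not None]
        bounds[k] = (min(present), max(present))

    def scale(k, x):
        if x is None:
            return None  # leave missing values for the filling step
        lo, hi = bounds[k]
        return (x - lo) / (hi - lo) if hi > lo else 0.0

    normalized = [{k: scale(k, r[k]) for k in items} for r in unique]

    # 3. Separate complete records from those with missing index values.
    complete = [r for r in normalized if None not in r.values()]
    incomplete = [r for r in normalized if None in r.values()]
    return complete, incomplete
```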
The above is a specific implementation of the classification model generation method provided by the embodiments of the present application. Based on the sign data classification model in the above embodiments, the embodiments of the present application also provide a data classification method.
Referring to Fig. 4, which shows a flowchart of a data classification method provided by an embodiment of the present application, the method may include:
S401: Obtain sign data to be classified, where the sign data to be classified includes at least one index value.
In this embodiment, when the classification result corresponding to a piece of sign data needs to be determined, the sign data to be classified is first obtained. The sign data may include one or more index values, such as a blood pressure value, blood glucose value or heart rate.
S402: Judge whether an index value is missing from the obtained sign data to be classified; if so, execute S403; if not, execute S404.
In this example, after the sign data to be classified is obtained, it needs to be checked to judge whether any index value is missing, so as to avoid inputting sign data with a missing index value into the sign data classification model and affecting the classification result. Therefore, when a missing index value exists in the obtained sign data to be classified, S403 is executed; if no index value is missing, S404 is executed.
S403: Perform data filling on the missing index value, input the filled sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified.
In this embodiment, when it is determined that an index value is missing from the obtained sign data to be classified, the missing index value is filled; the specific filling method is described later.
In a specific application, the filled sign data to be classified is input into the sign data classification model as input data, so that the model obtains a classification result from the input data; for example, the classification result may characterize the constitution of the patient to whom the sign data to be classified corresponds. The sign data classification model of this embodiment is the classification model generated by training in the above embodiments.
S404: Input the sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified.
After S402 determines that the sign data to be classified is complete, the data to be classified is input into the sign data classification model as input data, so that the model can judge the type of the sign data to be classified from the input data and obtain the classification result.
As can be seen from the above embodiment, the sign data to be classified is first obtained, and it is then judged whether an index value is missing from it. If so, the missing index value is filled and the filled sign data to be classified is input into the sign data classification model; if not, the sign data to be classified is input into the sign data classification model directly. The classification result of the sign data to be classified is thereby obtained, so that sign data to be classified is classified quickly; the classification result can assist a doctor in diagnosis and improves the utilization of the original sign data.
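The branch between S403 and S404 can be sketched as follows. Both `model` and `fill_missing` are hypothetical callables standing in for the trained sign data classification model and the filling procedure; neither name comes from the source.

```python
def classify_sign_data(record, model, fill_missing):
    """S401-S404: fill missing index values if there are any, then classify."""
    has_missing = any(v is None for v in record.values())  # S402
    if has_missing:
        record = fill_missing(record)                      # S403: fill first
    return model(record)                                   # S403/S404: classify
```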
In this embodiment, the index values in the sign data to be classified may also be normalized. In practical applications, the index values in the sign data to be classified may be normalized according to the index values of the original sign data, so that index values of different dimensions are converted to a common scale. For the specific implementation, refer to the normalization method for the index values of the original sign data, which is not repeated here.
It should be noted that in this embodiment, if the index values in the original sign data were normalized during generation of the sign data classification model, then the index values in the data to be classified also need to be normalized when the sign data classification model is used to classify it; if the index values of the original sign data were not normalized, the index values in the sign data to be classified need not be normalized. The input data is thereby kept consistent, ensuring that the classification model can correctly recognize the input data and guaranteeing the accuracy of the classification result.
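One way to keep the two stages consistent is to reuse the per-item minimum and maximum of the original sign data when normalizing the data to be classified. The sketch below assumes complete training records stored as dictionaries; the helper names are hypothetical.

```python
def fit_bounds(training_records):
    """Record the per-item (min, max) observed in the original sign data."""
    items = training_records[0].keys()
    return {
        k: (min(r[k] for r in training_records), max(r[k] for r in training_records))
        for k in items
    }


def normalize_with_bounds(record, bounds):
    """Scale a record to be classified with the training-time min and max."""
    out = {}
    for k, x in record.items():
        lo, hi = bounds[k]
        out[k] = (x - lo) / (hi - lo) if hi > lo else 0.0
    return out
```

Under this scheme a value outside the range seen in the original sign data maps outside [0, 1]; how to handle that case is not addressed in the text.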
For the case where an index value is missing from the obtained sign data to be classified, the embodiments of the present application provide a method for filling the missing index value, which is described below with reference to the accompanying drawings.
Referring to Fig. 5, which shows another data filling method provided by an embodiment of the present application, the method may include:
S501: If an index value is missing from the sign data to be classified, determine the index item corresponding to the missing index value.
In this embodiment, when S402 determines that an index value is missing from the sign data to be classified, the index item corresponding to that index value needs to be determined, so that the index item can be used for subsequent filling.
S502: According to the index values corresponding to that index item in the original sign data, generate multiple data filling results for the index item using multiple data filling algorithms.
After S501 determines the index item corresponding to the missing index value, multiple filling algorithms can be used to perform data filling. The data used for filling is the index values corresponding to that index item in the original sign data; these index values are used as the inputs of the multiple filling algorithms to obtain multiple data filling results for the index item. For example, if the average hemoglobin amount is missing from the obtained sign data to be classified, the average hemoglobin amount values of the four pieces of data in Table 1 can be used to obtain multiple data filling results through multiple filling algorithms.
The multiple filling algorithms may be any plurality of the maximum likelihood estimation method, the average value filling method and the approximate filling method. For the specific implementation of each algorithm, refer to the above embodiment, which is not repeated here.
S503: Calculate the average of the multiple data filling results, fill the missing index value of the index item in the sign data to be classified with that average, and obtain the filled sign data to be classified.
In this embodiment, multiple data filling results are obtained through S502. The multiple data filling results are summed and averaged, and the average is used as the missing index value, so that complete sign data to be classified is obtained. This sign data is then input into the sign data classification model as input data to obtain the classification result.
The filling algorithm provided by this embodiment can quickly and accurately fill data to be classified that has a missing index value, thereby ensuring the completeness of the sign data to be classified and improving the accuracy of the final classification.
To facilitate understanding of the classification process for sign data to be classified in the present application, refer to Fig. 6, which shows a flowchart of data classification provided by an embodiment of the present application. As shown in Fig. 6, in the data classification process, the sign data to be classified is first obtained and normalized, and it is then judged whether an index value is missing from it. If so, data filling is performed and the filled sign data to be classified is input into the sign data classification model; if not, the sign data to be classified is input into the sign data classification model directly. Finally, the classification result is output.
Based on the above method embodiments, the present application also provides a classification model generation device, which is described below with reference to the accompanying drawings.
Referring to Fig. 7, which shows a structural diagram of a classification model generation device provided by an embodiment of the present application, the device may include:
an acquiring unit 701, configured to obtain original sign data, each piece of original sign data including at least one index value;
a searching unit 702, configured to search the original sign data for sign data with missing index values;
a filling unit 703, configured to perform data filling on the missing index values in the sign data with missing index values, and generate filled sign data;
a generation unit 704, configured to use the sign data without missing index values in the original sign data and the filled sign data as training data, and to train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
In some possible implementations of the present application, the device further includes:
a normalization unit, configured to normalize the index values in each piece of original sign data before data filling is performed on the missing index values in the sign data with missing index values.
In some possible implementations of the present application, the filling unit specifically includes:
a determining subunit, configured to determine, for any piece of sign data with a missing index value, the index item corresponding to the index value missing from that piece of sign data;
a generating subunit, configured to generate multiple data filling results for the index item using multiple data filling algorithms, according to the index values of that index item in the other original sign data;
a filling subunit, configured to calculate the average of the multiple data filling results, fill the missing index value of the index item in the sign data with the missing index value with that average, and generate the filled sign data.
In some possible implementations of the present application, the data filling algorithms include any plurality of the maximum likelihood estimation method, the average value filling method and the approximate filling method.
In some possible implementations of the present application, the initial classification model is a naive Bayes model or a decision tree model.
As can be seen from the above embodiment, after obtaining the original sign data, the embodiment of the present application fills in the missing index values of the original sign data to generate training data, and trains the initial classification model using the training data and its classification labels to generate the sign data classification model. The generated model can classify any sign data, and the classification result can assist a doctor in diagnosis. In this way, for a large amount of original sign data, the embodiment mines the internal relationships in the data to establish the sign data classification model, improving the utilization of the original sign data.
Referring to Fig. 8, which shows a structural diagram of a data classification device provided by an embodiment of the present application, the device may include:
an acquiring unit 801, configured to obtain sign data to be classified, the sign data to be classified including at least one index value;
a first obtaining unit 802, configured to, if an index value is missing from the sign data to be classified, perform data filling on the missing index value, input the filled sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified;
a second obtaining unit 803, configured to, if no index value is missing from the sign data to be classified, input the sign data to be classified into the sign data classification model and obtain the classification result of the sign data to be classified;
where the sign data classification model is generated by the above classification model generation device.
In some possible implementations of the present application, the device further includes:
a normalization unit, configured to normalize the index values in the sign data to be classified.
In some possible implementations of the present application, the first obtaining unit specifically includes:
a determining subunit, configured to determine the index item corresponding to the index value missing from the sign data to be classified;
a generating subunit, configured to generate multiple data filling results for the index item using multiple data filling algorithms, according to the index values of that index item in the original sign data;
a filling subunit, configured to calculate the average of the multiple data filling results, fill the missing index value of the index item in the sign data to be classified with that average, and obtain the filled sign data to be classified.
In addition, the embodiments of the present application also provide a computer-readable storage medium in which instructions are stored; when the instructions are run on a terminal device, the terminal device is caused to execute the above classification model generation method or the above data classification method.
The embodiments of the present application also provide a computer program product which, when run on a terminal device, causes the terminal device to execute the above classification model generation method or the above data classification method.
As can be seen from the above embodiment, the sign data to be classified is first obtained, and it is then judged whether an index value is missing from it. If so, the missing index value is filled and the filled sign data to be classified is input into the sign data classification model; if not, the sign data to be classified is input into the sign data classification model directly. The classification result of the sign data to be classified is thereby obtained, so that sign data to be classified is classified quickly; the classification result can assist a doctor in diagnosis and improves the utilization of the original sign data.
It should be noted that the embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Since the system or device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
It should be understood that in this application, "at least one (item)" means one or more, and "multiple" means two or more. "And/or" describes an association between associated objects and indicates that three relationships may exist; for example, "A and/or B" may mean: only A exists, only B exists, or both A and B exist, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the objects before and after it. "At least one of the following (items)" or a similar expression refers to any combination of these items, including a single item or any combination of multiple items. For example, at least one of a, b or c may mean: a, b, c, "a and b", "a and c", "b and c", or "a and b and c", where a, b and c may each be single or multiple.
It should also be noted that, herein, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.
The steps of the method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The above description of the disclosed embodiments enables those skilled in the art to implement or use the present application. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present application. Therefore, the present application is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (10)
1. A classification model generation method, characterized in that the method comprises:
obtaining original sign data, each piece of original sign data including at least one index value;
searching the original sign data for sign data with missing index values;
performing data filling on the missing index values in the sign data with missing index values, to generate filled sign data;
using the sign data without missing index values in the original sign data and the filled sign data as training data, and training an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
2. The method according to claim 1, characterized in that the method further comprises:
before performing data filling on the missing index values in the sign data with missing index values, normalizing the index values in each piece of the original sign data.
3. The method according to claim 1 or 2, characterized in that filling in the missing index values in the sign data with missing index values, to generate the filled sign data, comprises:
For any piece of sign data with missing index values, determining the index item corresponding to the index value missing from that sign data;
Generating multiple data filling results for the index item using multiple data filling algorithms, according to the index values of that index item in the other original sign data;
Calculating the average of the multiple data filling results, and filling the missing index value of the index item in that sign data with the average, to generate the filled sign data.
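The three steps of claim 3 can be sketched as below. The filling algorithms are passed in as plain functions. The mean and median used in the example are placeholders chosen for brevity, not the algorithms named in the application, and the function name is hypothetical.

```python
from statistics import mean, median

def fill_record_by_averaging(records, row_idx, fillers):
    """Fill every missing index value of one record with the average of several filling results."""
    record = records[row_idx]
    # Step 1: index items (column positions) whose value is missing in this record.
    missing_items = [i for i, v in enumerate(record) if v is None]
    filled = list(record)
    for i in missing_items:
        # Step 2: one result per algorithm, computed from the same index item
        # in the other records (skipping records where it is also missing).
        others = [r[i] for k, r in enumerate(records) if k != row_idx and r[i] is not None]
        results = [fill(others) for fill in fillers]
        # Step 3: the filled value is the average of the individual results.
        filled[i] = mean(results)
    return filled

# Hypothetical sign data: [body temperature, heart rate]
records = [[36.5, 70.0], [36.7, None], [39.0, 110.0], [37.0, 75.0]]
filled = fill_record_by_averaging(records, 1, [mean, median])
```

Here the mean of the other heart-rate values is 85.0 and the median is 75.0, so the missing value is filled with 80.0. Averaging several estimates in this way dampens the bias of any single filling algorithm.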
4. The method according to claim 3, characterized in that the data filling algorithms include any several of the maximum likelihood estimation method, the mean filling method, and the approximate filling method.
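The claim names the three algorithms without defining them, so the sketch below is one plausible reading rather than the application's definition. It assumes that each record has a second, observed index item `x` that can help estimate the missing item `y`. Maximum likelihood filling is shown under an assumed bivariate Gaussian model, and "approximate filling" is interpreted as copying the value from the most similar record. All function names are hypothetical.

```python
from statistics import mean

def mean_fill(xs, ys, x_new):
    """Mean filling: use the mean of the observed values of the index item."""
    return mean(ys)

def gaussian_mle_fill(xs, ys, x_new):
    """Maximum likelihood filling under an ASSUMED bivariate Gaussian model.

    The ML estimates of the means, variance and covariance give the
    conditional mean E[y | x] = my + (cov_xy / var_x) * (x - mx).
    """
    mx, my = mean(xs), mean(ys)
    var_x = mean((x - mx) ** 2 for x in xs)
    cov_xy = mean((x - mx) * (y - my) for x, y in zip(xs, ys))
    if var_x == 0:
        return my  # x carries no information, fall back to the mean
    return my + cov_xy / var_x * (x_new - mx)

def nearest_fill(xs, ys, x_new):
    """Approximate filling (ASSUMED meaning): copy y from the record with the closest x."""
    closest = min(range(len(xs)), key=lambda k: abs(xs[k] - x_new))
    return ys[closest]

# Complete records supply (x, y) pairs; the incomplete record has x = 2.4 and y missing.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [10.0, 20.0, 30.0, 40.0]
results = [f(xs, ys, 2.4) for f in (mean_fill, gaussian_mle_fill, nearest_fill)]
combined = mean(results)  # averaged as in claim 3
```

The three estimates differ (about 25, 24 and 20 for this toy data), which is why combining them by averaging can give a more balanced filled value than any one of them alone.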
5. The method according to claim 1, characterized in that the initial classification model uses a naive Bayes model or a decision tree model.
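As an illustration of the first option, the following is a minimal Gaussian naive Bayes classifier written with the standard library only. The Gaussian likelihood, the variance floor and the function names are assumptions made for this sketch. The claim itself only says "naive Bayes model".

```python
import math
from collections import defaultdict
from statistics import mean

def train_gaussian_nb(rows, labels, var_floor=1e-9):
    """Estimate a class prior and a per-index-item mean and variance for each label."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        columns = list(zip(*members))
        means = [mean(c) for c in columns]
        # The floor keeps the variance positive when a column is constant within a class.
        variances = [max(mean((v - m) ** 2 for v in c), var_floor)
                     for c, m in zip(columns, means)]
        model[label] = (len(members) / len(rows), means, variances)
    return model

def predict_gaussian_nb(model, row):
    """Return the label with the highest log-posterior for one record."""
    def log_posterior(label):
        prior, means, variances = model[label]
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return score
    return max(model, key=log_posterior)

# Hypothetical complete training data: [body temperature, heart rate]
rows = [[36.5, 70.0], [36.8, 75.0], [39.0, 110.0], [38.6, 105.0]]
labels = ["normal", "normal", "abnormal", "abnormal"]
nb_model = train_gaussian_nb(rows, labels)
```

Naive Bayes treats the index items as conditionally independent given the label, so it needs a value for every index item, which is one reason the missing values are filled before training.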
6. A data classification method, characterized in that the method comprises:
Obtaining sign data to be classified, the sign data to be classified including at least one index value;
If the sign data to be classified has a missing index value, filling in the missing index value, inputting the filled sign data to be classified into a sign data classification model, and obtaining the classification result of the sign data to be classified;
If the sign data to be classified has no missing index value, inputting the sign data to be classified into the sign data classification model, and obtaining the classification result of the sign data to be classified;
Wherein the sign data classification model is generated by the method for generating a classification model according to any one of claims 1-5.
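The two branches of claim 6 can be sketched as a single function. The claim does not say how the missing value is filled at classification time, so the sketch assumes a precomputed fill value per index item (for example, one derived from the training data). The threshold model is a hypothetical stand-in for the trained sign data classification model.

```python
def classify_sign_data(record, model_predict, fill_values):
    """Fill missing index values if there are any, then pass the record to the trained model."""
    if any(v is None for v in record):
        # Branch 1: the record has a missing index value, so fill it before classification.
        record = [fill_values[i] if v is None else v for i, v in enumerate(record)]
    # Branch 2 (and the end of branch 1): classify the complete record.
    return model_predict(record)

# Hypothetical stand-ins: per-item fill values and a trivial threshold model.
fill_values = [37.0, 80.0]  # [body temperature, heart rate]

def toy_model(record):
    return "abnormal" if record[0] >= 38.0 else "normal"

result_missing = classify_sign_data([39.2, None], toy_model, fill_values)
result_filled_temp = classify_sign_data([None, 72.0], toy_model, fill_values)
result_complete = classify_sign_data([36.4, 68.0], toy_model, fill_values)
```

Using the same filling scheme at training and classification time keeps the inputs seen by the model consistent with the data it was trained on.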
7. A classification model generating device, characterized in that the device comprises:
An acquiring unit, configured to obtain original sign data, each piece of original sign data including at least one index value;
A searching unit, configured to search the original sign data for sign data with missing index values;
A filling unit, configured to fill in the missing index values in the sign data with missing index values, to generate filled sign data;
A generating unit, configured to use the sign data without missing index values in the original sign data, together with the filled sign data, as training data, and to train an initial classification model according to the training data and the data classification label corresponding to each piece of training data, to generate a sign data classification model.
8. A data classification device, characterized in that the device comprises:
An acquiring unit, configured to obtain sign data to be classified, the sign data to be classified including at least one index value;
A first obtaining unit, configured to, if the sign data to be classified has a missing index value, fill in the missing index value, input the filled sign data to be classified into a sign data classification model, and obtain the classification result of the sign data to be classified;
A second obtaining unit, configured to, if the sign data to be classified has no missing index value, input the sign data to be classified into the sign data classification model, and obtain the classification result of the sign data to be classified;
Wherein the sign data classification model is generated by the classification model generating device according to claim 7.
9. A computer-readable storage medium, characterized in that instructions are stored in the computer-readable storage medium, and when the instructions are run on a terminal device, the terminal device is caused to perform the method for generating a classification model according to any one of claims 1-5 or the data classification method according to claim 6.
10. A computer program product, characterized in that, when the computer program product is run on a terminal device, the terminal device is caused to perform the method for generating a classification model according to any one of claims 1-5 or the data classification method according to claim 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810712862.5A CN109102896A (en) | 2018-06-29 | 2018-06-29 | A kind of method of generating classification model, data classification method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109102896A (en) | 2018-12-28 |
Family
ID=64845413
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810712862.5A Pending CN109102896A (en) | 2018-06-29 | 2018-06-29 | A kind of method of generating classification model, data classification method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109102896A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090319244A1 (en) * | 2002-10-24 | 2009-12-24 | Mike West | Binary prediction tree modeling with many predictors and its uses in clinical and genomic applications |
CN106156809A (en) * | 2015-04-24 | 2016-11-23 | 阿里巴巴集团控股有限公司 | For updating the method and device of disaggregated model |
CN107480721A (en) * | 2017-08-21 | 2017-12-15 | 上海中信信息发展股份有限公司 | A kind of ox only ill data analysing method and device |
CN107595243A (en) * | 2017-07-28 | 2018-01-19 | 深圳和而泰智能控制股份有限公司 | A kind of illness appraisal procedure and terminal device |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111597444A (en) * | 2020-05-13 | 2020-08-28 | 北京达佳互联信息技术有限公司 | Searching method, searching device, server and storage medium |
CN111597444B (en) * | 2020-05-13 | 2024-03-05 | 北京达佳互联信息技术有限公司 | Searching method, searching device, server and storage medium |
CN112052914A (en) * | 2020-09-29 | 2020-12-08 | 中国银行股份有限公司 | Classification model prediction method and device |
CN112052914B (en) * | 2020-09-29 | 2023-12-01 | 中国银行股份有限公司 | Classification model prediction method and device |
JP7623120B2 (en) | 2020-10-01 | 2025-01-28 | キヤノンメディカルシステムズ株式会社 | Medical Support System |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Iftikhar et al. | An evolution based hybrid approach for heart diseases classification and associated risk factors identification | |
Karthiga et al. | Early prediction of heart disease using decision tree algorithm | |
Khezri et al. | A fuzzy rule-based expert system for the prognosis of the risk of development of the breast cancer | |
CN110136836A (en) | A Disease Prediction Method Based on Cluster Analysis of Physical Examination Reports | |
Higa | Diagnosis of breast cancer using decision tree and artificial neural network algorithms | |
CN110767279A (en) | Method and system for missing data completion in electronic health record based on LSTM | |
Andreeva | Data modelling and specific rule generation via data mining techniques | |
CN110610766A (en) | Apparatus and storage medium for deriving probability of disease based on symptom feature weight | |
CN109102896A (en) | A kind of method of generating classification model, data classification method and device | |
Cahyani et al. | Increasing Accuracy of C4. 5 Algorithm by applying discretization and correlation-based feature selection for chronic kidney disease diagnosis | |
CN117393144A (en) | Prediction method and system for death risk of infant suffering from PICU sepsis | |
Priyadharsini et al. | Efficient thyroid disease prediction using features selection and meta-classifiers | |
Chandra et al. | Application of machine learning k-nearest neighbour algorithm to predict diabetes | |
Sudharson et al. | Performance analysis of enhanced adaboost framework in multifacet medical dataset. | |
US11961204B2 (en) | State visualization device, state visualization method, and state visualization program | |
Gulhane et al. | Fusion of Various Machine Learning Algorithms for Early Heart Attack Prediction | |
Azar et al. | Inductive learning based on rough set theory for medical decision making | |
Bindushree | Prediction of cardiovascular risk analysis and performance evaluation using various data mining techniques: A review | |
Christopher et al. | Knowledge-based systems and interestingness measures: Analysis with clinical datasets | |
CN114048320B (en) | A Multi-label International Classification of Diseases Training Method Based on Curriculum Learning | |
Magoev et al. | Application of clustering methods for detecting critical acute coronary syndrome patients | |
TWI599896B (en) | Multiple decision attribute selection and data discretization classification method | |
Setiawan et al. | ANALYSIS OF CLASSIFICATION OF LUNG CANCER USING THE DECISION TREE CLASSIFIER METHOD | |
CN111599427B (en) | Recommendation method and device for unified diagnosis, electronic equipment and storage medium | |
CN112686091B (en) | Two-step arrhythmia classification method based on deep neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20181228 |