CN103544392B - Medical gas identification method based on deep learning - Google Patents
Medical gas identification method based on deep learning
- Publication number
- CN103544392B (application CN201310503402.9A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Abstract
The invention discloses a medical gas identification method based on deep learning. The method takes the original frequency-response signal, applies a simple normalization to it, and feeds it into a stacked self-encoding (autoencoder) network; through layer-by-layer extraction, the network learns abstract features of the raw data, so that feature extraction, dimensionality reduction and drift suppression are handled internally and shielded from the outside. A classification layer is appended at the end of the network so that these features can enter a classifier directly. The training process is divided into two steps, pre-training and fine-tuning, which effectively improves the learning capacity of the network; once training is complete, a new sample fed into the network directly yields the predicted class. The method automatically extracts features that effectively discriminate medical gases, merges steps such as feature extraction, feature selection and drift suppression into one, greatly simplifies the complexity of traditional methods, and improves the efficiency of gas detection and identification.
Description
Technical Field
The invention belongs to the technical field of biomedicine, and particularly relates to a medical gas identification method.
Background
Machine olfaction is an artificial-intelligence system whose basic principle is as follows: odor molecules adsorbed by a sensor array generate electrical signals; various signal-processing techniques then extract features from these signals; and a computer pattern-recognition system makes a judgment, completing tasks such as gas identification and concentration measurement. The electronic nose system is a typical application of machine olfaction and plays a very important role in the medical field, for example in diagnosing certain diseases, identifying bacterial species in blood, and detecting gases harmful to the respiratory system.
Sensor-based gas detection and identification therefore have important applications in the medical field: electronic nose equipment can collect sample data from the oral cavity, the thoracic cavity and blood; various signal-processing techniques are used for analysis; and a computer pattern-recognition system can then make a judgment, completing tasks such as disease diagnosis, pathogen identification and drug-concentration determination.
A conventional gas detection and identification method generally comprises steps such as feature extraction and feature selection, and finally reaches the preset goal by means of classification, regression, clustering and the like. For devices that must be used over the long term, an effective sensor-drift compensation technique must also be applied to suppress the effects of drift. Because these conventional methods are complex and inefficient, medical applications often have to trade accuracy against real-time performance.
The data sampled by the sensors can be regarded as a time-series signal that is structurally complex, difficult to interpret and often high-dimensional. For better identification, it is usually necessary to design features based on various attributes of the signal, then perform feature selection, for example dimensionality reduction, and finally feed the result into a classification algorithm such as a support vector machine.
Sensor drift is the slow, random variation of a sensor's response over time. Because of this variation, the patterns the recognition system has learned so far no longer apply well to later test samples, so the accuracy of gas detection and identification gradually degrades. In medical applications, there are generally two measures for suppressing the influence of sensor drift. The first is to develop an effective drift-compensation technique; this process is usually independent of feature extraction, complicated to operate and very inefficient. The second, since drift over a short period is small, is to maintain and update the electronic nose device regularly so that the sampled data remain stable and reliable; but this undoubtedly increases cost considerably and shortens the service life of the device.
In fact, some well-designed features are very robust to drift. From this perspective, the two processes can be fused: simply extracting better features also suppresses sensor drift. Deep learning analyzes, learns and interprets data by building an artificial neural network with multiple hidden layers in imitation of the human brain; it can obtain highly abstract representations of the data and is good at discovering latent patterns, which makes it very suitable for this problem.
In the literature "m.trincolvilleli, s.coradeschi, a.loutfi, B.A method for effectively identifying pathogenic bacteria in a blood culture specimen is provided in P.thunberg, Direct identification of bacteria in blood culture using an electronic nose device, IEEE Trans biological Engineering57(12), 2884. sub.2890, 2010. the method comprises the steps of firstly obtaining sample data by sampling through an electronic nose device, then carrying out feature extraction and dimension reduction, and finally completing classification by using a support vector machine, wherein in a feature extraction part, two feature extraction methods of steady-state response and response derivative are adopted aiming at a signal overall waveform.
To obtain higher recognition accuracy on complicated problems, it is sometimes necessary to analyze the signal waveform more finely and extract higher-dimensional features. The literature "A. Vergara, S. Vembu, T. Ayhan, M. A. Ryan, M. L. Homer and R. Huerta, Chemical gas sensor drift compensation using classifier ensembles, Sensors and Actuators B: Chemical, vol. 166-167, pp. 320-329, May 2012" studies how to improve the recognition accuracy of gases such as ethanol under drift, and designs 8 different features.
Once the classification algorithm is fixed, the recognition accuracy for a gas depends only on the quality of the features. Compared with the raw frequency-response values of the original signal, well-designed features can greatly reduce dimensional redundancy while highlighting the differences between classes, and they generally yield better recognition accuracy.
However, manually designed features are usually tailored to specific applications (gas type, sensor type, external environment, etc.), so they are highly special-purpose and generalize poorly. Moreover, owing to the cross-sensitivity of the sensors, the extracted features are still high-dimensional, and an efficient dimensionality-reduction algorithm, such as PCA or LDA, must usually be sought. If the recognition accuracy achieved with existing features is unsatisfactory in a new application, better features have to be designed, which undoubtedly further increases the complexity of the task.
The most effective way to suppress drift is to compensate for drift by periodic recalibration, the general idea being to find a linear transformation to normalize the sensor response so that the classifier can be applied directly to the transformed data.
CN1514239A discloses a method for detecting and correcting gas-sensor drift. The method improves the sensitivity and accuracy of drift detection by combining principal component analysis with wavelet-transform techniques. A detected drifting sensor is corrected online with a correction method based on an adaptive drift model, and the drift model itself can be updated online, thereby improving the reliability of the sensor system and prolonging its service life.
In the literature "T.I.P.M.Holmberg, "Drift correction for gas sensors using multivariable methods," j.chemim., vol.14, No. 5-6, pp.711-723,2000, "approximate the Drift direction with a reference gas, and then modify the response of the gas to be analyzed as follows.
However, these methods assume that the drift law of the sensor is linear, which has not been proven, and they often require a reference gas whose chemical properties are stable over time and whose sensor behavior is highly similar to that of the gas to be analyzed, which is quite demanding in practice. In addition, these methods are complicated and inefficient in practical applications.
Disclosure of Invention
The invention aims to reduce the complexity of traditional gas detection and identification methods and to develop a gas detection and identification method that is simpler, more efficient and more robust to sensor drift.
The method takes the original frequency-response signal, applies a simple normalization to it, and then feeds it into a stacked self-encoding network; through layer-by-layer extraction, the network finally learns abstract features of the raw data and shields the processes of feature extraction, dimensionality reduction and drift suppression from the outside. A classification layer is appended at the end of the network so that these features can enter a classifier directly. The training process is divided into pre-training and fine-tuning, which effectively improves the learning capacity of the network; after training, a new sample fed into the network directly yields the predicted class.
The technical solution of the invention is as follows: a medical gas identification method based on deep learning, comprising the following specific steps:
Step 1, data normalization. Suppose there are m samples, each organized in the form v = [s_1, s_2, ..., s_t], where s_i is the ith frequency response value and there are t response values in total. The whole gas data set and the corresponding labels can then be represented as:

V = [v_1, v_2, ..., v_i, ..., v_m], Y = [y_1, y_2, ..., y_i, ..., y_m]^T

where T denotes the transpose of the vector, the ith column v_i of the matrix V represents the ith sample, and the ith element of Y is the class label of the corresponding sample;

the data set is normalized to [0, 1] using the formula

A^(0)_{i,j} = L + (U − L) (V_{i,j} − min_i) / (max_i − min_i)

where V_{i,j} denotes the ith frequency response value of the jth sample, L is the normalization lower bound 0, U is the normalization upper bound 1, and max_i and min_i are the maximum and minimum values of the ith row of the matrix; the normalized data set is denoted a^(0);
Step 2, pre-train the stacked self-encoding network, where v, h and y denote the input layer, the hidden layers and the output layer respectively, W^(i) is the weight matrix connecting adjacent layers, and b^(i) is the bias vector of a hidden layer;

Step 2.1, train the first layer, i.e. the first autoencoder, whose objective function is

J(W, b) = (1/m) Σ_{i=1}^{m} (1/2) ||v̂_i − v_i||² + (λ/2) Σ_{i,j} (W_{ij})² + β Σ_{j} KL(ρ || ρ̂_j)

where the first term is the reconstruction error, measuring the degree of difference between input and output: v_i is the ith input sample normalized in step 1, and v̂_i is the output of sample v_i at the output layer after passing through the network; the second term is the weight-decay term, which reduces the magnitude of the weights to prevent overfitting, W_{ij} being the weight between the jth unit of the current layer and the ith unit of the next layer; the third term is a sparsity penalty with KL(ρ || ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j)), where ρ̂_j denotes the average excitation of hidden unit j, and λ, β and ρ are preset parameters;
Optimize the objective function; for an n-layer autoencoder the specific optimization steps are as follows:

Step 2.1.1. Randomly initialize the parameters W^(i) and b^(i), and initialize all-zero accumulators ΔW^(i) = 0, Δb^(i) = 0;

Step 2.1.2. Let ΔW^(i) = 0 and Δb^(i) = 0; for each sample v, compute the partial derivatives ∇_{W^(i)} J(W, b; v) and ∇_{b^(i)} J(W, b; v) using the back-propagation algorithm, as follows:

feedforward to obtain each layer's excitation a^(i) by the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)), where σ(x) = 1/(1 + e^(−x)) is the sigmoid function with output range [0, 1];

for the output layer, compute the residual δ^(n) = −(v − a^(n)) ⊙ σ′(z^(n)), where ⊙ denotes the element-wise product of vectors, z^(n) = W^(n−1) a^(n−1) + b^(n−1), and σ′ denotes the derivative of σ(x);

for each layer l = n−1, n−2, ..., 2, compute δ^(l) = ((W^(l))^T δ^(l+1)) ⊙ σ′(z^(l));

compute the partial-derivative values ∇_{W^(l)} J(W, b; v) = δ^(l+1) (a^(l))^T and ∇_{b^(l)} J(W, b; v) = δ^(l+1), where ∇_{W^(i)} J(W, b; v) denotes the partial derivative of J(W, b) with respect to W^(i), and ∇_{b^(i)} J(W, b; v) the partial derivative with respect to b^(i);

Step 2.1.3. Add the obtained partial derivatives to ΔW^(i) and Δb^(i) respectively, i.e. ΔW^(i) := ΔW^(i) + ∇_{W^(i)} J(W, b; v), Δb^(i) := Δb^(i) + ∇_{b^(i)} J(W, b; v);

Step 2.1.4. Update the parameters: W^(i) := W^(i) − α [(1/m) ΔW^(i) + λ W^(i)], b^(i) := b^(i) − α (1/m) Δb^(i), where α is the learning rate;

Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function until the set threshold is reached; this yields the parameters (W, b) of the coding layer and the parameters (W', b') of the decoding layer;
Step 2.2. After training, discard the decoding layer (W', b') and take the coding-layer parameters (W, b) as the initial parameters of the corresponding layer in the stacked self-encoding network, i.e. W^(1) = W, b^(1) = b;

Step 2.3. Compute the hidden-layer excitation of the current autoencoder: a^(1) = σ(W^(1) a^(0) + b^(1));

Step 2.4. Train the second layer, i.e. the second autoencoder, on the excitation a^(1); the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is the same as for the first layer except that the input becomes a^(1); after training, this yields the initial parameters W^(2), b^(2) of the second layer of the network and the hidden-layer excitation a^(2);

Step 2.5. For the third through nth layers, repeat the process of steps 2.1 to 2.4 to obtain the initial parameters of each hidden layer and finally the excitation a^(n) of the nth hidden layer; this excitation also serves as the input of the softmax layer and is denoted a_S;

Step 2.6. Using a_S obtained in step 2.5 and the labels Y, train the last layer of the network, i.e. the softmax classifier, to obtain the initial parameter W_S of the last layer;

denote a_S by x and W_S by θ, and suppose there are k classes in total; for the ith sample, the probability that the predicted class is the jth class is

P(y_i = j | x_i; θ) = exp(θ_j^T x_i) / Σ_{l=1}^{k} exp(θ_l^T x_i)

where θ_j^T denotes the jth row of θ, a row vector of the weights connecting the jth output unit to all input units; l is a running index with 1 ≤ l ≤ k; k is the number of classes distinguished from the softmax input a_S with the initial parameters W_S; and x_i is the softmax-layer input for the ith sample. The final output is a probability column vector P whose jth component is the probability that the sample is predicted to belong to the jth class. The weight matrix θ is trained by minimizing the loss function

J(θ) = −(1/m) Σ_{i=1}^{m} Σ_{j=1}^{k} 1{y_i = j} log P(y_i = j | x_i; θ)

where log P(y_i = j) is the natural logarithm of the probability value P(y_i = j); 1{·} is the indicator function, equal to 1 when the condition in braces is true and 0 otherwise; m is the number of samples, and n is the number of layers of the autoencoder;
Step 3, fine-tuning. Treat the network as a whole, compute the partial derivative of each layer's parameters by back-propagation, and then iteratively optimize by gradient descent; the specific process is as follows:

Step 3.1. Feedforward with the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)) to obtain each layer's excitation a^(i);

Step 3.2. Compute the partial derivative of the softmax-layer parameter W_S: ∇_{W_S} J = −(1/m) Σ_{i=1}^{m} (e_{y_i} − P_i) x_i^T, where e_{y_i} is the indicator vector of the true class of sample i and P_i is the conditional probability vector computed as in step 2.6;

Step 3.3. Compute the residual of the last hidden layer: δ^(n) = (∇_{a^(n)} J) ⊙ σ′(z^(n)), with ∇_{a^(n)} J = −W_S^T (e_{y_i} − P_i), where ∇_{a^(n)} J denotes the partial derivative of J(W, b) with respect to a^(n), and a^(n) is the excitation of the nth hidden layer;

Step 3.4. For each layer l = n−1, n−2, ..., 2, compute δ^(l) = ((W^(l))^T δ^(l+1)) ⊙ σ′(z^(l));

Step 3.5. Compute the partial derivatives for all hidden layers: ∇_{W^(l)} J = δ^(l+1) (a^(l))^T, ∇_{b^(l)} J = δ^(l+1);

Step 3.6. Update the parameters of each layer with the partial derivatives obtained above: W_S := W_S − α ∇_{W_S} J, W^(i) := W^(i) − α ∇_{W^(i)} J, b^(i) := b^(i) − α ∇_{b^(i)} J, where ∇_{W_S} J is the partial derivative of J(W, b) with respect to W_S (W_S being the initial parameter of the softmax classifier), and ∇_{W^(i)} J and ∇_{b^(i)} J are the partial derivatives of J(W, b) with respect to W^(i) and b^(i);

Step 3.7. Repeat the above steps, iteratively reducing the value of the objective function until it reaches the set threshold;

Step 4, predict the class of a prediction sample, as follows:

Step 4.1. Normalize the prediction sample v_p to [0, 1];

Step 4.2. For the hidden layers, feedforward layer by layer with the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)) to obtain the input a_S of the softmax layer;

Step 4.3. Compute the conditional probability vector P with the probability formula of step 2.6; the class corresponding to the largest component is the class to which the sample is predicted to belong.

The indices i and j in the above formulas are integer counting indices.
The beneficial effects of the invention are as follows: the invention designs a network structure suitable for medical gas signal processing and performs layer-by-layer feature extraction on the input samples, so that the features finally entering the classification layer are low-dimensional and robust to drift. Compared with traditional feature extraction methods, the method automatically extracts features that effectively discriminate medical gases, merges steps such as feature extraction, feature selection and drift suppression into one, greatly simplifies the complexity of traditional methods, and improves the efficiency of gas detection and identification. Specifically:
first, except for training the softmax classifier in step 2, no class labels are needed in any other part of the process, so feature extraction is unsupervised; if labeled samples are scarce, a large number of unlabeled samples can be used to train all layers before the classification layer, and a small number of labeled samples can then be used for fine-tuning;
second, as the network structure shows, each layer has fewer units than the previous one, so the input finally entering the classifier has a much lower dimension than the original input; this can be regarded as a dimensionality-reduction process;
third, feature extraction is completed automatically without manual intervention, which eliminates the complexity of hand-crafted feature design and gives the method wide applicability;
fourth, the extracted features are robust to drift, which effectively improves the accuracy of gas detection and identification under drift and prolongs the service life of the device.
Drawings
Fig. 1 is a flow chart of a medical gas identification method according to an embodiment of the invention.
Fig. 2 is a stacked self-encoding network for medical gas identification according to an embodiment of the present invention.
Fig. 3 is a diagram of an autoencoder with a single hidden layer according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
The gas identification method of the present invention has a general flow as shown in fig. 1:
Step 1, data normalization. Suppose there are m samples, each organized in the form v = [s_1, s_2, ..., s_t], where s_i is the ith frequency response value and there are t response values in total. The whole gas data set and the corresponding labels can then be represented as:

V = [v_1, v_2, ..., v_i, ..., v_m], Y = [y_1, y_2, ..., y_i, ..., y_m]^T

where T denotes the transpose of the vector, the ith column v_i of the matrix V represents the ith sample, and the ith element of Y is the class label of the corresponding sample;

the data set is normalized to [0, 1] using the formula

A^(0)_{i,j} = L + (U − L) (V_{i,j} − min_i) / (max_i − min_i)

where V_{i,j} denotes the ith frequency response value of the jth sample, L is the normalization lower bound 0, U is the normalization upper bound 1, and max_i and min_i are the maximum and minimum values of the ith row of the matrix; the normalized data set is denoted a^(0).
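To make step 1 concrete, the following minimal NumPy sketch (illustrative only; the toy shapes and values are assumed, not taken from the patent) applies the per-row min-max formula above:

```python
# Illustrative sketch of step 1: per-row min-max normalization of the gas data
# set V (t response values x m samples) into [L, U] = [0, 1]. Toy data assumed.
import numpy as np

def normalize_rows(V, L=0.0, U=1.0):
    """Scale each row (each frequency-response channel) of V into [L, U]."""
    row_min = V.min(axis=1, keepdims=True)  # min_i of each row
    row_max = V.max(axis=1, keepdims=True)  # max_i of each row
    return L + (U - L) * (V - row_min) / (row_max - row_min)

rng = np.random.default_rng(0)
V = rng.uniform(100.0, 150.0, size=(6, 4))  # assumed: t = 6 responses, m = 4 samples
a0 = normalize_rows(V)                      # normalized data set a^(0)
print(a0.min(), a0.max())                   # -> 0.0 1.0
```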
Step 2, pre-train the stacked self-encoding network, where v, h and y denote the input layer, the hidden layers and the output layer respectively, W^(i) is the weight matrix connecting adjacent layers, and b^(i) is the bias vector of a hidden layer.

The invention uses a network structure similar to that shown in Fig. 2; the number of layers and the number of units in each layer can be changed for different tasks, and the shapes of the corresponding parameters change accordingly.

Such networks are often very deep and have many parameters, which makes them difficult to train directly, so a pre-training method is first used to train the parameters layer by layer. Compared with random initialization, pre-training places the parameters of each layer at a better position in the parameter space.

Apart from the softmax layer used for classification, the rest of the network can be seen as a stack of several single-hidden-layer autoencoders, with the output of each layer connected to the input of the next. Such an autoencoder obtains the excitation of its hidden units by reconstructing the input (the reconstruction is marked with the symbol ^) and uses that excitation as a feature representation of the original input, as shown in Fig. 3.

After each autoencoder is trained, only the parameters of its coding layer, namely W and b, are kept as the initial parameters of the corresponding layer in the stacked self-encoding network. The specific process is as follows:
Step 2.1, train the first layer, i.e. the first autoencoder, whose objective function is

J(W, b) = (1/m) Σ_{i=1}^{m} (1/2) ||v̂_i − v_i||² + (λ/2) Σ_{i,j} (W_{ij})² + β Σ_{j} KL(ρ || ρ̂_j)

where the first term is the reconstruction error, measuring the degree of difference between input and output: v_i is the ith input sample normalized in step 1, and v̂_i is the output of sample v_i at the output layer after passing through the network; the second term is the weight-decay term, which reduces the magnitude of the weights to prevent overfitting, W_{ij} being the weight between the jth unit of the current layer and the ith unit of the next layer; the third term is a sparsity penalty with KL(ρ || ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j)), where ρ̂_j denotes the average excitation of hidden unit j; λ, β and ρ are preset parameters, and the penalty drives the average excitation of every hidden unit toward the very small value ρ, so that only a few hidden units are activated at a time.

The objective function is optimized by gradient descent; the partial derivative of each parameter needed at every iteration is computed by the back-propagation algorithm.
Optimize the objective function; for an n-layer autoencoder the specific optimization steps are as follows:

Step 2.1.1. Randomly initialize the parameters W^(i) and b^(i), and initialize all-zero accumulators ΔW^(i) = 0, Δb^(i) = 0;

Step 2.1.2. Let ΔW^(i) = 0 and Δb^(i) = 0; for each sample v, compute the partial derivatives ∇_{W^(i)} J(W, b; v) and ∇_{b^(i)} J(W, b; v) using the back-propagation algorithm, as follows:

feedforward to obtain each layer's excitation a^(i) by the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)), where σ(x) = 1/(1 + e^(−x)) is the sigmoid function with output range [0, 1];

for the output layer, compute the residual δ^(n) = −(v − a^(n)) ⊙ σ′(z^(n)), where ⊙ denotes the element-wise product of vectors, z^(n) = W^(n−1) a^(n−1) + b^(n−1), and σ′ denotes the derivative of σ(x);

for each layer l = n−1, n−2, ..., 2, compute δ^(l) = ((W^(l))^T δ^(l+1)) ⊙ σ′(z^(l));

compute the partial-derivative values ∇_{W^(l)} J(W, b; v) = δ^(l+1) (a^(l))^T and ∇_{b^(l)} J(W, b; v) = δ^(l+1), where ∇_{W^(i)} J(W, b; v) denotes the partial derivative of J(W, b) with respect to W^(i), and ∇_{b^(i)} J(W, b; v) the partial derivative with respect to b^(i).

Step 2.1.3. Add the obtained partial derivatives to ΔW^(i) and Δb^(i) respectively, i.e. ΔW^(i) := ΔW^(i) + ∇_{W^(i)} J(W, b; v), Δb^(i) := Δb^(i) + ∇_{b^(i)} J(W, b; v).

Step 2.1.4. Update the parameters: W^(i) := W^(i) − α [(1/m) ΔW^(i) + λ W^(i)], b^(i) := b^(i) − α (1/m) Δb^(i), where α is the learning rate.

Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function until the set threshold is reached. This yields the parameters (W, b) of the coding layer and the parameters (W', b') of the decoding layer.
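As an illustration of steps 2.1.1 to 2.1.5, here is a minimal batch-gradient sketch of one single-hidden-layer sparse autoencoder in NumPy; the hyperparameter values and layer widths are assumptions, and the simplified batch loop stands in for the per-sample accumulation described above:

```python
# Sketch of one sparse autoencoder (steps 2.1.1-2.1.5): reconstruction error +
# weight decay (lambda) + sparsity penalty (beta, rho), trained by batch
# gradient descent with back-propagation. Hyperparameters are assumed values.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(V, n_hidden, alpha=0.5, lam=1e-4, beta=3.0, rho=0.05,
                      iters=200, seed=0):
    t, m = V.shape
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.1, (n_hidden, t)); b1 = np.zeros((n_hidden, 1))  # encoder
    W2 = rng.normal(0.0, 0.1, (t, n_hidden)); b2 = np.zeros((t, 1))         # decoder
    for _ in range(iters):
        a1 = sigmoid(W1 @ V + b1)                  # hidden excitation
        a2 = sigmoid(W2 @ a1 + b2)                 # reconstruction of the input
        rho_hat = a1.mean(axis=1, keepdims=True)   # average excitation per hidden unit
        d2 = -(V - a2) * a2 * (1.0 - a2)           # output-layer residual
        sparse = beta * (-rho / rho_hat + (1.0 - rho) / (1.0 - rho_hat))
        d1 = (W2.T @ d2 + sparse) * a1 * (1.0 - a1)  # hidden-layer residual
        W2 -= alpha * (d2 @ a1.T / m + lam * W2)     # gradient step with weight decay
        b2 -= alpha * d2.mean(axis=1, keepdims=True)
        W1 -= alpha * (d1 @ V.T / m + lam * W1)
        b1 -= alpha * d1.mean(axis=1, keepdims=True)
    return W1, b1   # keep only the coding-layer parameters (W, b); see step 2.2
```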
Step 2.2. After training, discard the decoding layer (W', b') and take the coding-layer parameters (W, b) as the initial parameters of the corresponding layer in the stacked self-encoding network, i.e. W^(1) = W, b^(1) = b.

Step 2.3. Compute the hidden-layer excitation of the current autoencoder: a^(1) = σ(W^(1) a^(0) + b^(1)).

Step 2.4. Train the second layer, i.e. the second autoencoder, on the excitation a^(1); the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is the same as for the first layer except that the input becomes a^(1). After training, this yields the initial parameters W^(2), b^(2) of the second layer of the network and the hidden-layer excitation a^(2).

Step 2.5. For the third through nth layers, repeat the process of steps 2.1 to 2.4 to obtain the initial parameters of each hidden layer and finally the excitation a^(n) of the nth hidden layer; this excitation also serves as the input of the softmax layer and is denoted a_S. The autoencoder in this embodiment has 3 layers.
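Continuing the sketch above, the greedy layer-wise pre-training of steps 2.2 to 2.5 could be written as the loop below; the three hidden-layer widths are assumed for illustration:

```python
# Greedy layer-wise pre-training (steps 2.2-2.5), reusing sigmoid and
# train_autoencoder from the sketch above. Layer widths are assumptions.
layer_sizes = [64, 32, 16]      # shrinking hidden layers, as in Fig. 2
params, a = [], a0              # a0 is the normalized data set from step 1
for n_hidden in layer_sizes:
    W, b = train_autoencoder(a, n_hidden)   # train one autoencoder on the current input
    params.append((W, b))                   # keep only the coding-layer parameters
    a = sigmoid(W @ a + b)                  # its hidden excitation feeds the next layer
aS = a                                      # input to the softmax layer
```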
Step 2.6. Using a_S obtained in step 2.5 and the labels Y, train the last layer of the network, i.e. the softmax classifier, to obtain the initial parameter W_S of the last layer.

Softmax regression is a generalization of logistic regression to multi-class problems. For convenience of notation, denote a_S by x and W_S by θ, and suppose there are k classes in total. For the ith sample, the probability that the predicted class is the jth class is

P(y_i = j | x_i; θ) = exp(θ_j^T x_i) / Σ_{l=1}^{k} exp(θ_l^T x_i)

where θ_j^T denotes the jth row of θ, a row vector of the weights connecting the jth output unit to all input units; l is a running index with 1 ≤ l ≤ k; k is the number of classes; and x_i is the softmax-layer input for the ith sample. The final output is a probability column vector P whose jth component is the probability that the sample is predicted to belong to the jth class. The weight matrix θ is trained by minimizing the loss function

J(θ) = −(1/m) Σ_{i=1}^{m} Σ_{j=1}^{k} 1{y_i = j} log P(y_i = j | x_i; θ)

where log P(y_i = j) is the natural logarithm of the probability value P(y_i = j); 1{·} is the indicator function, equal to 1 when the condition in braces is true and 0 otherwise. The loss function is strictly convex, and a global optimum can be obtained with an optimization algorithm such as gradient descent or L-BFGS. m is the number of samples, and n is the number of layers of the autoencoder.

Here, the parameters connecting any two adjacent layers form a weight matrix; W_S, i.e. θ, is the weight matrix connecting the last two layers.
The specific process of training the weight matrix θ by using the minimization loss function in this embodiment is as follows:
Step 2.6.1. Randomly initialize the parameter matrix θ;

Step 2.6.2. Compute the derivative of J(θ) directly: ∇_{θ_j} J(θ) = −(1/m) Σ_{i=1}^{m} x_i (1{y_i = j} − P(y_i = j | x_i; θ)), where θ_j denotes the parameter vector of the jth class in θ;

Step 2.6.3. Update the parameter θ: θ_j := θ_j − α ∇_{θ_j} J(θ) for each j, where α is the learning rate and ∇_{θ_j} J(θ) is the partial derivative of J(θ) with respect to θ_j;

Step 2.6.4. Repeat steps 2.6.2 to 2.6.3, gradually reducing the value of J(θ) until it reaches the set threshold; the resulting θ is the final weight matrix, i.e. W_S.
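A corresponding sketch of steps 2.6.1 to 2.6.4 follows; it assumes the labels Y are integer class indices 0, ..., k−1, reuses a_S from the pre-training sketch, and initializes θ to zeros rather than randomly:

```python
# Softmax classifier training (steps 2.6.1-2.6.4): batch gradient descent on the
# weight matrix theta. Shapes and hyperparameters are assumed for illustration.
import numpy as np

def softmax_probs(theta, X):
    """Columns of X are samples; returns a k x m matrix of class probabilities."""
    Z = theta @ X
    Z -= Z.max(axis=0, keepdims=True)   # for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=0, keepdims=True)

def train_softmax(X, Y, k, alpha=0.5, iters=300):
    m = X.shape[1]
    I = np.eye(k)[:, Y]                 # indicator matrix: I[j, i] = 1{y_i = j}
    theta = np.zeros((k, X.shape[0]))   # step 2.6.1 (zeros instead of random init)
    for _ in range(iters):
        P = softmax_probs(theta, X)
        grad = -(I - P) @ X.T / m       # step 2.6.2: derivative of J(theta)
        theta -= alpha * grad           # step 2.6.3: update
    return theta                        # final weight matrix, i.e. W_S

WS = train_softmax(aS, Y, k=3)          # e.g. assuming 3 gas classes
```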
Step 3, fine-tuning: treat the network as a whole, compute the partial derivative of each layer's parameters by back-propagation, and then iteratively optimize by gradient descent.

After pre-training, the initial parameters of every layer in the network are determined; all parameters are then fine-tuned once to improve the classification ability of the network. Fine-tuning treats the network as a whole, computes the partial derivatives of all layer parameters by back-propagation, and optimizes them iteratively by gradient descent. The goal of the network is no longer reconstruction, so the objective function becomes that of the softmax layer, which is treated as an appended layer and handled separately; the optimization of each hidden layer is essentially the same as described in step 2.1.
The specific process is as follows:
Step 3.1. Feedforward with the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)) to obtain each layer's excitation a^(i);

Step 3.2. Compute the partial derivative of the softmax-layer parameter W_S: ∇_{W_S} J = −(1/m) Σ_{i=1}^{m} (e_{y_i} − P_i) x_i^T, where e_{y_i} is the indicator vector of the true class of sample i and P_i is the conditional probability vector computed as in step 2.6;

Step 3.3. Compute the residual of the last hidden layer: δ^(n) = (∇_{a^(n)} J) ⊙ σ′(z^(n)), with ∇_{a^(n)} J = −W_S^T (e_{y_i} − P_i), where ∇_{a^(n)} J denotes the partial derivative of J(W, b) with respect to a^(n), and a^(n) is the excitation of the nth hidden layer;

Step 3.4. For each layer l = n−1, n−2, ..., 2, compute δ^(l) = ((W^(l))^T δ^(l+1)) ⊙ σ′(z^(l));

Step 3.5. Compute the partial derivatives for all hidden layers: ∇_{W^(l)} J = δ^(l+1) (a^(l))^T, ∇_{b^(l)} J = δ^(l+1);

Step 3.6. Update the parameters of each layer with the partial derivatives obtained above: W_S := W_S − α ∇_{W_S} J, W^(i) := W^(i) − α ∇_{W^(i)} J, b^(i) := b^(i) − α ∇_{b^(i)} J, where ∇_{W_S} J is the partial derivative of J(W, b) with respect to W_S (W_S being the initial parameter of the softmax classifier), and ∇_{W^(i)} J and ∇_{b^(i)} J are the partial derivatives of J(W, b) with respect to W^(i) and b^(i);

Step 3.7. Repeat the above steps, iteratively reducing the value of the objective function until it reaches the set threshold.
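One iteration of the fine-tuning in steps 3.1 to 3.7 could then look like the following sketch, continuing the functions defined above (weight decay is omitted for brevity; all shapes and rates are assumptions):

```python
# One fine-tuning step (step 3): feedforward through the whole stack, compute
# softmax and hidden-layer residuals by back-propagation, update all parameters.
def fine_tune_step(params, theta, a0, Y, k, alpha=0.1):
    m = a0.shape[1]
    acts = [a0]
    for W, b in params:                          # step 3.1: feedforward
        acts.append(sigmoid(W @ acts[-1] + b))
    P = softmax_probs(theta, acts[-1])
    I = np.eye(k)[:, Y]
    theta_grad = -(I - P) @ acts[-1].T / m       # step 3.2: softmax gradient
    d = -(theta.T @ (I - P)) * acts[-1] * (1.0 - acts[-1])  # step 3.3: residual
    for l in range(len(params) - 1, -1, -1):     # steps 3.4-3.6, top layer down
        W, b = params[l]
        Wg = d @ acts[l].T / m                   # step 3.5: partial derivatives
        bg = d.mean(axis=1, keepdims=True)
        if l > 0:                                # step 3.4: propagate the residual
            d = (W.T @ d) * acts[l] * (1.0 - acts[l])
        params[l] = (W - alpha * Wg, b - alpha * bg)   # step 3.6: update
    return params, theta - alpha * theta_grad
```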
Step 4, predict the class of a prediction sample, as follows:

Step 4.1. Normalize the prediction sample v_p to [0, 1];

Step 4.2. For the hidden layers, feedforward layer by layer with the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)) to obtain the input a_S of the softmax layer;

Step 4.3. Compute the conditional probability vector P with the probability formula of step 2.6; the class corresponding to the largest component is the class to which the sample is predicted to belong.

The indices i and j in the above formulas are integer counting indices.
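Finally, prediction (step 4) reduces to a feedforward pass through the trained network; a sketch under the same assumptions:

```python
# Prediction (step 4): v_p is a new sample already normalized to [0, 1] with the
# training-set row minima/maxima; returns the index of the most probable class.
def predict(params, theta, v_p):
    a = v_p                             # column vector of shape (t, 1)
    for W, b in params:
        a = sigmoid(W @ a + b)          # step 4.2: layer-by-layer feedforward
    P = softmax_probs(theta, a)         # step 4.3: conditional probability vector
    return int(P.argmax(axis=0)[0])     # class of the largest component
```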
The core of the invention is to design a network structure suitable for medical gas signal processing and to use deep learning to process the patient gas data sampled by an electronic nose, thereby automatically extracting features that are more universal and more robust to sensor drift, and finally completing the task of gas detection and identification simply and effectively. The invention has great application value in the medical field, where both accuracy and real-time performance are required.
Claims (2)
1. A medical gas identification method based on deep learning comprises the following specific steps:
Step 1, data normalization. Suppose there are m samples, each organized in the form v = [s_1, s_2, ..., s_t], where s_i is the ith frequency response value and there are t response values in total. The whole gas data set and the corresponding labels can then be represented as:

V = [v_1, v_2, ..., v_i, ..., v_m], Y = [y_1, y_2, ..., y_i, ..., y_m]^T

where T denotes the transpose of the vector, the ith column v_i of the matrix V represents the ith sample, and the ith element of Y is the class label of the corresponding sample;

the data set is normalized to [0, 1] using the formula

A^(0)_{i,j} = L + (U − L) (V_{i,j} − min_i) / (max_i − min_i)

where V_{i,j} denotes the ith frequency response value of the jth sample, L is the normalization lower bound 0, U is the normalization upper bound 1, and max_i and min_i are the maximum and minimum values of the ith row of the matrix; the normalized data set is denoted a^(0);
Step 2, pre-train the stacked self-encoding network, where v, h and y denote the input layer, the hidden layers and the output layer respectively, W^(i) is the weight matrix connecting adjacent layers, and b^(i) is the bias vector of a hidden layer;

Step 2.1, train the first layer, i.e. the first autoencoder, whose objective function is

J(W, b) = (1/m) Σ_{i=1}^{m} (1/2) ||v̂_i − v_i||² + (λ/2) Σ_{i,j} (W_{ij})² + β Σ_{j} KL(ρ || ρ̂_j)

where the first term is the reconstruction error, measuring the degree of difference between input and output: v_i is the ith input sample normalized in step 1, and v̂_i is the output of sample v_i at the output layer after passing through the network; the second term is the weight-decay term, which reduces the magnitude of the weights to prevent overfitting, W_{ij} being the weight between the jth unit of the current layer and the ith unit of the next layer; the third term is a sparsity penalty with KL(ρ || ρ̂_j) = ρ log(ρ/ρ̂_j) + (1 − ρ) log((1 − ρ)/(1 − ρ̂_j)), where ρ̂_j denotes the average excitation of hidden unit j; λ, β and ρ are preset parameters, m is the number of samples, and J denotes the objective function of the first autoencoder;
optimize the objective function; for an n-layer autoencoder the specific optimization steps are as follows:

Step 2.1.1. Randomly initialize the parameters W^(i) and b^(i), and initialize all-zero accumulators ΔW^(i) = 0, Δb^(i) = 0;

Step 2.1.2. Let ΔW^(i) = 0 and Δb^(i) = 0; for each sample v, compute the partial derivatives ∇_{W^(i)} J(W, b; v) and ∇_{b^(i)} J(W, b; v) using the back-propagation algorithm, as follows:

feedforward to obtain each layer's excitation a^(i) by the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)), where σ(x) = 1/(1 + e^(−x)) is the sigmoid function with output range [0, 1];

for the output layer, compute the residual δ^(n) = −(v − a^(n)) ⊙ σ′(z^(n)), where ⊙ denotes the element-wise product of vectors, z^(n) = W^(n−1) a^(n−1) + b^(n−1), and σ′ denotes the derivative of σ(x);

for each layer l = n−1, n−2, ..., 2, compute δ^(l) = ((W^(l))^T δ^(l+1)) ⊙ σ′(z^(l));

compute the partial-derivative values ∇_{W^(l)} J(W, b; v) = δ^(l+1) (a^(l))^T and ∇_{b^(l)} J(W, b; v) = δ^(l+1), where ∇_{W^(i)} J(W, b; v) denotes the partial derivative of J(W, b) with respect to W^(i), and ∇_{b^(i)} J(W, b; v) the partial derivative with respect to b^(i);

Step 2.1.3. Add the obtained partial derivatives to ΔW^(i) and Δb^(i) respectively, i.e. ΔW^(i) := ΔW^(i) + ∇_{W^(i)} J(W, b; v), Δb^(i) := Δb^(i) + ∇_{b^(i)} J(W, b; v);

Step 2.1.4. Update the parameters: W^(i) := W^(i) − α [(1/m) ΔW^(i) + λ W^(i)], b^(i) := b^(i) − α (1/m) Δb^(i), where α is the learning rate;

Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function until the set threshold is reached; this yields the parameters (W, b) of the coding layer and the parameters (W', b') of the decoding layer;
Step 2.2. After training, discard the decoding layer (W', b') and take the coding-layer parameters (W, b) as the initial parameters of the corresponding layer in the stacked self-encoding network, i.e. W^(1) = W, b^(1) = b;

Step 2.3. Compute the hidden-layer excitation of the current autoencoder: a^(1) = σ(W^(1) a^(0) + b^(1));

Step 2.4. Train the second layer, i.e. the second autoencoder, on the excitation a^(1); the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is the same as for the first layer except that the input becomes a^(1); after training, this yields the initial parameters W^(2), b^(2) of the second layer of the network and the hidden-layer excitation a^(2);

Step 2.5. For the third through nth layers, repeat the process of steps 2.1 to 2.4 to obtain the initial parameters of each hidden layer and finally the excitation a^(n) of the nth hidden layer; this excitation also serves as the input of the softmax layer and is denoted a_S;

Step 2.6. Using a_S obtained in step 2.5 and the labels Y, train the last layer of the network, i.e. the softmax classifier, to obtain the initial parameter W_S of the last layer;

denote a_S by x and W_S by θ, and suppose there are k classes in total; for the ith sample, the probability that the predicted class is the jth class is

P(y_i = j | x_i; θ) = exp(θ_j^T x_i) / Σ_{l=1}^{k} exp(θ_l^T x_i)

where θ_j^T denotes the jth row of θ, a row vector of the weights connecting the jth output unit to all input units; k is the number of classes distinguished from the softmax input a_S with the initial parameters W_S. The final output is a probability column vector P whose jth component is the probability that the sample is predicted to belong to the jth class. The weight matrix θ is trained by minimizing the loss function

J(θ) = −(1/m) Σ_{i=1}^{m} Σ_{j=1}^{k} 1{y_i = j} log P(y_i = j | x_i; θ)

where log P(y_i = j) is the natural logarithm of the probability value P(y_i = j); 1{·} is the indicator function, equal to 1 when the condition in braces is true and 0 otherwise; m is the number of samples, and n is the number of layers of the autoencoder;
Step 3, fine-tuning. Treat the network as a whole, compute the partial derivative of each layer's parameters by back-propagation, and then iteratively optimize by gradient descent; the specific process is as follows:

Step 3.1. Feedforward with the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)) to obtain each layer's excitation a^(i);

Step 3.2. Compute the partial derivative of the softmax-layer parameter W_S: ∇_{W_S} J = −(1/m) Σ_{i=1}^{m} (e_{y_i} − P_i) x_i^T, where e_{y_i} is the indicator vector of the true class of sample i and P_i is the conditional probability vector computed as in step 2.6;

Step 3.3. Compute the residual of the last hidden layer: δ^(n) = (∇_{a^(n)} J) ⊙ σ′(z^(n)), with ∇_{a^(n)} J = −W_S^T (e_{y_i} − P_i), where ∇_{a^(n)} J denotes the partial derivative of J(W, b) with respect to a^(n), and a^(n) is the excitation of the nth hidden layer;

Step 3.4. For each layer l = n−1, n−2, ..., 2, compute δ^(l) = ((W^(l))^T δ^(l+1)) ⊙ σ′(z^(l));

Step 3.5. Compute the partial derivatives for all hidden layers: ∇_{W^(l)} J = δ^(l+1) (a^(l))^T, ∇_{b^(l)} J = δ^(l+1);

Step 3.6. Update the parameters of each layer with the partial derivatives obtained above: W_S := W_S − α ∇_{W_S} J, W^(i) := W^(i) − α ∇_{W^(i)} J, b^(i) := b^(i) − α ∇_{b^(i)} J, where ∇_{W_S} J is the partial derivative of J(W, b) with respect to W_S (W_S being the initial parameter of the softmax classifier), and ∇_{W^(i)} J and ∇_{b^(i)} J are the partial derivatives of J(W, b) with respect to W^(i) and b^(i); Step 3.7. Repeat the above steps, iteratively reducing the value of the objective function until it reaches the set threshold;
Step 4, predict the class of a prediction sample, as follows:

Step 4.1. Normalize the prediction sample v_p to [0, 1];

Step 4.2. For the hidden layers, feedforward layer by layer with the formula a^(i) = σ(W^(i) a^(i−1) + b^(i)) to obtain the input a_S of the softmax layer;

Step 4.3. Compute the conditional probability vector P with the probability formula of step 2.6; the class corresponding to the largest component is the class to which the sample is predicted to belong.
2. The deep-learning-based medical gas identification method according to claim 1, wherein the weight matrix θ in step 2.6 is trained by minimizing the loss function as follows:
Step 2.6.1. Randomly initialize the parameter matrix θ;

Step 2.6.2. Compute the derivative of J(θ) directly: ∇_{θ_j} J(θ) = −(1/m) Σ_{i=1}^{m} x_i (1{y_i = j} − P(y_i = j | x_i; θ)), where θ_j denotes the parameter vector of the jth class in θ;

Step 2.6.3. Update the parameter θ: θ_j := θ_j − α ∇_{θ_j} J(θ) for each j, where α is the learning rate and ∇_{θ_j} J(θ) is the partial derivative of J(θ) with respect to θ_j;

Step 2.6.4. Repeat steps 2.6.2 to 2.6.3, gradually reducing the value of J(θ) until it reaches the set threshold; the resulting θ is the final weight matrix, i.e. W_S.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310503402.9A CN103544392B (en) | 2013-10-23 | 2013-10-23 | Medical gas identification method based on deep learning
Publications (2)
Publication Number | Publication Date |
---|---|
CN103544392A CN103544392A (en) | 2014-01-29 |
CN103544392B true CN103544392B (en) | 2016-08-24 |
Family
ID=49967837
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310503402.9A Expired - Fee Related CN103544392B (en) | 2013-10-23 | 2013-10-23 | Medical science Gas Distinguishing Method based on degree of depth study |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103544392B (en) |
Families Citing this family (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103996056B (en) * | 2014-04-08 | 2017-05-24 | 浙江工业大学 | Tattoo image classification method based on deep learning |
CN104021224A (en) * | 2014-06-25 | 2014-09-03 | 中国科学院自动化研究所 | Image labeling method based on layer-by-layer label fusing deep network |
CN104484684B (en) * | 2015-01-05 | 2018-11-02 | 苏州大学 | A kind of Manuscripted Characters Identification Method and system |
CN105844331B (en) * | 2015-01-15 | 2018-05-25 | 富士通株式会社 | The training method of nerve network system and the nerve network system |
CN104866727A (en) * | 2015-06-02 | 2015-08-26 | 陈宽 | Deep learning-based method for analyzing medical data and intelligent analyzer thereof |
CN105913079B (en) * | 2016-04-08 | 2019-04-23 | 重庆大学 | Electronic nose isomeric data recognition methods based on the study of the target domain migration limit |
CN106202054B (en) * | 2016-07-25 | 2018-12-14 | 哈尔滨工业大学 | A kind of name entity recognition method towards medical field based on deep learning |
CN106264460B (en) * | 2016-07-29 | 2019-11-19 | 北京医拍智能科技有限公司 | The coding/decoding method and device of cerebration multidimensional time-series signal based on self study |
CN106156530A (en) * | 2016-08-03 | 2016-11-23 | 北京好运到信息科技有限公司 | Health check-up data analysing method based on stack own coding device and device |
CN108122035B (en) * | 2016-11-29 | 2019-10-18 | 科大讯飞股份有限公司 | End-to-end modeling method and system |
CN107368670A (en) * | 2017-06-07 | 2017-11-21 | 万香波 | Stomach cancer pathology diagnostic support system and method based on big data deep learning |
CN107368671A (en) * | 2017-06-07 | 2017-11-21 | 万香波 | System and method are supported in benign gastritis pathological diagnosis based on big data deep learning |
WO2019005611A1 (en) * | 2017-06-26 | 2019-01-03 | D5Ai Llc | Selective training for decorrelation of errors |
CN108416439B (en) * | 2018-02-09 | 2020-01-03 | 中南大学 | Oil refining process product prediction method and system based on variable weighted deep learning |
CN109472303A (en) * | 2018-10-30 | 2019-03-15 | 浙江工商大学 | A kind of gas sensor drift compensation method based on autoencoder network decision |
DE102019220455A1 (en) * | 2019-12-20 | 2021-06-24 | Robert Bosch Gesellschaft mit beschränkter Haftung | Method and device for operating a gas sensor |
CN111474297B (en) * | 2020-03-09 | 2022-05-03 | 重庆邮电大学 | Online drift compensation method for sensor in bionic olfaction system |
CN111340132B (en) * | 2020-03-10 | 2024-02-02 | 南京工业大学 | Machine olfaction mode identification method based on DA-SVM |
CN111915069B (en) * | 2020-07-17 | 2021-12-07 | 天津理工大学 | Deep learning-based detection method for distribution of lightweight toxic and harmful gases |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101135639A (en) * | 2007-09-27 | 2008-03-05 | 中国人民解放军空军工程大学 | Mixture gas component concentration infrared spectrum analysis method based on supporting vector quantities machine correct model |
CN102411687A (en) * | 2011-11-22 | 2012-04-11 | 华北电力大学 | Deep learning detection method for unknown malicious codes |
CN103267793A (en) * | 2013-05-03 | 2013-08-28 | 浙江工商大学 | Carbon nano-tube ionization self-resonance type gas sensitive sensor |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101042339B (en) * | 2006-03-21 | 2012-05-30 | 深圳迈瑞生物医疗电子股份有限公司 | Device for recognizing zone classification of anesthetic gas type and method thereof |
- 2013-10-23: Application CN201310503402.9A filed; granted as patent CN103544392B; current status: not active, Expired - Fee Related
Non-Patent Citations (3)
Title |
---|
Comparative study of SVM and BP algorithms in gas identification; Wang Dan et al.; Chinese Journal of Sensors and Actuators (传感技术学报); 2005-03-26; vol. 18, no. 1; full text *
Research on gas identification based on support vector machine and wavelet decomposition; Ge Haifeng; Chinese Journal of Scientific Instrument (仪器仪表学报); 2006-06-30; vol. 27, no. 6; full text *
Research on gas identification based on the support vector machine algorithm; Wang Dan; Journal of Transducer Technology (传感器技术); 2005-02-20; vol. 24, no. 2; full text *
Also Published As
Publication number | Publication date |
---|---|
CN103544392A (en) | 2014-01-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103544392B (en) | Medical gas identification method based on deep learning | |
CN108231201B (en) | Construction method, system and application method of disease data analysis processing model | |
CN111443165B (en) | Odor identification method based on gas sensor and deep learning | |
CN109543763B (en) | Raman spectrum analysis method based on convolutional neural network | |
CN103412003B (en) | Gas detection method based on self-adaption of semi-supervised domain | |
Wang et al. | Research on Healthy Anomaly Detection Model Based on Deep Learning from Multiple Time‐Series Physiological Signals | |
CN110298264B (en) | Human body daily behavior activity recognition optimization method based on stacked noise reduction self-encoder | |
CN113673346B (en) | Motor vibration data processing and state identification method based on multiscale SE-Resnet | |
CN112699960A (en) | Semi-supervised classification method and equipment based on deep learning and storage medium | |
CN110895705B (en) | Abnormal sample detection device, training device and training method thereof | |
CN111340132B (en) | Machine olfaction mode identification method based on DA-SVM | |
Gokhale et al. | GeneViT: Gene vision transformer with improved DeepInsight for cancer classification | |
CN110880369A (en) | Gas marker detection method based on radial basis function neural network and application | |
CN114372493B (en) | Computer cable electromagnetic leakage characteristic analysis method | |
WO2020247200A1 (en) | Likelihood ratios for out-of-distribution detection | |
CN115659174A (en) | Multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BilSTM | |
CN111105877A (en) | Chronic disease accurate intervention method and system based on deep belief network | |
CN117520914A (en) | Single cell classification method, system, equipment and computer readable storage medium | |
Eluri et al. | Cancer data classification by quantum-inspired immune clone optimization-based optimal feature selection using gene expression data: deep learning approach | |
Saha et al. | Graph convolutional network-based approach for parkinson’s disease classification using euclidean distance graphs | |
Jia et al. | Study on optimized Elman neural network classification algorithm based on PLS and CA | |
CN112580539A (en) | Long-term drift suppression method for electronic nose signals based on PSVM-LSTM | |
CN116738330A (en) | Semi-supervision domain self-adaptive electroencephalogram signal classification method | |
CN113673323B (en) | Aquatic target identification method based on multi-deep learning model joint judgment system | |
CN114781484A (en) | Cancer serum SERS spectrum classification method based on convolutional neural network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20160824; Termination date: 20171023 |
CF01 | Termination of patent right due to non-payment of annual fee |