
CN103544392B - Medical gas identification method based on deep learning - Google Patents

Medical gas identification method based on deep learning

Info

Publication number
CN103544392B
CN103544392B (application CN201310503402.9A)
Authority
CN
China
Prior art keywords
layer
theta
sigma
sample
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310503402.9A
Other languages
Chinese (zh)
Other versions
CN103544392A (en)
Inventor
刘启和
陈雷霆
蔡洪斌
邱航
蒲晓蓉
胡晓楠
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN201310503402.9A priority Critical patent/CN103544392B/en
Publication of CN103544392A publication Critical patent/CN103544392A/en
Application granted granted Critical
Publication of CN103544392B publication Critical patent/CN103544392B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Abstract

The invention discloses a medical gas identification method based on deep learning. The method takes the raw frequency response signal, applies a simple normalization, and feeds the result into a stacked autoencoder network; through layer-by-layer extraction, the network learns an abstract representation of the original data, hiding steps such as feature extraction, dimensionality reduction, and drift suppression from the outside, while a classification layer appended to the network allows these features to be passed directly to a classifier. Training is divided into two phases, pre-training and fine-tuning, which effectively improves the learning capacity of the network; once training is complete, a new sample fed into the network directly yields the predicted class. The method automatically extracts features that effectively discriminate between medical gases and merges feature extraction, feature selection, and drift suppression into a single step, greatly simplifying traditional methods and improving the efficiency of gas detection and identification.

Description

Medical gas identification method based on deep learning
Technical Field
The invention belongs to the technical field of biomedicine, and particularly relates to a medical gas identification method.
Background
Machine olfaction is an artificial intelligence system whose basic principle is as follows: odor molecules adsorbed by a sensor array generate electrical signals, various signal processing techniques then extract features, and a computer pattern recognition system makes a judgment, completing tasks such as gas identification and concentration measurement. The electronic nose is a typical application of machine olfaction and plays a very important role in the medical field, for example in diagnosing certain diseases, identifying bacterial species in blood, and detecting gases harmful to the respiratory system.
Gas detection and identification by sensors has important applications in medicine: an electronic nose device can collect sample data from the oral cavity, chest cavity, and blood; various signal processing techniques analyze and process the data; and a computer pattern recognition system makes a judgment, completing tasks such as disease diagnosis, pathogen identification, and drug concentration determination.
Conventional gas detection and identification methods generally comprise steps such as feature extraction and feature selection, and finally reach the intended goal by means of classification, regression, or clustering. Devices intended for long-term use must also employ an effective sensor drift compensation technique to suppress the effects of drift. In medical applications, the complexity and inefficiency of these conventional methods often force a compromise between accuracy and real-time performance.
The data sampled by the sensors can be regarded as a time series signal that is complex in structure, difficult to interpret, and often high-dimensional. For better identification, features are usually designed around various attributes of the signal and then selected, for example by dimensionality reduction, before being fed to a classification algorithm such as a support vector machine.
Sensor drift is the slow, random variation of a sensor's response over time. Because of this variation, the model currently learned by the pattern recognition system no longer applies well to subsequent samples, so the accuracy of gas detection and identification gradually degrades. In medical applications, two measures are generally taken to suppress the influence of sensor drift. The first is to develop an effective drift compensation technique; this process is often independent of feature extraction, complicated to operate, and very inefficient. The second, since drift over short periods is small, is to regularly maintain and update the electronic nose device so that the sampled data remain stable and reliable; this undoubtedly increases cost considerably and shortens the service life of the device.
In fact, some well-designed features are very robust to drift. From this perspective, the two processes can be merged: extracting better features by itself suppresses sensor drift. Deep learning analyzes, learns, and interprets data by building an artificial neural network with multiple hidden layers in imitation of the human brain; it can obtain highly abstract representations of the data, is good at discovering latent patterns, and is therefore well suited to this problem.
In the literature "m.trincolvilleli, s.coradeschi, a.loutfi, B.A method for effectively identifying pathogenic bacteria in a blood culture specimen is provided in P.thunberg, Direct identification of bacteria in blood culture using an electronic nose device, IEEE Trans biological Engineering57(12), 2884. sub.2890, 2010. the method comprises the steps of firstly obtaining sample data by sampling through an electronic nose device, then carrying out feature extraction and dimension reduction, and finally completing classification by using a support vector machine, wherein in a feature extraction part, two feature extraction methods of steady-state response and response derivative are adopted aiming at a signal overall waveform.
To obtain higher recognition accuracy on a complicated problem, it is sometimes necessary to analyze the signal waveform more finely and extract higher-dimensional features. The literature "A. Vergara, S. Vembu, T. Ayhan, M. A. Ryan, M. L. Homer, and R. Huerta, Chemical gas sensor drift compensation using classifier ensembles, Sensors and Actuators B: Chemical, vol. 166-167, pp. 320-329, May 2012" studies how to improve the recognition accuracy of gases such as ethanol under drift, and designs 8 different features.
Once the classification algorithm is fixed, the recognition accuracy for a gas depends only on the quality of the features. Compared with the raw frequency response values, well-designed features can greatly reduce dimensional redundancy while accentuating the differences between classes, and generally yield better recognition accuracy.
However, manually designed features are usually tied to specific applications (gas type, sensor type, external environment, and so on), so they are highly task-specific and generalize poorly. Moreover, because of sensor cross-sensitivity, the extracted features may still be very high-dimensional, so an efficient dimensionality reduction algorithm, such as PCA or LDA, usually has to be found. If the recognition accuracy of existing features is unsatisfactory in a new application, better features must be designed, which further increases the complexity of the task.
The most effective way to suppress drift is periodic recalibration. The general idea is to find a linear transformation that normalizes the sensor response so that the classifier can be applied directly to the transformed data.
CN1514239A discloses a method for detecting and correcting gas sensor drift. By combining principal component analysis with the wavelet transform, the method improves the sensitivity and accuracy of drift detection. A detected drifting sensor is corrected online using a correction method based on an adaptive drift model, and the drift model can be updated online, improving the reliability of the sensor system and extending its service life.
In the literature "T.I.P.M.Holmberg, "Drift correction for gas sensors using multivariable methods," j.chemim., vol.14, No. 5-6, pp.711-723,2000, "approximate the Drift direction with a reference gas, and then modify the response of the gas to be analyzed as follows.
However, these methods assume that the sensor's drift law is linear, which has not been proven, and they often require a reference gas whose chemical properties are stable over time and whose sensor behavior is highly similar to that of the gas to be analyzed; this is quite demanding in practice. In addition, these methods are complicated and inefficient in practical applications.
Disclosure of Invention
The aim of the invention is to reduce the complexity of traditional gas detection and identification methods and to develop a gas detection and identification method that is simpler, more efficient, and more robust to sensor drift.
The method takes the raw frequency response signal, applies a simple normalization, and feeds the result into a stacked autoencoder network; through layer-by-layer extraction, the network learns an abstract representation of the original data, hiding steps such as feature extraction, dimensionality reduction, and drift suppression from the outside, while a classification layer appended to the end of the network allows these features to be passed directly to a classifier. Training is divided into two phases, pre-training and fine-tuning, which effectively improves the learning capacity of the network; once training is complete, a new sample fed into the network directly yields the predicted class.
The technical scheme of the invention is as follows: a medical gas identification method based on deep learning comprises the following specific steps:
Step 1. Data normalization. Suppose there are m samples, each organized as v = [s_1, s_2, ..., s_t], where s_i is the i-th frequency response value, for a total of t response values. The entire gas data set and the corresponding labels can then be represented as:

$$V = [v_1^T, v_2^T, \ldots, v_i^T, \ldots, v_m^T]$$

$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_m]^T$$

where T denotes the transpose of a vector, the i-th column v_i of the matrix V represents the i-th sample, and the i-th element of Y is the class label of the corresponding sample.

Normalize the data set to [0,1] using

$$V_{i,j} = L + (U - L)\,\frac{V_{i,j} - \min_i}{\max_i - \min_i}$$

where V_{i,j} denotes the i-th frequency response value of the j-th sample, L is the normalization lower bound 0, U is the normalization upper bound 1, and max_i and min_i are the maximum and minimum of each row of the matrix. The normalized data set is denoted a^{(0)}.
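As an illustration, the following is a minimal sketch of this per-row min-max normalization in Python/NumPy. The matrix shape (t rows of response values, one column per sample) and the small eps guard against division by zero are assumptions of the sketch, not part of the patented method.

```python
import numpy as np

def normalize(V, L=0.0, U=1.0, eps=1e-12):
    """Min-max normalize each row of V (t x m, one column per sample) to [L, U]."""
    row_min = V.min(axis=1, keepdims=True)   # min_i of each row
    row_max = V.max(axis=1, keepdims=True)   # max_i of each row
    return L + (U - L) * (V - row_min) / (row_max - row_min + eps)

# a0 = normalize(V)  # a(0): the normalized data set fed to the first autoencoder
```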
Step 2. Pre-train the stacked autoencoder network, where v, h, and y denote the input layer, hidden layer, and output layer respectively, W^{(i)} is the weight matrix connecting adjacent layers, and b^{(i)} is the bias vector of the hidden layer.
Step 2.1. Train the first layer, i.e., the first autoencoder, whose objective function is:

$$J = \frac{1}{2m}\sum_i \lVert v_i - \hat{v}_i\rVert^2 + \frac{\lambda}{2}\sum_i\sum_j W_{ij}^2 + \beta\sum_j\Big[\rho\log\frac{\rho}{p_j} + (1-\rho)\log\frac{1-\rho}{1-p_j}\Big]$$

The first term is the reconstruction error, measuring the difference between input and output, where v_i is the i-th input sample normalized in step 1 and \hat{v}_i is the output of the network's output layer for sample v_i. The second term, the weight decay term, shrinks the magnitude of the weights to prevent overfitting, where W_{ij} is the weight between unit j of the current layer and unit i of the next layer. The third term is a sparsity penalty, where p_j is the average activation of hidden unit j; λ, β, and ρ are preset parameters.
The objective function is optimized as follows; for an n-layer stack of autoencoders, the specific optimization steps are:

Step 2.1.1. Randomly initialize the parameters W^{(i)}, b^{(i)}, and initialize the gradient accumulators as all-zero matrices and vectors, i.e., ΔW^{(i)} = 0, Δb^{(i)} = 0.

Step 2.1.2. For each sample, compute the partial derivatives ∇_{W^{(i)}} J and ∇_{b^{(i)}} J with the backpropagation algorithm, as follows:

Feedforward to obtain each layer's activation a^{(i)} via a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}), where σ(x) = 1/(1 + e^{-x}) is the sigmoid function, with output range [0,1].

For the output layer, compute the residual δ^{(n)} = -(v - a^{(n)}) · σ'(z^{(n)}), where "·" denotes the element-wise product, z^{(n)} = W^{(n-1)} a^{(n-1)} + b^{(n-1)}, and σ' denotes the derivative of σ(x).

For each layer l = n-1, n-2, ..., 2, compute δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · σ'(z^{(l)}).

Compute the partial derivatives ∇_{W^{(l)}} J = δ^{(l+1)} (a^{(l)})^T and ∇_{b^{(l)}} J = δ^{(l+1)}, where ∇_{W^{(i)}} J denotes the partial derivative of J(W,b) with respect to W^{(i)} and ∇_{b^{(i)}} J the partial derivative with respect to b^{(i)}.

Step 2.1.3. Add the obtained partial derivatives to ΔW^{(i)} and Δb^{(i)} respectively, i.e., ΔW^{(i)} = ΔW^{(i)} + ∇_{W^{(i)}} J, Δb^{(i)} = Δb^{(i)} + ∇_{b^{(i)}} J.

Step 2.1.4. Update the parameters W^{(i)} = W^{(i)} - α(ΔW^{(i)}/m + λW^{(i)}) and b^{(i)} = b^{(i)} - α Δb^{(i)}/m, where α is the learning rate.

Step 2.1.5. Repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function until it reaches a set threshold, obtaining the encoding-layer parameters (W, b) and the decoding-layer parameters (W̃, b̃).
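To make steps 2.1.1 to 2.1.5 concrete, here is a hedged sketch of training one sparse autoencoder by batch gradient descent in Python/NumPy. The layer sizes, learning rate, iteration count, and random seed are illustrative assumptions, and the sparsity penalty enters the hidden-layer residual in the standard way; this is a sketch of the procedure described above, not a verbatim reference implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(A0, n_hidden, lam=1e-4, beta=3.0, rho=0.05,
                      alpha=0.5, n_iter=400):
    """Train one sparse autoencoder on A0 (t x m); return the encoder (W, b)."""
    t, m = A0.shape
    rng = np.random.default_rng(0)
    W1 = rng.normal(0, 0.01, (n_hidden, t)); b1 = np.zeros((n_hidden, 1))
    W2 = rng.normal(0, 0.01, (t, n_hidden)); b2 = np.zeros((t, 1))  # decoder, discarded later
    for _ in range(n_iter):
        # feedforward: a(1) = sigma(W1 a(0) + b1), reconstruction v^ = sigma(W2 a(1) + b2)
        A1 = sigmoid(W1 @ A0 + b1)
        V_hat = sigmoid(W2 @ A1 + b2)
        p = A1.mean(axis=1, keepdims=True)          # average activation p_j of each hidden unit
        # output-layer residual: delta(n) = -(v - v^) . sigma'(z(n))
        d2 = -(A0 - V_hat) * V_hat * (1 - V_hat)
        # hidden-layer residual, with the sparsity penalty folded in
        sparse = beta * (-(rho / p) + (1 - rho) / (1 - p))
        d1 = (W2.T @ d2 + sparse) * A1 * (1 - A1)
        # gradient-descent update with weight decay (step 2.1.4)
        W2 -= alpha * (d2 @ A1.T / m + lam * W2); b2 -= alpha * d2.mean(axis=1, keepdims=True)
        W1 -= alpha * (d1 @ A0.T / m + lam * W1); b1 -= alpha * d1.mean(axis=1, keepdims=True)
    return W1, b1   # encoding-layer parameters; the decoding layer (W2, b2) is discarded
```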
Step 2.2. After training, discard the decoding layer (W̃, b̃) and take the encoding-layer parameters (W, b) as the initial parameters of the corresponding layer in the stacked autoencoder network, i.e., W^{(1)} = W, b^{(1)} = b.

Step 2.3. Compute the hidden-layer activation of the current autoencoder: a^{(1)} = σ(W^{(1)} a^{(0)} + b^{(1)}).

Step 2.4. Train the second layer, i.e., the second autoencoder, on the activation a^{(1)}: the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is the same as for the first layer except that the input becomes a^{(1)}. After training, this yields the initial parameters W^{(2)}, b^{(2)} of the second network layer and the hidden-layer activation a^{(2)}.

Step 2.5. For the third through n-th layers, repeat steps 2.1 to 2.4 to obtain the initial parameters of each hidden layer, and finally the activation a^{(n)} of the n-th hidden layer; this activation also serves as the input of the softmax layer and is denoted a_S.

Step 2.6. Using the a_S obtained in step 2.5 and the labels Y, train the last layer of the network, the softmax classifier, to obtain its initial parameter W_S.
Denote a_S by x and W_S by θ, and suppose there are k classes in total. For the i-th sample, the probability that the predicted class label is j is:

$$P(y_i = j \mid x_i; \theta) = \frac{\exp(\theta_j^T x_i)}{\sum_{l=1}^{k} \exp(\theta_l^T x_i)}$$

where θ_j^T denotes the j-th row of θ, a row vector of the weights connecting the j-th output unit to all input units; l is an index variable with 1 ≤ l ≤ k; k is the number of classes; and x_i is the softmax-layer input for the i-th sample. The final output is a probability column vector P whose j-th component is the probability that the sample is predicted to belong to class j. The weight matrix θ is trained by minimizing the loss function:

$$J(\theta) = -\frac{1}{m}\Big[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y_i = j\} \log P(y_i = j)\Big] + \frac{\lambda}{2}\sum_{i=1}^{m}\sum_{j=1}^{n} \theta_{ij}^2$$

where log P(y_i = j) is the natural logarithm of the probability value P(y_i = j); 1{·} is the indicator function, equal to 1 when the condition in braces holds and 0 otherwise; m is the number of samples; and n is the number of autoencoder layers.
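A brief sketch of the probability and loss computations above, assuming labels are coded as integers 0..k-1 and θ is stored as a k × dim matrix (one row per class); the max-subtraction for numerical stability is a standard implementation detail, not part of the formula itself.

```python
import numpy as np

def softmax_probs(theta, X):
    """P(y = j | x) for each column of X (dim x m); returns a k x m matrix."""
    Z = theta @ X                       # k x m matrix of theta_j^T x_i
    Z -= Z.max(axis=0, keepdims=True)   # subtract column max for numerical stability
    E = np.exp(Z)
    return E / E.sum(axis=0, keepdims=True)

def softmax_loss(theta, X, y, lam=1e-4):
    """Negative log-likelihood with weight decay (the J(theta) above)."""
    m = X.shape[1]
    P = softmax_probs(theta, X)
    logp = np.log(P[y, np.arange(m)])   # log P(y_i = j) picked out by the true labels
    return -logp.mean() + 0.5 * lam * np.sum(theta ** 2)
```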
Step 3. Fine-tuning. Treat the network as a whole: compute the partial derivatives of each layer's parameters by backpropagation, then optimize iteratively by gradient descent. The specific process is as follows:

Step 3.1. Feedforward with a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}) to obtain each layer's activation a^{(i)}.

Step 3.2. Compute the partial derivative ∇_{W_S} J of the loss with respect to the softmax-layer parameter W_S, using the gradient of the loss function in step 2.6, where P is the conditional probability vector computed in step 2.6.

Step 3.3. Compute the residual of the last hidden layer: δ^{(n)} = (∂J/∂a^{(n)}) · σ'(z^{(n)}), where ∂J/∂a^{(n)} denotes the partial derivative of J(W,b) with respect to a^{(n)}, and a^{(n)} is the activation of the n-th hidden layer.

Step 3.4. For each layer l = n-1, n-2, ..., 2, compute δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · σ'(z^{(l)}).

Step 3.5. Compute the partial derivatives ∇_{W^{(i)}} J and ∇_{b^{(i)}} J of all hidden layers, as in step 2.1.2.

Step 3.6. Update the parameters of each layer with the partial derivatives obtained above:

$$W_S' = W_S - \alpha\Big(\frac{1}{m}\sum \nabla_{W_S} J + \lambda\theta\Big),\qquad W^{(i)\prime} = W^{(i)} - \alpha\,\frac{1}{m}\sum \nabla_{W^{(i)}} J,\qquad b^{(i)\prime} = b^{(i)} - \alpha\,\frac{1}{m}\sum \nabla_{b^{(i)}} J$$

where ∇_{W_S} J denotes the partial derivative of J(W,b) with respect to W_S (W_S being the initial parameter of the softmax classifier), ∇_{W^{(i)}} J the partial derivative with respect to W^{(i)}, and ∇_{b^{(i)}} J the partial derivative with respect to b^{(i)}.

Step 3.7. Repeat the above steps, reducing the value of the objective function by iteration until it reaches a set threshold.
Step 4. Predict the class of a new sample, as follows:

Step 4.1. Normalize the prediction sample v_p to [0,1].

Step 4.2. For the hidden layers, feedforward layer by layer with a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}) to obtain the softmax-layer input a_S.

Step 4.3. Compute the conditional probability vector P with the probability formula of step 2.6; the class corresponding to its largest component is the predicted class of the sample.

In the above formulas, i and j are integer indices.
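Putting step 4 together, a sketch of the prediction pass: `weights`, `biases`, and `Ws` stand for the parameters produced by steps 2 and 3, `softmax_probs` is the helper sketched above, and reusing the training-set row minima and maxima for normalization is an assumption of the sketch.

```python
import numpy as np

def predict(v_p, weights, biases, Ws, row_min, row_max):
    """Predict class labels for test samples v_p (t x m_test)."""
    # step 4.1: normalize with the training-set row minima/maxima
    a = (v_p - row_min) / (row_max - row_min + 1e-12)
    # step 4.2: layer-by-layer feedforward through the hidden layers
    for W, b in zip(weights, biases):
        a = 1.0 / (1.0 + np.exp(-(W @ a + b)))
    # step 4.3: conditional probabilities; the largest component gives the class
    P = softmax_probs(Ws, a)
    return P.argmax(axis=0)
```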
The invention has the following beneficial effects. The invention designs a network structure suited to medical gas signal processing and extracts features from the input sample layer by layer, so the feature that finally enters the classification layer has low dimensionality and good robustness to drift. Compared with traditional feature extraction, the method automatically extracts features that effectively discriminate between medical gases and merges feature extraction, feature selection, drift suppression, and related steps into one, greatly simplifying traditional methods and improving the efficiency of gas detection and identification. Specifically:

First, apart from training the softmax classifier in step 2, no class labels are needed, so feature extraction is unsupervised; when labeled samples are scarce, a large number of unlabeled samples can be used to train all layers before the classification layer, with a small number of labeled samples used for the final fine-tuning.

Second, as the network structure shows, each layer has fewer units than the one before, so the input that finally reaches the classifier has a dimension far smaller than the original input; this can be regarded as dimensionality reduction.

Third, feature extraction is fully automatic, with no manual intervention, eliminating the complexity of hand-crafted feature design and giving the method wide applicability.

Fourth, the extracted features are robust to drift, effectively improving gas detection and identification accuracy under drift and extending the service life of the device.
Drawings
Fig. 1 is a flow chart of a medical gas identification method according to an embodiment of the invention.
Fig. 2 is the stacked autoencoder network for medical gas identification according to an embodiment of the present invention.
Fig. 3 is a diagram of a single-hidden-layer autoencoder according to an embodiment of the present invention.
Detailed Description
The embodiments of the present invention will be further described with reference to the accompanying drawings.
The general flow of the gas identification method of the present invention is shown in Fig. 1:
Step 1. Data normalization. Suppose there are m samples, each organized as v = [s_1, s_2, ..., s_t], where s_i is the i-th frequency response value, for a total of t response values. The entire gas data set and the corresponding labels can then be represented as:

$$V = [v_1^T, v_2^T, \ldots, v_i^T, \ldots, v_m^T]$$

$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_m]^T$$

where T denotes the transpose of a vector, the i-th column v_i of the matrix V represents the i-th sample, and the i-th element of Y is the class label of the corresponding sample.

Normalize the data set to [0,1] using

$$V_{i,j} = L + (U - L)\,\frac{V_{i,j} - \min_i}{\max_i - \min_i}$$

where V_{i,j} denotes the i-th frequency response value of the j-th sample, L is the normalization lower bound 0, U is the normalization upper bound 1, and max_i and min_i are the maximum and minimum of each row of the matrix. The normalized data set is denoted a^{(0)}.
Step 2. Pre-train the stacked autoencoder network, where v, h, and y denote the input layer, hidden layer, and output layer respectively, W^{(i)} is the weight matrix connecting adjacent layers, and b^{(i)} is the bias vector of the hidden layer.
The present invention uses a network structure similar to that shown in Fig. 2; the number of layers and the number of units per layer can be changed for different tasks, and the shapes of the corresponding parameters change accordingly.

Such networks are often very deep, have many parameters, and are difficult to train directly, so a pre-training method is first adopted to train the parameters layer by layer. Compared with random initialization, pre-training places each layer's parameters at a better position in the parameter space.
Apart from the softmax layer used for classification, the rest of the network can be seen as a stack of single-hidden-layer autoencoders, with the output of each layer connected to the input of the next. Such an autoencoder obtains the activations of its hidden units by reconstructing the input (reconstruction denoted by the symbol ^) and uses them as features of the original input, as shown in Fig. 3; a small sketch of this structure follows.
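A minimal sketch of the single autoencoder of Fig. 3, under the same sigmoid assumption as the rest of the method; the function and variable names are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_forward(v, W, b, W_dec, b_dec):
    """Encode input v to the hidden activation h, then reconstruct it as v_hat."""
    h = sigmoid(W @ v + b)               # hidden activation: the learned feature
    v_hat = sigmoid(W_dec @ h + b_dec)   # reconstruction (the ^ output) of the input
    return h, v_hat
```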
After each autoencoder is trained, only the encoding-layer parameters W and b are retained as the initial parameters of the corresponding layer of the stacked autoencoder network. The specific process is as follows:
Step 2.1. Train the first layer, i.e., the first autoencoder, whose objective function is:

$$J = \frac{1}{2m}\sum_i \lVert v_i - \hat{v}_i\rVert^2 + \frac{\lambda}{2}\sum_i\sum_j W_{ij}^2 + \beta\sum_j\Big[\rho\log\frac{\rho}{p_j} + (1-\rho)\log\frac{1-\rho}{1-p_j}\Big]$$

The first term is the reconstruction error, measuring the difference between input and output, where v_i is the i-th input sample normalized in step 1 and \hat{v}_i is the output of the network's output layer for sample v_i. The second term, the weight decay term, shrinks the magnitude of the weights to prevent overfitting, where W_{ij} is the weight between unit j of the current layer and unit i of the next layer. The third term is a sparsity penalty, where p_j is the average activation of hidden unit j and λ, β, and ρ are preset parameters; it drives the average activation of every hidden unit toward the small value ρ, so that only a few hidden units are activated at a time.
The objective function is optimized by gradient descent; the partial derivatives of the parameters required at each iteration are computed by the backpropagation algorithm.
Optimizing an objective function, wherein for an automatic coding machine with n layers, the specific optimization steps are as follows:
step 2.1.1. random initialization parameter W(i)、b(i)Initializing a matrix or vector of all zeros, i.e. AW(i)=0,Δb(i)=0;
Step 2.1.2. orderFor each sample, the partial derivatives are calculated using a back propagation algorithmThe specific process is as follows:
feedforward calculation to obtain each layer excitation a(i)The calculation formula is a(i)=σ(W(i)a(i-1)+b(i)) Whereinis sigmoid function with output range of [0,1 ]];
For the output layer, the residual is calculated:(n)=-(v-a(n))·σ′(z(n)) Wherein "·" represents a vector dot product, wherein z is(n)=W(n-1)a(n-1)+b(n-1)σ' denotes the derivative of σ (x);
for each layer of l ═ n-1, n-2,.., 2, the following is calculated:
calculating a partial derivative value:
whereinAndboth represent J (W, b) to W(i)The partial derivatives of (a) are,andall represent J (W, b) to b(i)The partial derivatives of (1).
Step 2.1.3, respectively adding the obtained partial derivatives to delta W(i),Δb(i)To do so, i.e.
Step 2.1.4, update parameter W(i),b(i)Where α is the learning rate.
Step 2.1.5. repeat stepStep 2.1.2 to step 2.1.4, the value of the objective function is gradually decreased until a set threshold value. At this time, parameters (W, b) of the coding layer and parameters of the decoding layer can be obtained
Step 2.2. After training, discard the decoding layer (W̃, b̃) and take the encoding-layer parameters (W, b) as the initial parameters of the corresponding layer in the stacked autoencoder network, i.e., W^{(1)} = W, b^{(1)} = b.

Step 2.3. Compute the hidden-layer activation of the current autoencoder: a^{(1)} = σ(W^{(1)} a^{(0)} + b^{(1)}).

Step 2.4. Train the second layer, i.e., the second autoencoder, on the activation a^{(1)}: the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is the same as for the first layer except that the input becomes a^{(1)}. After training, this yields the initial parameters W^{(2)}, b^{(2)} of the second network layer and the hidden-layer activation a^{(2)}.

Step 2.5. For the third through n-th layers, repeat steps 2.1 to 2.4 to obtain the initial parameters of each hidden layer, and finally the activation a^{(n)} of the n-th hidden layer; this activation also serves as the input of the softmax layer and is denoted a_S. In this embodiment, the autoencoder stack has 3 layers.

Step 2.6. Using the a_S obtained in step 2.5 and the labels Y, train the last layer of the network, the softmax classifier, to obtain its initial parameter W_S.
Softmax regression is a generalization of logistic regression to multi-class problems. For convenience, denote a_S by x and W_S by θ, and suppose there are k classes in total. For the i-th sample, the probability that the predicted class label is j is:

$$P(y_i = j \mid x_i; \theta) = \frac{\exp(\theta_j^T x_i)}{\sum_{l=1}^{k} \exp(\theta_l^T x_i)}$$

where θ_j^T denotes the j-th row of θ, a row vector of the weights connecting the j-th output unit to all input units; l is an index variable with 1 ≤ l ≤ k; k is the number of classes; and x_i is the softmax-layer input for the i-th sample. The final output is a probability column vector P whose j-th component is the probability that the sample is predicted to belong to class j. The weight matrix θ is trained by minimizing the loss function:

$$J(\theta) = -\frac{1}{m}\Big[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y_i = j\} \log P(y_i = j)\Big] + \frac{\lambda}{2}\sum_{i=1}^{m}\sum_{j=1}^{n} \theta_{ij}^2$$

where log P(y_i = j) is the natural logarithm of the probability value; 1{·} is the indicator function, equal to 1 when the condition in braces holds and 0 otherwise. The loss function is strictly convex, so a global optimum can be obtained with an optimization algorithm such as gradient descent or L-BFGS; m is the number of samples and n is the number of autoencoder layers.

Here all parameters connecting two layers are weight matrices; W_S, i.e., θ, is the weight matrix connecting the last two layers.
In this embodiment, the specific process of training the weight matrix θ by minimizing the loss function is as follows:

Step 2.6.1. Randomly initialize the parameter matrix θ.

Step 2.6.2. Compute the derivative of J(θ) directly, where θ_j denotes the j-th row of the matrix:

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\big[x_i\big(1\{y_i = j\} - P(y_i = j \mid x_i; \theta)\big)\big] + \lambda\theta_j$$

Step 2.6.3. Update the parameter θ: θ_j = θ_j - α ∇_{θ_j} J(θ), where α is the learning rate and ∇_{θ_j} J(θ) denotes the partial derivative of J(θ) with respect to θ_j.

Step 2.6.4. Repeat steps 2.6.2 to 2.6.3, gradually reducing J(θ) until it reaches a set threshold; the resulting θ is the final weight matrix, i.e., W_S.
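A sketch of steps 2.6.1 to 2.6.4 as batch gradient descent, reusing the `softmax_probs` helper above; the learning rate, iteration count, and random seed are illustrative assumptions rather than values prescribed by the method.

```python
import numpy as np

def train_softmax(X, y, k, lam=1e-4, alpha=0.5, n_iter=500):
    """Steps 2.6.1-2.6.4: batch gradient descent on J(theta)."""
    dim, m = X.shape
    theta = np.random.default_rng(0).normal(0, 0.01, (k, dim))  # 2.6.1 random init
    Y1 = np.zeros((k, m)); Y1[y, np.arange(m)] = 1.0            # 1{y_i = j} as a matrix
    for _ in range(n_iter):
        P = softmax_probs(theta, X)                   # 2.6.2 class probabilities
        grad = -(Y1 - P) @ X.T / m + lam * theta      # gradient of J(theta)
        theta -= alpha * grad                         # 2.6.3 update
    return theta                                      # Ws, the softmax weight matrix
```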
Step 3. Fine-tuning: treat the network as a whole, compute the partial derivatives of each layer's parameters by backpropagation, and then optimize iteratively by gradient descent.

After pre-training, the initial parameters of every layer of the network are determined; all parameters are then fine-tuned once to improve the classification ability of the network. Fine-tuning treats the network as a whole: the partial derivative of each layer's parameters is computed by backpropagation and then iteratively optimized by gradient descent. The network's goal is no longer reconstruction, so the objective function is that of the softmax layer; the softmax layer is treated as an additional layer and handled separately, and the optimization of each hidden layer is essentially the same as described in step 2.1.

The specific process is as follows:

Step 3.1. Feedforward with a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}) to obtain each layer's activation a^{(i)}.

Step 3.2. Compute the partial derivative ∇_{W_S} J of the loss with respect to the softmax-layer parameter W_S, using the gradient of the loss function in step 2.6, where P is the conditional probability vector computed in step 2.6.

Step 3.3. Compute the residual of the last hidden layer: δ^{(n)} = (∂J/∂a^{(n)}) · σ'(z^{(n)}), where ∂J/∂a^{(n)} denotes the partial derivative of J(W,b) with respect to a^{(n)}, and a^{(n)} is the activation of the n-th hidden layer.

Step 3.4. For each layer l = n-1, n-2, ..., 2, compute δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · σ'(z^{(l)}).

Step 3.5. Compute the partial derivatives ∇_{W^{(i)}} J and ∇_{b^{(i)}} J of all hidden layers, as in step 2.1.2.

Step 3.6. Update the parameters of each layer with the partial derivatives obtained above:

$$W_S' = W_S - \alpha\Big(\frac{1}{m}\sum \nabla_{W_S} J + \lambda\theta\Big),\qquad W^{(i)\prime} = W^{(i)} - \alpha\,\frac{1}{m}\sum \nabla_{W^{(i)}} J,\qquad b^{(i)\prime} = b^{(i)} - \alpha\,\frac{1}{m}\sum \nabla_{b^{(i)}} J$$

where ∇_{W_S} J denotes the partial derivative of J(W,b) with respect to W_S (W_S being the initial parameter of the softmax classifier), ∇_{W^{(i)}} J the partial derivative with respect to W^{(i)}, and ∇_{b^{(i)}} J the partial derivative with respect to b^{(i)}.

Step 3.7. Repeat the above steps, reducing the value of the objective function by iteration until it reaches a set threshold.
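As an illustration, one fine-tuning pass over the whole stack (steps 3.1 to 3.6) might look like the following sketch. It reuses the helpers above; using the standard softmax cross-entropy gradient for the residual entering the last hidden layer is an assumption consistent with step 2.6, and weight decay is applied only to W_S, matching the update formulas in step 3.6.

```python
import numpy as np

def finetune_step(A0, y, weights, biases, Ws, alpha=0.1, lam=1e-4):
    """One pass of steps 3.1-3.6 over the batch A0 (t x m); updates in place."""
    m = A0.shape[1]
    acts = [A0]
    for W, b in zip(weights, biases):              # 3.1 feedforward
        acts.append(1.0 / (1.0 + np.exp(-(W @ acts[-1] + b))))
    aS = acts[-1]
    P = softmax_probs(Ws, aS)
    Y1 = np.zeros_like(P); Y1[y, np.arange(m)] = 1.0
    gWs = (P - Y1) @ aS.T / m + lam * Ws           # 3.2 softmax-layer gradient
    d = (Ws.T @ (P - Y1)) * aS * (1 - aS)          # 3.3 residual of last hidden layer
    for l in range(len(weights) - 1, -1, -1):      # 3.4-3.6 backprop and update
        gW = d @ acts[l].T / m
        gb = d.mean(axis=1, keepdims=True)
        if l > 0:                                  # propagate the residual downward
            d = (weights[l].T @ d) * acts[l] * (1 - acts[l])
        weights[l] -= alpha * gW
        biases[l]  -= alpha * gb
    Ws -= alpha * gWs
    return Ws
```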
Step 4. Predict the class of a new sample, as follows:

Step 4.1. Normalize the prediction sample v_p to [0,1].

Step 4.2. For the hidden layers, feedforward layer by layer with a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}) to obtain the softmax-layer input a_S.

Step 4.3. Compute the conditional probability vector P with the probability formula of step 2.6; the class corresponding to its largest component is the predicted class of the sample.

In the above formulas, i and j are integer indices.
The core of the invention is to design a network structure suited to medical gas signal processing and to process patient gas data sampled by an electronic nose with deep learning, automatically extracting features that are more universal and more robust to sensor drift, and thereby completing the gas detection and identification task simply and effectively. The invention has great application value in medical settings that demand both accuracy and real-time performance.

Claims (2)

1. A medical gas identification method based on deep learning, comprising the following specific steps:
step 1, data normalization: suppose there are m samples, each organized as v = [s_1, s_2, ..., s_t], where s_i is the i-th frequency response value, for a total of t response values; the entire gas data set and the corresponding labels can be represented as:

$$V = [v_1^T, v_2^T, \ldots, v_i^T, \ldots, v_m^T]$$

$$Y = [y_1, y_2, \ldots, y_i, \ldots, y_m]^T$$

where T denotes the transpose of a vector, the i-th column v_i of the matrix V represents the i-th sample, and the i-th element of Y is the class label of the corresponding sample;

normalize the data set to [0,1] using

$$V_{i,j} = L + (U - L)\,\frac{V_{i,j} - \min_i}{\max_i - \min_i}$$

where V_{i,j} denotes the i-th frequency response value of the j-th sample, L is the normalization lower bound 0, U is the normalization upper bound 1, and max_i and min_i are the maximum and minimum of each row of the matrix; the normalized data set is denoted a^{(0)};
step 2, pre-train the stacked autoencoder network, where v, h, and y denote the input layer, hidden layer, and output layer respectively, W^{(i)} is the weight matrix connecting adjacent layers, and b^{(i)} is the bias vector of the hidden layer;
step 2.1, train the first layer, i.e., the first autoencoder, whose objective function is:

$$J = \frac{1}{2m}\sum_i \lVert v_i - \hat{v}_i\rVert^2 + \frac{\lambda}{2}\sum_i\sum_j W_{ij}^2 + \beta\sum_j\Big[\rho\log\frac{\rho}{p_j} + (1-\rho)\log\frac{1-\rho}{1-p_j}\Big]$$

where the first term is the reconstruction error, measuring the difference between input and output, v_i being the i-th input sample normalized in step 1 and \hat{v}_i the output of the network's output layer for sample v_i; the second term, the weight decay term, shrinks the magnitude of the weights to prevent overfitting, W_{ij} being the weight between unit j of the current layer and unit i of the next layer; the third term is a sparsity penalty, p_j being the average activation of hidden unit j; λ, β, and ρ are preset parameters, m is the number of samples, and J is the objective function of the first autoencoder;
optimize the objective function; for an n-layer stack of autoencoders, the specific optimization steps are:

step 2.1.1, randomly initialize the parameters W^{(i)}, b^{(i)}, and initialize the gradient accumulators as all-zero matrices and vectors, i.e., ΔW^{(i)} = 0, Δb^{(i)} = 0;

step 2.1.2, for each sample, compute the partial derivatives ∇_{W^{(i)}} J and ∇_{b^{(i)}} J with the backpropagation algorithm, as follows:

feedforward to obtain each layer's activation a^{(i)} via a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}), where σ(x) = 1/(1 + e^{-x}) is the sigmoid function, with output range [0,1];

for the output layer, compute the residual δ^{(n)} = -(v - a^{(n)}) · σ'(z^{(n)}), where "·" denotes the element-wise product, z^{(n)} = W^{(n-1)} a^{(n-1)} + b^{(n-1)}, and σ' denotes the derivative of σ(x);

for each layer l = n-1, n-2, ..., 2, compute δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · σ'(z^{(l)});

compute the partial derivatives ∇_{W^{(l)}} J = δ^{(l+1)} (a^{(l)})^T and ∇_{b^{(l)}} J = δ^{(l+1)}, where ∇_{W^{(i)}} J denotes the partial derivative of J(W,b) with respect to W^{(i)} and ∇_{b^{(i)}} J the partial derivative with respect to b^{(i)};

step 2.1.3, add the obtained partial derivatives to ΔW^{(i)} and Δb^{(i)} respectively, i.e., ΔW^{(i)} = ΔW^{(i)} + ∇_{W^{(i)}} J, Δb^{(i)} = Δb^{(i)} + ∇_{b^{(i)}} J;

step 2.1.4, update the parameters W^{(i)} = W^{(i)} - α(ΔW^{(i)}/m + λW^{(i)}) and b^{(i)} = b^{(i)} - α Δb^{(i)}/m, where α is the learning rate;

step 2.1.5, repeat steps 2.1.2 to 2.1.4, gradually reducing the value of the objective function until it reaches a set threshold, obtaining the encoding-layer parameters (W, b) and the decoding-layer parameters (W̃, b̃);
step 2.2, after training, discard the decoding layer (W̃, b̃) and take the encoding-layer parameters (W, b) as the initial parameters of the corresponding layer in the stacked autoencoder network, i.e., W^{(1)} = W, b^{(1)} = b;

step 2.3, compute the hidden-layer activation of the current autoencoder: a^{(1)} = σ(W^{(1)} a^{(0)} + b^{(1)});

step 2.4, train the second layer, i.e., the second autoencoder, on the activation a^{(1)}: the hidden layer of the first autoencoder serves as the input layer of the second, and the training process is the same as for the first layer except that the input becomes a^{(1)}; after training, this yields the initial parameters W^{(2)}, b^{(2)} of the second network layer and the hidden-layer activation a^{(2)};

step 2.5, for the third through n-th layers, repeat steps 2.1 to 2.4 to obtain the initial parameters of each hidden layer, and finally the activation a^{(n)} of the n-th hidden layer; this activation also serves as the input of the softmax layer and is denoted a_S;

step 2.6, using the a_S obtained in step 2.5 and the labels Y, train the last layer of the network, the softmax classifier, to obtain its initial parameter W_S;
denote a_S by x and W_S by θ, and suppose there are k classes in total; for the i-th sample, the probability that the predicted class label is j is:

$$P(y_i = j \mid x_i; \theta) = \frac{\exp(\theta_j^T x_i)}{\sum_{l=1}^{k} \exp(\theta_l^T x_i)}$$

where θ_j^T denotes the j-th row of θ, a row vector of the weights connecting the j-th output unit to all input units, and k is the number of classes; the final output is a probability column vector P whose j-th component is the probability that the sample is predicted to belong to class j; the weight matrix θ is trained by minimizing the loss function:

$$J(\theta) = -\frac{1}{m}\Big[\sum_{i=1}^{m}\sum_{j=1}^{k} 1\{y_i = j\} \log P(y_i = j)\Big] + \frac{\lambda}{2}\sum_{i=1}^{m}\sum_{j=1}^{n} \theta_{ij}^2$$

where log P(y_i = j) is the natural logarithm of the probability value P(y_i = j); 1{·} is the indicator function, equal to 1 when the condition in braces holds and 0 otherwise; m is the number of samples, and n is the number of autoencoder layers;
step 3, fine-tuning: treat the network as a whole, compute the partial derivatives of each layer's parameters by backpropagation, and then optimize iteratively by gradient descent, as follows:

step 3.1, feedforward with a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}) to obtain each layer's activation a^{(i)};

step 3.2, compute the partial derivative ∇_{W_S} J of the loss with respect to the softmax-layer parameter W_S, using the gradient of the loss function in step 2.6, where P is the conditional probability vector computed in step 2.6;

step 3.3, compute the residual of the last hidden layer: δ^{(n)} = (∂J/∂a^{(n)}) · σ'(z^{(n)}), where ∂J/∂a^{(n)} denotes the partial derivative of J(W,b) with respect to a^{(n)}, and a^{(n)} is the activation of the n-th hidden layer;

step 3.4, for each layer l = n-1, n-2, ..., 2, compute δ^{(l)} = ((W^{(l)})^T δ^{(l+1)}) · σ'(z^{(l)});

step 3.5, compute the partial derivatives ∇_{W^{(i)}} J and ∇_{b^{(i)}} J of all hidden layers, as in step 2.1.2;

step 3.6, update the parameters of each layer with the partial derivatives obtained above:

$$W_S' = W_S - \alpha\Big(\frac{1}{m}\sum \nabla_{W_S} J + \lambda\theta\Big),\qquad W^{(i)\prime} = W^{(i)} - \alpha\,\frac{1}{m}\sum \nabla_{W^{(i)}} J,\qquad b^{(i)\prime} = b^{(i)} - \alpha\,\frac{1}{m}\sum \nabla_{b^{(i)}} J$$

where ∇_{W_S} J denotes the partial derivative of J(W,b) with respect to W_S, W_S being the initial parameter of the softmax classifier, ∇_{W^{(i)}} J the partial derivative with respect to W^{(i)}, and ∇_{b^{(i)}} J the partial derivative with respect to b^{(i)};

step 3.7, repeat the above steps, reducing the value of the objective function by iteration until it reaches a set threshold;
step 4, predict the class of a new sample, as follows:

step 4.1, normalize the prediction sample v_p to [0,1];

step 4.2, for the hidden layers, feedforward layer by layer with a^{(i)} = σ(W^{(i)} a^{(i-1)} + b^{(i)}) to obtain the softmax-layer input a_S;

step 4.3, compute the conditional probability vector P with the probability formula of step 2.6; the class corresponding to its largest component is the predicted class of the sample.
2. The deep-learning-based medical gas identification method according to claim 1, wherein the process of training the weight matrix θ in step 2.6 by minimizing the loss function is as follows:

step 2.6.1, randomly initialize the parameter matrix θ;

step 2.6.2, compute the derivative of J(θ) directly, where θ_j denotes the j-th row of the matrix:

$$\nabla_{\theta_j} J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\big[x_i\big(1\{y_i = j\} - P(y_i = j \mid x_i; \theta)\big)\big] + \lambda\theta_j$$

step 2.6.3, update the parameter θ: θ_j = θ_j - α ∇_{θ_j} J(θ), where α is the learning rate and ∇_{θ_j} J(θ) denotes the partial derivative of J(θ) with respect to θ_j;

step 2.6.4, repeat steps 2.6.2 to 2.6.3, gradually reducing J(θ) until it reaches a set threshold; the resulting θ is the final weight matrix, i.e., W_S.
CN201310503402.9A 2013-10-23 2013-10-23 Medical gas identification method based on deep learning Expired - Fee Related CN103544392B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310503402.9A CN103544392B (en) 2013-10-23 2013-10-23 Medical gas identification method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310503402.9A CN103544392B (en) 2013-10-23 2013-10-23 Medical gas identification method based on deep learning

Publications (2)

Publication Number Publication Date
CN103544392A CN103544392A (en) 2014-01-29
CN103544392B (en) 2016-08-24

Family

ID=49967837

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310503402.9A Expired - Fee Related CN103544392B (en) 2013-10-23 2013-10-23 Medical gas identification method based on deep learning

Country Status (1)

Country Link
CN (1) CN103544392B (en)

Families Citing this family (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103996056B (en) * 2014-04-08 2017-05-24 浙江工业大学 Tattoo image classification method based on deep learning
CN104021224A (en) * 2014-06-25 2014-09-03 中国科学院自动化研究所 Image labeling method based on layer-by-layer label fusing deep network
CN104484684B (en) * 2015-01-05 2018-11-02 苏州大学 A kind of Manuscripted Characters Identification Method and system
CN105844331B (en) * 2015-01-15 2018-05-25 富士通株式会社 The training method of nerve network system and the nerve network system
CN104866727A (en) * 2015-06-02 2015-08-26 陈宽 Deep learning-based method for analyzing medical data and intelligent analyzer thereof
CN105913079B (en) * 2016-04-08 2019-04-23 重庆大学 Electronic nose isomeric data recognition methods based on the study of the target domain migration limit
CN106202054B (en) * 2016-07-25 2018-12-14 哈尔滨工业大学 A kind of name entity recognition method towards medical field based on deep learning
CN106264460B (en) * 2016-07-29 2019-11-19 北京医拍智能科技有限公司 The coding/decoding method and device of cerebration multidimensional time-series signal based on self study
CN106156530A (en) * 2016-08-03 2016-11-23 北京好运到信息科技有限公司 Health check-up data analysing method based on stack own coding device and device
CN108122035B (en) * 2016-11-29 2019-10-18 科大讯飞股份有限公司 End-to-end modeling method and system
CN107368670A (en) * 2017-06-07 2017-11-21 万香波 Stomach cancer pathology diagnostic support system and method based on big data deep learning
CN107368671A (en) * 2017-06-07 2017-11-21 万香波 System and method are supported in benign gastritis pathological diagnosis based on big data deep learning
WO2019005611A1 (en) * 2017-06-26 2019-01-03 D5Ai Llc Selective training for decorrelation of errors
CN108416439B (en) * 2018-02-09 2020-01-03 中南大学 Oil refining process product prediction method and system based on variable weighted deep learning
CN109472303A (en) * 2018-10-30 2019-03-15 浙江工商大学 A kind of gas sensor drift compensation method based on autoencoder network decision
DE102019220455A1 (en) * 2019-12-20 2021-06-24 Robert Bosch Gesellschaft mit beschränkter Haftung Method and device for operating a gas sensor
CN111474297B (en) * 2020-03-09 2022-05-03 重庆邮电大学 Online drift compensation method for sensor in bionic olfaction system
CN111340132B (en) * 2020-03-10 2024-02-02 南京工业大学 Machine olfaction mode identification method based on DA-SVM
CN111915069B (en) * 2020-07-17 2021-12-07 天津理工大学 Deep learning-based detection method for distribution of lightweight toxic and harmful gases

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101135639A (en) * 2007-09-27 2008-03-05 中国人民解放军空军工程大学 Mixture gas component concentration infrared spectrum analysis method based on supporting vector quantities machine correct model
CN102411687A (en) * 2011-11-22 2012-04-11 华北电力大学 Deep learning detection method for unknown malicious codes
CN103267793A (en) * 2013-05-03 2013-08-28 浙江工商大学 Carbon nano-tube ionization self-resonance type gas sensitive sensor

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101042339B (en) * 2006-03-21 2012-05-30 深圳迈瑞生物医疗电子股份有限公司 Device for recognizing zone classification of anesthetic gas type and method thereof

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101135639A (en) * 2007-09-27 2008-03-05 中国人民解放军空军工程大学 Mixture gas component concentration infrared spectrum analysis method based on supporting vector quantities machine correct model
CN102411687A (en) * 2011-11-22 2012-04-11 华北电力大学 Deep learning detection method for unknown malicious codes
CN103267793A (en) * 2013-05-03 2013-08-28 浙江工商大学 Carbon nano-tube ionization self-resonance type gas sensitive sensor

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
A comparative study of SVM and BP algorithms in gas identification; Wang Dan et al.; Chinese Journal of Sensors and Actuators; 2005-03-26; vol. 18, no. 1; full text *
Gas identification based on support vector machines and wavelet decomposition; Ge Haifeng; Chinese Journal of Scientific Instrument; 2006-06-30; vol. 27, no. 6; full text *
Gas identification based on the support vector machine algorithm; Wang Dan; Sensor Technology; 2005-02-20; vol. 24, no. 2; full text *

Also Published As

Publication number Publication date
CN103544392A (en) 2014-01-29

Similar Documents

Publication Publication Date Title
CN103544392B (en) Medical gas identification method based on deep learning
CN108231201B (en) Construction method, system and application method of disease data analysis processing model
CN111443165B (en) Odor identification method based on gas sensor and deep learning
CN109543763B (en) Raman spectrum analysis method based on convolutional neural network
CN103412003B (en) Gas detection method based on self-adaption of semi-supervised domain
Wang et al. Research on Healthy Anomaly Detection Model Based on Deep Learning from Multiple Time‐Series Physiological Signals
CN110298264B (en) Human body daily behavior activity recognition optimization method based on stacked noise reduction self-encoder
CN113673346B (en) Motor vibration data processing and state identification method based on multiscale SE-Resnet
CN112699960A (en) Semi-supervised classification method and equipment based on deep learning and storage medium
CN110895705B (en) Abnormal sample detection device, training device and training method thereof
CN111340132B (en) Machine olfaction mode identification method based on DA-SVM
Gokhale et al. GeneViT: Gene vision transformer with improved DeepInsight for cancer classification
CN110880369A (en) Gas marker detection method based on radial basis function neural network and application
CN114372493B (en) Computer cable electromagnetic leakage characteristic analysis method
WO2020247200A1 (en) Likelihood ratios for out-of-distribution detection
CN115659174A (en) Multi-sensor fault diagnosis method, medium and equipment based on graph regularization CNN-BilSTM
CN111105877A (en) Chronic disease accurate intervention method and system based on deep belief network
CN117520914A (en) Single cell classification method, system, equipment and computer readable storage medium
Eluri et al. Cancer data classification by quantum-inspired immune clone optimization-based optimal feature selection using gene expression data: deep learning approach
Saha et al. Graph convolutional network-based approach for parkinson’s disease classification using euclidean distance graphs
Jia et al. Study on optimized Elman neural network classification algorithm based on PLS and CA
CN112580539A (en) Long-term drift suppression method for electronic nose signals based on PSVM-LSTM
CN116738330A (en) Semi-supervision domain self-adaptive electroencephalogram signal classification method
CN113673323B (en) Aquatic target identification method based on multi-deep learning model joint judgment system
CN114781484A (en) Cancer serum SERS spectrum classification method based on convolutional neural network

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160824

Termination date: 20171023

CF01 Termination of patent right due to non-payment of annual fee