
CN111612084A - An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers - Google Patents


Info

Publication number
CN111612084A
Authority
CN
China
Prior art keywords
nmf
layer
network
deep
classifier
Prior art date
Legal status
Pending
Application number
CN202010456563.7A
Other languages
Chinese (zh)
Inventor
张焱
郭京龙
黄庆卿
陈俊华
李帅永
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202010456563.7A
Publication of CN111612084A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2133 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on naturality criteria, e.g. with non-negative factorisation or negative correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks


Abstract

The invention relates to an optimization method for a deep non-negative matrix factorization (NMF) network with a classifier and belongs to the technical field of artificial intelligence. The method comprises model construction and parameter optimization. For model construction, multiple NMF layers and a classification layer are cascaded into a deep network: the decomposition result of each NMF layer serves as the input of the next NMF layer, and adjacent NMF layers are connected by a mapping function. For parameter optimization, the deep NMF network is first pre-trained layer by layer in an unsupervised fashion using multiplicative update rules; supervised global optimization then jointly tunes the weight parameters of all NMF layers and the Softmax classification layer based on the BP (back-propagation) algorithm. The trained and optimized deep NMF network is applied to test data to obtain classification outputs. The invention is suitable for applications involving classification and recognition tasks such as condition monitoring and diagnosis.

Description

An Optimization Method for a Deep Non-negative Matrix Factorization Network with a Classifier

Technical Field

The invention belongs to the technical field of artificial intelligence and relates to an optimization method for a deep non-negative matrix factorization network with a classifier.

Background Art

Non-negative matrix factorization (NMF) is a relatively novel matrix factorization approach. It requires all factored components to be non-negative (a purely additive description) while achieving nonlinear dimensionality reduction and sparse feature representation, and it has been applied successfully in image processing, computer vision, pattern recognition, biomedicine, signal processing, and other fields. However, the feature-representation capacity of a traditional single-layer network is limited. Deep learning, an important research direction in machine learning in recent years, improves classification and regression capability by stacking multiple abstraction layers with self-learning capacity: the input data are transformed stage by stage into progressively higher-level feature representations. Deep networks have achieved remarkable results in many classification and recognition tasks.
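For orientation, a minimal single-layer NMF in Python looks like the following sketch (illustrative only, using scikit-learn; the matrix sizes and rank are arbitrary and not part of the invention):

```python
# Minimal single-layer NMF: factor a non-negative matrix X (N x n) as X ~= W @ H.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((64, 200))                    # N = 64 features, n = 200 samples

model = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(X)                   # basis-vector matrix, shape (64, 10)
H = model.components_                        # low-dimensional features, shape (10, 200)
print(np.linalg.norm(X - W @ H, "fro"))      # purely additive reconstruction error
```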

Against this background, combining the NMF idea with deep learning promises the advantages of both: constructing a deep NMF network improves on the feature-extraction performance of single-layer NMF, while NMF's purely additive description gives the deep network an intuitive, interpretable hierarchical feature-learning process. Deep NMF networks have begun to attract attention (Guo Z, Zhang S. Sparse deep nonnegative matrix factorization. Big Data Mining and Analytics, 2020, 3(1): 13-28; Song H A, Kim B, Xuan T L, et al. Hierarchical feature extraction by multi-layer non-negative matrix factorization network for classification task. Neurocomputing, 2015, 165: 63-74; Cichocki A, Zdunek R. Multilayer nonnegative matrix factorization. Electronics Letters, 2006, 42(16): 947-948). However, existing deep NMF networks train and optimize only the NMF feature-extraction layers; they do not perform unified global optimization of the feature-extraction layers together with the classification layer. To address this problem, the invention proposes a multi-layer non-negative matrix factorization network with a classifier and a deep-learning structure, and designs a model-parameter optimization method that combines unsupervised layer-by-layer pre-training with supervised global optimization.

Summary of the Invention

In view of this, the purpose of the invention is to provide an optimization method for a deep non-negative matrix factorization network with a classifier. By analyzing non-negative matrix factorization and the basic structure of deep networks, a cascading scheme is used to construct a multi-layer NMF network with a classifier and a deep-learning structure, and a method combining unsupervised layer-by-layer pre-training with supervised global optimization realizes unified global optimization of the deep model's feature-extraction layers and classification layer.

To achieve the above object, the invention provides the following technical solution:

An optimization method for a deep non-negative matrix factorization network with a classifier, comprising the following steps:

S1: Input the original data; the training set is {X_k, G_k}, k = 1, 2, ..., n, where n is the number of training samples. Preprocess the model input data and denote the preprocessed data by X^(0), which serves as the input to the first NMF layer;

S2: Set the number of NMF layers in the deep network to L, with low-dimensional feature-space dimensions r^(1), r^(2), ..., r^(L) for the respective NMF layers. Cascade the NMF layers and the classification layer to construct a deep NMF network with a classifier: the decomposition result of each NMF layer serves as the input of the next NMF layer, and adjacent NMF layers are connected by a mapping function (a structural sketch follows);
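The cascade of step S2 can be sketched structurally as follows. Class and attribute names are assumptions for illustration, the simplified forward map h = sigmoid(Wᵀx) is an illustrative stand-in for NMF feature inference, and the weights are produced by the pre-training and global optimization described below:

```python
# Skeleton of the deep NMF network with classifier: L NMF layers of sizes
# r(1)..r(L), joined by a Sigmoid mapping, followed by a Softmax layer.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DeepNMFNet:
    def __init__(self, layer_dims):
        self.layer_dims = layer_dims        # [r1, r2, ..., rL]
        self.W = [None] * len(layer_dims)   # basis-vector matrices W(i)
        self.theta = None                   # Softmax classification-layer weights

    def forward(self, X):
        # Each layer's features feed the next layer through the sigmoid mapping.
        for W in self.W:
            X = sigmoid(W.T @ X)            # simplified stand-in for NMF inference
        scores = self.theta @ X
        scores -= scores.max(axis=0, keepdims=True)
        P = np.exp(scores)
        return P / P.sum(axis=0, keepdims=True)  # class probabilities
```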

S3: For the deep NMF network constructed in step S2, perform unsupervised pre-training of each NMF layer based on multiplicative update rules;

S4: For the pre-trained deep NMF network obtained in step S3, perform supervised global optimization of the connection weight parameters of the NMF layers and the Softmax classification layer based on the BP algorithm;

S5: Using the trained and optimized deep NMF network from step S4, analyze the input test-data samples to obtain the corresponding classification outputs.

Optionally, in step S1, the input data are preprocessed by computing their time-frequency amplitude spectrum with the short-time Fourier transform (STFT) time-frequency analysis method, as in the sketch below.
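A sketch of this preprocessing with SciPy follows; the sampling rate and window length are assumed example values, not parameters fixed by the invention:

```python
# Time-frequency amplitude spectrum via the short-time Fourier transform (S1).
import numpy as np
from scipy.signal import stft

def tf_amplitude_spectrum(signal, fs=12000, nperseg=256):
    # stft returns frequencies, time bins, and the complex spectrogram Zxx.
    _, _, Zxx = stft(signal, fs=fs, nperseg=nperseg)
    return np.abs(Zxx)  # non-negative amplitudes: a valid input for NMF
```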

In step S2, Softmax is used as the classification-layer function to construct the deep NMF network with a classifier; a minimal sketch of such a layer follows.
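The sketch below is a minimal NumPy Softmax layer with its cross-entropy cost; the weight matrix theta and the integer label encoding are assumptions for the example, not values fixed by the invention:

```python
# Softmax classification layer: scores -> class probabilities -> cross-entropy cost.
import numpy as np

def softmax_cost(theta, X, y):
    # theta: (K, d) classification-layer weights; X: (d, n) features from the
    # last NMF layer; y: (n,) integer class labels in {0, ..., K-1}.
    scores = theta @ X
    scores -= scores.max(axis=0, keepdims=True)        # numerical stability
    P = np.exp(scores)
    P /= P.sum(axis=0, keepdims=True)                  # column-wise probabilities
    n = X.shape[1]
    return -np.log(P[y, np.arange(n)] + 1e-12).mean()  # mean cross-entropy
```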

Optionally, in step S3, the deep NMF network is pre-trained without supervision based on multiplicative update rules:

S31: Input the data X^(i-1) into the i-th NMF layer;

S32: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i). Let X^(i-1) ∈ R+^(N×n) be the set of N-dimensional data vectors; then the basis-vector matrix is W^(i) ∈ R+^(N×r) and the low-dimensional feature matrix is H^(i) ∈ R+^(r×n), where r is the dimension of the low-dimensional feature space. In general, r is much smaller than N and n and satisfies (N+n)r < Nn;

S33: Update the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i). The iterative update rules (shown in the source only as images; reconstructed here as the standard multiplicative updates for the Frobenius-norm objective) are:

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)]

W^(i) ← W^(i) ⊙ [X^(i-1) (H^(i))ᵀ] ⊘ [W^(i) H^(i) (H^(i))ᵀ]

w_a^(i) ← w_a^(i) / Σ_k w_ka^(i) (normalization of each column of W^(i))

where ⊙ and ⊘ denote element-wise multiplication and division;

S34: Compute the pre-training-stage NMF-layer objective function value C, defined as the reconstruction error

C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S35: Compare the objective function values C(t) and C(t+1). If ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the basis-vector matrix W^(i) and the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S33 to S35;

S36: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

S37: Repeat steps S31 to S36, incrementing i, until i > L, which completes the unsupervised layer-by-layer pre-training of all NMF layers.
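A compact NumPy sketch of steps S31 to S37 follows. It uses the standard Frobenius-norm multiplicative updates with column rescaling of W; the small eps in the denominators is an implementation safeguard not specified by the invention:

```python
# Unsupervised layer-by-layer pre-training of the deep NMF network (S31-S37).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nmf_pretrain_layer(X, r, t_max=500, e=1e-6, eps=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    N, n = X.shape
    W, H = rng.random((N, r)), rng.random((r, n))      # S32: initialization
    c_prev = np.inf
    for _ in range(t_max):
        H *= (W.T @ X) / (W.T @ W @ H + eps)           # S33: update H
        W *= (X @ H.T) / (W @ H @ H.T + eps)           # S33: update W
        s = W.sum(axis=0) + eps                        # S33: normalize columns of W
        W /= s
        H *= s[:, None]                                # rescale H so W @ H is unchanged
        c = np.linalg.norm(X - W @ H) ** 2             # S34: objective C
        if abs(c_prev - c) < e:                        # S35: termination test
            break
        c_prev = c
    return W, H

def pretrain_deep_nmf(X0, dims):
    X, Ws = X0, []
    for r in dims:                                     # layers i = 1, ..., L (S37)
        W, H = nmf_pretrain_layer(X, r)
        Ws.append(W)
        X = sigmoid(H)                                 # S36: mapping to next layer
    return Ws, X
```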

Optionally, in step S4, supervised global optimization of the deep NMF network is performed based on the BP algorithm:

S41: Input the labeled data X^(i-1) into the i-th NMF layer;

S42: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the low-dimensional feature matrix H^(i);

S43: Fix the basis-vector matrix W^(i) and iteratively update the low-dimensional feature matrix H^(i) by the rule

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)];

S44: Compute the NMF-layer objective function value C for the supervised global-optimization stage, defined as

C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S45: Compare the objective function values C(t) and C(t+1). If ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S43 to S45;

S46: Compute the cost function of the i-th NMF layer,

J^(i) = ||X^(i-1) - W^(i)H^(i)||_F^2 + α·Σ f(w_ij),

where Σ f(w_ij) is the weight-constraint term, α is the coefficient balancing the weight-constraint term, and f(w_ij) is a per-weight penalty whose definition is given in the source only as an image;

S47: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

S48: Repeat steps S41 to S47 until i > L;

S49: Input the output X^(L) of the L-th NMF layer into the Softmax classifier and compute the classification-layer cost function value (the source shows the formula only as an image; in the standard Softmax cross-entropy form it reads):

J_c = -(1/n) Σ_{k=1..n} Σ_{r=1..K} 1{y_k = r} · log( exp(θ_rᵀ x_k) / Σ_{j=1..K} exp(θ_jᵀ x_k) )

where J_c is the Softmax misclassification cost function, K is the number of classes, y_k is the class label of sample k, x_k is the k-th column of X^(L), and θ_r are the classification-layer weights;

S410: Compute the overall cost function value J(W_DN) of the deep NMF network with classifier, which combines the classification-layer cost with the per-layer NMF costs (the formula is given in the source only as an image), where W_DN comprises the weight parameters of all NMF layers and the Softmax classification layer.

S411: Treat all layers of the deep NMF network with classifier as a single model and, based on the gradient-descent algorithm, iterate repeatedly so as to minimize the network's overall cost function value: compute each layer's output and reconstruction error, correct the corresponding parameters according to the error, and thereby optimize the weight parameters of all NMF layers and the Softmax classification layer (a simplified sketch follows).
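The sketch below is a heavily simplified illustration of S411: the per-layer forward map is taken as h = sigmoid(Wᵀx), an illustrative stand-in for the patent's NMF feature inference, and plain batch gradient descent minimizes the Softmax cross-entropy; the learning rate and epoch count are assumed values:

```python
# Supervised global fine-tuning of all layers plus the Softmax classifier (S411).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def finetune(Ws, theta, X, y, lr=0.01, epochs=100):
    n = X.shape[1]
    for _ in range(epochs):
        acts = [X]                                    # forward pass through cascade
        for W in Ws:
            acts.append(sigmoid(W.T @ acts[-1]))
        scores = theta @ acts[-1]
        scores -= scores.max(axis=0, keepdims=True)
        P = np.exp(scores)
        P /= P.sum(axis=0, keepdims=True)
        G = P.copy()
        G[y, np.arange(n)] -= 1.0                     # gradient of cross-entropy
        G /= n                                        # w.r.t. the class scores
        grad_theta = G @ acts[-1].T
        delta = theta.T @ G                           # back-propagate into layer L
        theta -= lr * grad_theta
        for i in range(len(Ws) - 1, -1, -1):          # BP through the NMF cascade
            d = delta * acts[i + 1] * (1.0 - acts[i + 1])
            grad_W = acts[i] @ d.T
            delta = Ws[i] @ d
            Ws[i] -= lr * grad_W
    return Ws, theta
```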

The beneficial effects of the invention are as follows: the NMF algorithm converges quickly and the left and right non-negative factor matrices require little storage; it reduces the dimensionality of high-dimensional data matrices, and the resulting low-dimensional matrices are naturally sparse and robust, making the method suitable for large-scale data. The deep NMF network improves the feature-extraction performance of single-layer NMF while providing an intuitive, interpretable hierarchical feature-learning process.

Other advantages, objects, and features of the invention will be set forth to some extent in the description that follows and, to some extent, will be apparent to those skilled in the art upon study of the following, or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and attained by the following description.

Brief Description of the Drawings

To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:

Fig. 1 is a schematic diagram of the structure of the deep non-negative matrix factorization network with a classifier according to the invention;

Fig. 2 is a flowchart of classification and recognition based on the deep non-negative matrix factorization network of the invention.

Detailed Description of the Embodiments

The embodiments of the invention are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the invention from the disclosure of this specification. The invention may also be implemented or applied through other different specific embodiments, and the details in this specification may be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention. It should be noted that the drawings provided in the following embodiments illustrate the basic idea of the invention only schematically, and the following embodiments and their features may be combined with one another provided there is no conflict.

The drawings are for illustration only; they are schematic rather than physical and should not be construed as limiting the invention. To better illustrate the embodiments, some parts of the drawings may be omitted, enlarged, or reduced and do not reflect the dimensions of an actual product; those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.

The same or similar reference numerals in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, it should be understood that terms such as "upper", "lower", "left", "right", "front", and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only to facilitate and simplify the description, and do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation. The positional terms in the drawings are therefore illustrative only, are not to be construed as limiting the invention, and their specific meanings can be understood by those of ordinary skill in the art according to the specific circumstances.

Referring to Fig. 1 and Fig. 2, an optimization method for a deep NMF network with a classifier performs unsupervised layer-by-layer pre-training of the deep NMF network based on multiplicative update rules, specifically comprising the following steps:

1) Input the data X^(i-1) into the i-th NMF layer;

2) Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i). Let X^(i-1) ∈ R+^(N×n) be the set of N-dimensional data vectors; then the basis-vector matrix is W^(i) ∈ R+^(N×r) and the low-dimensional feature matrix is H^(i) ∈ R+^(r×n), where r is the dimension of the low-dimensional feature space. In general, r is much smaller than N and n and satisfies (N+n)r < Nn;

3) Update the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i); the iterative update rules (the standard multiplicative updates, as in step S33 above) are:

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)]

W^(i) ← W^(i) ⊙ [X^(i-1) (H^(i))ᵀ] ⊘ [W^(i) H^(i) (H^(i))ᵀ]

w_a^(i) ← w_a^(i) / Σ_k w_ka^(i) (normalization of each column of W^(i));

4) Compute the pre-training-stage NMF-layer objective function value C, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

5) Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the basis-vector matrix W^(i) and the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps 3) to 5);

6) Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

7) Repeat steps 1) to 6) until i > L, completing the unsupervised layer-by-layer pre-training of all NMF layers.

Supervised global optimization of the deep NMF network based on the BP algorithm proceeds as follows:

1) Input the labeled data X^(i-1) into the i-th NMF layer;

2) Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the low-dimensional feature matrix H^(i);

3) Fix the basis-vector matrix W^(i) and iteratively update the low-dimensional feature matrix H^(i) by the rule

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)];

4) Compute the NMF-layer objective function value C for the supervised global-optimization stage, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

5) Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps 3) to 5);

6) Compute the cost function of the i-th NMF layer,

J^(i) = ||X^(i-1) - W^(i)H^(i)||_F^2 + α·Σ f(w_ij),

where Σ f(w_ij) is the weight-constraint term, α is the coefficient balancing the weight-constraint term, and f(w_ij) is a per-weight penalty whose definition is given in the source only as an image;

7) Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

8) Repeat steps 1) to 7) until i > L;

9) Input the output X^(L) of the L-th NMF layer into the Softmax classifier and compute the classification-layer cost function value J_c (the Softmax misclassification cost, in the standard cross-entropy form given under step S49 above), where K is the number of classes and y_k is the class label of sample k;

10) Compute the overall cost function value J(W_DN) of the deep NMF network with classifier, which combines the classification-layer cost with the per-layer NMF costs (the formula is given in the source only as an image), where W_DN comprises the weight parameters of all NMF layers and the Softmax classification layer;

11) Treat all layers of the deep NMF network with classifier as a single model and, based on the gradient-descent algorithm, iterate repeatedly so as to minimize the network's overall cost function value: compute each layer's output and each layer's reconstruction error, correct the corresponding parameters according to the error, and further optimize the weight parameters of all NMF layers and the Softmax classification layer.

Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the invention. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements may be made to the technical solution of the invention without departing from its spirit and scope, and all such modifications and replacements shall fall within the scope of the claims of the invention.

Claims (5)

1. An optimization method for a deep non-negative matrix factorization network with a classifier, the deep non-negative matrix factorization network with a classifier comprising an input layer, NMF layer 1, NMF layer 2, ..., NMF layer L, and a classification layer, wherein adjacent NMF layers are connected by a mapping function, characterized in that the method comprises the following steps:

S1: Input the original data, the training set being {X_k, G_k}, k = 1, 2, ..., n, where n is the number of training samples; preprocess the model input data and denote the preprocessed data by X^(0);

S2: Set the number of NMF layers in the deep network to L, with low-dimensional feature-space dimensions r^(1), r^(2), ..., r^(L) for the respective NMF layers; cascade the NMF layers and the classification layer to construct a deep NMF network with a classifier, the decomposition result of each NMF layer serving as the input of the next NMF layer and adjacent NMF layers being connected by a mapping function;

S3: For the deep NMF network constructed in step S2, perform unsupervised pre-training of each NMF layer based on multiplicative update rules;

S4: For the pre-trained deep NMF network obtained in step S3, perform supervised global optimization of the connection weight parameters of the NMF layers and the Softmax classification layer based on the BP algorithm;

S5: Using the trained and optimized deep NMF network from step S4, analyze the input test-data samples to obtain the corresponding classification outputs.
2. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S1, the input data are preprocessed by computing their time-frequency amplitude spectrum with the short-time Fourier transform time-frequency analysis method.

3. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S2, Softmax is used as the classification-layer function to construct the deep NMF network with a classifier.

4. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S3, the unsupervised pre-training of the deep NMF network based on multiplicative update rules comprises the following steps:

S31: Input the data X^(i-1) into the i-th NMF layer;

S32: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i); let X^(i-1) ∈ R+^(N×n) be the set of N-dimensional data vectors, so that the basis-vector matrix is W^(i) ∈ R+^(N×r) and the low-dimensional feature matrix is H^(i) ∈ R+^(r×n), where r is the dimension of the low-dimensional feature space; in general, r is much smaller than N and n and satisfies (N+n)r < Nn;

S33: Update the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i); the iterative update rules (the standard multiplicative updates, as in the description) are:

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)]

W^(i) ← W^(i) ⊙ [X^(i-1) (H^(i))ᵀ] ⊘ [W^(i) H^(i) (H^(i))ᵀ]

w_a^(i) ← w_a^(i) / Σ_k w_ka^(i) (normalization of each column of W^(i));

S34: Compute the pre-training-stage NMF-layer objective function value C, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S35: Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the basis-vector matrix W^(i) and the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S33 to S35;

S36: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer, the nonlinear mapping based on the Sigmoid function being f(x) = 1/(1+exp(-x));

S37: Repeat steps S31 to S36 until i > L, completing the unsupervised layer-by-layer pre-training of all NMF layers.
5. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S4, the supervised global optimization of the deep NMF network based on the BP algorithm comprises the following steps:

S41: Input the labeled data X^(i-1) into the i-th NMF layer;

S42: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the low-dimensional feature matrix H^(i);

S43: Fix the basis-vector matrix W^(i) and iteratively update the low-dimensional feature matrix H^(i) by the rule

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)];

S44: Compute the NMF-layer objective function value C for the supervised global-optimization stage, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S45: Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S43 to S45;

S46: Compute the cost function of the i-th NMF layer, J^(i) = ||X^(i-1) - W^(i)H^(i)||_F^2 + α·Σ f(w_ij), where Σ f(w_ij) is the weight-constraint term, α is the coefficient balancing the weight-constraint term, and f(w_ij) is a per-weight penalty whose definition is given in the source only as an image;

S47: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer, the nonlinear mapping based on the Sigmoid function being f(x) = 1/(1+exp(-x));

S48: Repeat steps S41 to S47 until i > L;

S49: Input the output X^(L) of the L-th NMF layer into the Softmax classifier and compute the classification-layer cost function value J_c, the Softmax misclassification cost function (the standard Softmax cross-entropy), where K is the number of classes and y_k is the class label of sample k;

S410: Compute the overall cost function value J(W_DN) of the deep NMF network with classifier, which combines the classification-layer cost with the per-layer NMF costs, where W_DN comprises the weight parameters of all NMF layers and the Softmax classification layer;

S411: Treat all layers of the deep NMF network with classifier as a single model and, based on the gradient-descent algorithm, iterate repeatedly so as to minimize the network's overall cost function value: compute each layer's output and each layer's reconstruction error, correct the corresponding parameters according to the error, and optimize the weight parameters of all NMF layers and the Softmax classification layer.
CN202010456563.7A (priority date 2020-05-26, filing date 2020-05-26) · An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers · Status: Pending · Published as CN111612084A (en)

Priority Applications (1)

Application Number: CN202010456563.7A · Priority Date: 2020-05-26 · Filing Date: 2020-05-26 · Title: An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers

Applications Claiming Priority (1)

Application Number: CN202010456563.7A · Priority Date: 2020-05-26 · Filing Date: 2020-05-26 · Title: An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers

Publications (1)

Publication Number: CN111612084A · Publication Date: 2020-09-01

Family ID: 72202352

Family Applications (1)

Application Number: CN202010456563.7A · Title: An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers

Country Status (1)

CN (1): CN111612084A (en)

Cited By (1)

* Cited by examiner, † Cited by third party

CN118096237A * · Priority Date: 2024-03-08 · Publication Date: 2024-05-28 · Assignee: 北京嘉华铭品牌策划有限公司广东分公司 · Title: A deep learning-driven customer behavior prediction model

Similar Documents

Wen et al. Ensemble of deep neural networks with probability-based fusion for facial expression recognition
CN111126218B (en) Human behavior recognition method based on zero sample learning
CN108805188A (en) A kind of feature based recalibration generates the image classification method of confrontation network
Schaaf et al. Enhancing decision tree based interpretation of deep neural networks through l1-orthogonal regularization
CN112818861A (en) Emotion classification method and system based on multi-mode context semantic features
CN107885853A (en) A kind of combined type file classification method based on deep learning
CN110580268A (en) A credit scoring integrated classification system and method based on deep learning
CN112199536A (en) A cross-modality-based fast multi-label image classification method and system
Li et al. Two-class 3D-CNN classifiers combination for video copy detection
CN110442802B (en) A multi-behavioral preference prediction method for social users
CN112270345A (en) Clustering Algorithm Based on Self-Supervised Dictionary Learning
CN110009108A (en) A brand new quantum extreme learning machine
CN113254675A (en) Knowledge graph construction method based on self-adaptive few-sample relation extraction
CN116110089A (en) A Facial Expression Recognition Method Based on Deep Adaptive Metric Learning
CN112861626A (en) Fine-grained expression classification method based on small sample learning
CN118799619A (en) A method for batch recognition and automatic classification and archiving of image content
Bandhu et al. Classifying multi-category images using deep learning: A convolutional neural network model
Das et al. Determining attention mechanism for visual sentiment analysis of an image using svm classifier in deep learning based architecture
US20230076290A1 (en) Rounding mechanisms for post-training quantization
CN114398935A (en) A deep learning-based multi-label classification method for medical image reports
CN112270334B (en) Few-sample image classification method and system based on abnormal point exposure
CN113627240A (en) Unmanned aerial vehicle tree species identification method based on improved SSD learning model
CN109409434A (en) The method of liver diseases data classification Rule Extraction based on random forest
CN112465226A (en) User behavior prediction method based on feature interaction and graph neural network
Tao et al. Efficient incremental training for deep convolutional neural networks

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication

Application publication date: 20200901