
CN111612084A - An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers - Google Patents


Info

Publication number
CN111612084A
Authority
CN
China
Prior art keywords
nmf
layer
network
deep
classifier
Prior art date
Legal status
Pending
Application number
CN202010456563.7A
Other languages
Chinese (zh)
Inventor
张焱
郭京龙
黄庆卿
陈俊华
李帅永
Current Assignee
Chongqing University of Post and Telecommunications
Original Assignee
Chongqing University of Post and Telecommunications
Priority date
Filing date
Publication date
Application filed by Chongqing University of Post and Telecommunications
Priority to CN202010456563.7A
Publication of CN111612084A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2133 Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on naturality criteria, e.g. with non-negative factorisation or negative correlation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N7/00 Computing arrangements based on specific mathematical models
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks


Abstract

The invention relates to an optimization method for a deep non-negative matrix factorization (NMF) network with a classifier and belongs to the technical field of artificial intelligence. The method comprises model construction and parameter optimization. For model construction, multiple NMF layers and a classification layer are cascaded into a deep network: the decomposition result of each NMF layer serves as the input of the next NMF layer, and adjacent NMF layers are connected by a mapping function. For parameter optimization, the deep NMF network is first pre-trained layer by layer in an unsupervised fashion using multiplicative update rules; supervised global optimization then jointly tunes the weight parameters of all NMF layers and the Softmax classification layer based on the BP (back-propagation) algorithm. The trained and optimized deep NMF network is applied to test data to obtain classification outputs. The invention is suitable for applications involving classification and recognition tasks such as condition monitoring and diagnosis.

Description

An Optimization Method for a Deep Non-negative Matrix Factorization Network with a Classifier

Technical Field

The invention belongs to the technical field of artificial intelligence and relates to an optimization method for a deep non-negative matrix factorization network with a classifier.

Background Art

Non-negative matrix factorization (NMF) is a relatively novel matrix factorization approach. It requires all factored components to be non-negative (a purely additive description) while achieving nonlinear dimensionality reduction and sparse feature representation, and it has been applied successfully in image processing, computer vision, pattern recognition, biomedicine, signal processing, and other fields. However, the feature-representation capacity of a traditional single-layer network is limited. Deep learning, an important research direction in machine learning in recent years, improves classification and regression capability by stacking multiple abstraction layers with self-learning capacity: the input data are transformed stage by stage into progressively higher-level feature representations. Deep networks have achieved remarkable results in many classification and recognition tasks.
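For orientation, a minimal single-layer NMF in Python looks like the following sketch (illustrative only, using scikit-learn; the matrix sizes and rank are arbitrary and not part of the invention):

```python
# Minimal single-layer NMF: factor a non-negative matrix X (N x n) as X ~= W @ H.
import numpy as np
from sklearn.decomposition import NMF

rng = np.random.default_rng(0)
X = rng.random((64, 200))                    # N = 64 features, n = 200 samples

model = NMF(n_components=10, init="nndsvda", max_iter=500, random_state=0)
W = model.fit_transform(X)                   # basis-vector matrix, shape (64, 10)
H = model.components_                        # low-dimensional features, shape (10, 200)
print(np.linalg.norm(X - W @ H, "fro"))      # purely additive reconstruction error
```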

Against this background, combining the NMF idea with deep learning promises the advantages of both: constructing a deep NMF network improves on the feature-extraction performance of single-layer NMF, while NMF's purely additive description gives the deep network an intuitive, interpretable hierarchical feature-learning process. Deep NMF networks have begun to attract attention (Guo Z, Zhang S. Sparse deep nonnegative matrix factorization. Big Data Mining and Analytics, 2020, 3(1): 13-28; Song H A, Kim B, Xuan T L, et al. Hierarchical feature extraction by multi-layer non-negative matrix factorization network for classification task. Neurocomputing, 2015, 165: 63-74; Cichocki A, Zdunek R. Multilayer nonnegative matrix factorization. Electronics Letters, 2006, 42(16): 947-948). However, existing deep NMF networks train and optimize only the NMF feature-extraction layers; they do not perform unified global optimization of the feature-extraction layers together with the classification layer. To address this problem, the invention proposes a multi-layer non-negative matrix factorization network with a classifier and a deep-learning structure, and designs a model-parameter optimization method that combines unsupervised layer-by-layer pre-training with supervised global optimization.

Summary of the Invention

In view of this, the purpose of the invention is to provide an optimization method for a deep non-negative matrix factorization network with a classifier. By analyzing non-negative matrix factorization and the basic structure of deep networks, a cascading scheme is used to construct a multi-layer NMF network with a classifier and a deep-learning structure, and a method combining unsupervised layer-by-layer pre-training with supervised global optimization realizes unified global optimization of the deep model's feature-extraction layers and classification layer.

To achieve the above object, the invention provides the following technical solution:

An optimization method for a deep non-negative matrix factorization network with a classifier, comprising the following steps:

S1: Input the original data; the training set is {X_k, G_k}, k = 1, 2, ..., n, where n is the number of training samples. Preprocess the model input data and denote the preprocessed data by X^(0), which serves as the input to the first NMF layer;

S2: Set the number of NMF layers in the deep network to L, with low-dimensional feature-space dimensions r^(1), r^(2), ..., r^(L) for the respective NMF layers. Cascade the NMF layers and the classification layer to construct a deep NMF network with a classifier: the decomposition result of each NMF layer serves as the input of the next NMF layer, and adjacent NMF layers are connected by a mapping function (a structural sketch follows);
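The cascade of step S2 can be sketched structurally as follows. Class and attribute names are assumptions for illustration, the simplified forward map h = sigmoid(Wᵀx) is an illustrative stand-in for NMF feature inference, and the weights are produced by the pre-training and global optimization described below:

```python
# Skeleton of the deep NMF network with classifier: L NMF layers of sizes
# r(1)..r(L), joined by a Sigmoid mapping, followed by a Softmax layer.
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class DeepNMFNet:
    def __init__(self, layer_dims):
        self.layer_dims = layer_dims        # [r1, r2, ..., rL]
        self.W = [None] * len(layer_dims)   # basis-vector matrices W(i)
        self.theta = None                   # Softmax classification-layer weights

    def forward(self, X):
        # Each layer's features feed the next layer through the sigmoid mapping.
        for W in self.W:
            X = sigmoid(W.T @ X)            # simplified stand-in for NMF inference
        scores = self.theta @ X
        scores -= scores.max(axis=0, keepdims=True)
        P = np.exp(scores)
        return P / P.sum(axis=0, keepdims=True)  # class probabilities
```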

S3: For the deep NMF network constructed in step S2, perform unsupervised pre-training of each NMF layer based on multiplicative update rules;

S4: For the pre-trained deep NMF network obtained in step S3, perform supervised global optimization of the connection weight parameters of the NMF layers and the Softmax classification layer based on the BP algorithm;

S5: Using the trained and optimized deep NMF network from step S4, analyze the input test-data samples to obtain the corresponding classification outputs.

Optionally, in step S1, the input data are preprocessed by computing their time-frequency amplitude spectrum with the short-time Fourier transform (STFT) time-frequency analysis method, as in the sketch below.
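A sketch of this preprocessing with SciPy follows; the sampling rate and window length are assumed example values, not parameters fixed by the invention:

```python
# Time-frequency amplitude spectrum via the short-time Fourier transform (S1).
import numpy as np
from scipy.signal import stft

def tf_amplitude_spectrum(signal, fs=12000, nperseg=256):
    # stft returns frequencies, time bins, and the complex spectrogram Zxx.
    _, _, Zxx = stft(signal, fs=fs, nperseg=nperseg)
    return np.abs(Zxx)  # non-negative amplitudes: a valid input for NMF
```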

In step S2, Softmax is used as the classification-layer function to construct the deep NMF network with a classifier; a minimal sketch of such a layer follows.
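The sketch below is a minimal NumPy Softmax layer with its cross-entropy cost; the weight matrix theta and the integer label encoding are assumptions for the example, not values fixed by the invention:

```python
# Softmax classification layer: scores -> class probabilities -> cross-entropy cost.
import numpy as np

def softmax_cost(theta, X, y):
    # theta: (K, d) classification-layer weights; X: (d, n) features from the
    # last NMF layer; y: (n,) integer class labels in {0, ..., K-1}.
    scores = theta @ X
    scores -= scores.max(axis=0, keepdims=True)        # numerical stability
    P = np.exp(scores)
    P /= P.sum(axis=0, keepdims=True)                  # column-wise probabilities
    n = X.shape[1]
    return -np.log(P[y, np.arange(n)] + 1e-12).mean()  # mean cross-entropy
```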

Optionally, in step S3, the deep NMF network is pre-trained without supervision based on multiplicative update rules:

S31: Input the data X^(i-1) into the i-th NMF layer;

S32: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i). Let X^(i-1) ∈ R+^(N×n) be the set of N-dimensional data vectors; then the basis-vector matrix is W^(i) ∈ R+^(N×r) and the low-dimensional feature matrix is H^(i) ∈ R+^(r×n), where r is the dimension of the low-dimensional feature space. In general, r is much smaller than N and n and satisfies (N+n)r < Nn;

S33: Update the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i). The iterative update rules (shown in the source only as images; reconstructed here as the standard multiplicative updates for the Frobenius-norm objective) are:

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)]

W^(i) ← W^(i) ⊙ [X^(i-1) (H^(i))ᵀ] ⊘ [W^(i) H^(i) (H^(i))ᵀ]

w_a^(i) ← w_a^(i) / Σ_k w_ka^(i) (normalization of each column of W^(i))

where ⊙ and ⊘ denote element-wise multiplication and division;

S34: Compute the pre-training-stage NMF-layer objective function value C, defined as the reconstruction error

C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S35: Compare the objective function values C(t) and C(t+1). If ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the basis-vector matrix W^(i) and the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S33 to S35;

S36: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

S37: Repeat steps S31 to S36, incrementing i, until i > L, which completes the unsupervised layer-by-layer pre-training of all NMF layers.
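A compact NumPy sketch of steps S31 to S37 follows. It uses the standard Frobenius-norm multiplicative updates with column rescaling of W; the small eps in the denominators is an implementation safeguard not specified by the invention:

```python
# Unsupervised layer-by-layer pre-training of the deep NMF network (S31-S37).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nmf_pretrain_layer(X, r, t_max=500, e=1e-6, eps=1e-10, seed=0):
    rng = np.random.default_rng(seed)
    N, n = X.shape
    W, H = rng.random((N, r)), rng.random((r, n))      # S32: initialization
    c_prev = np.inf
    for _ in range(t_max):
        H *= (W.T @ X) / (W.T @ W @ H + eps)           # S33: update H
        W *= (X @ H.T) / (W @ H @ H.T + eps)           # S33: update W
        s = W.sum(axis=0) + eps                        # S33: normalize columns of W
        W /= s
        H *= s[:, None]                                # rescale H so W @ H is unchanged
        c = np.linalg.norm(X - W @ H) ** 2             # S34: objective C
        if abs(c_prev - c) < e:                        # S35: termination test
            break
        c_prev = c
    return W, H

def pretrain_deep_nmf(X0, dims):
    X, Ws = X0, []
    for r in dims:                                     # layers i = 1, ..., L (S37)
        W, H = nmf_pretrain_layer(X, r)
        Ws.append(W)
        X = sigmoid(H)                                 # S36: mapping to next layer
    return Ws, X
```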

Optionally, in step S4, supervised global optimization of the deep NMF network is performed based on the BP algorithm:

S41: Input the labeled data X^(i-1) into the i-th NMF layer;

S42: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the low-dimensional feature matrix H^(i);

S43: Fix the basis-vector matrix W^(i) and iteratively update the low-dimensional feature matrix H^(i) by the rule

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)];

S44: Compute the NMF-layer objective function value C for the supervised global-optimization stage, defined as

C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S45: Compare the objective function values C(t) and C(t+1). If ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S43 to S45;

S46: Compute the cost function of the i-th NMF layer,

J^(i) = ||X^(i-1) - W^(i)H^(i)||_F^2 + α·Σ f(w_ij),

where Σ f(w_ij) is the weight-constraint term, α is the coefficient balancing the weight-constraint term, and f(w_ij) is a per-weight penalty whose definition is given in the source only as an image;

S47: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

S48: Repeat steps S41 to S47 until i > L;

S49: Input the output X^(L) of the L-th NMF layer into the Softmax classifier and compute the classification-layer cost function value (the source shows the formula only as an image; in the standard Softmax cross-entropy form it reads):

J_c = -(1/n) Σ_{k=1..n} Σ_{r=1..K} 1{y_k = r} · log( exp(θ_rᵀ x_k) / Σ_{j=1..K} exp(θ_jᵀ x_k) )

where J_c is the Softmax misclassification cost function, K is the number of classes, y_k is the class label of sample k, x_k is the k-th column of X^(L), and θ_r are the classification-layer weights;

S410: Compute the overall cost function value J(W_DN) of the deep NMF network with classifier, which combines the classification-layer cost with the per-layer NMF costs (the formula is given in the source only as an image), where W_DN comprises the weight parameters of all NMF layers and the Softmax classification layer.

S411: Treat all layers of the deep NMF network with classifier as a single model and, based on the gradient-descent algorithm, iterate repeatedly so as to minimize the network's overall cost function value: compute each layer's output and reconstruction error, correct the corresponding parameters according to the error, and thereby optimize the weight parameters of all NMF layers and the Softmax classification layer (a simplified sketch follows).
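The sketch below is a heavily simplified illustration of S411: the per-layer forward map is taken as h = sigmoid(Wᵀx), an illustrative stand-in for the patent's NMF feature inference, and plain batch gradient descent minimizes the Softmax cross-entropy; the learning rate and epoch count are assumed values:

```python
# Supervised global fine-tuning of all layers plus the Softmax classifier (S411).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def finetune(Ws, theta, X, y, lr=0.01, epochs=100):
    n = X.shape[1]
    for _ in range(epochs):
        acts = [X]                                    # forward pass through cascade
        for W in Ws:
            acts.append(sigmoid(W.T @ acts[-1]))
        scores = theta @ acts[-1]
        scores -= scores.max(axis=0, keepdims=True)
        P = np.exp(scores)
        P /= P.sum(axis=0, keepdims=True)
        G = P.copy()
        G[y, np.arange(n)] -= 1.0                     # gradient of cross-entropy
        G /= n                                        # w.r.t. the class scores
        grad_theta = G @ acts[-1].T
        delta = theta.T @ G                           # back-propagate into layer L
        theta -= lr * grad_theta
        for i in range(len(Ws) - 1, -1, -1):          # BP through the NMF cascade
            d = delta * acts[i + 1] * (1.0 - acts[i + 1])
            grad_W = acts[i] @ d.T
            delta = Ws[i] @ d
            Ws[i] -= lr * grad_W
    return Ws, theta
```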

The beneficial effects of the invention are as follows: the NMF algorithm converges quickly and the left and right non-negative factor matrices require little storage; it reduces the dimensionality of high-dimensional data matrices, and the resulting low-dimensional matrices are naturally sparse and robust, making the method suitable for large-scale data. The deep NMF network improves the feature-extraction performance of single-layer NMF while providing an intuitive, interpretable hierarchical feature-learning process.

Other advantages, objects, and features of the invention will be set forth to some extent in the description that follows and, to some extent, will be apparent to those skilled in the art upon study of the following, or may be learned from practice of the invention. The objects and other advantages of the invention may be realized and attained by the following description.

Brief Description of the Drawings

To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in detail below with reference to the accompanying drawings, in which:

Fig. 1 is a schematic diagram of the structure of the deep non-negative matrix factorization network with a classifier according to the invention;

Fig. 2 is a flowchart of classification and recognition based on the deep non-negative matrix factorization network of the invention.

Detailed Description of the Embodiments

The embodiments of the invention are described below through specific examples, and those skilled in the art can readily understand other advantages and effects of the invention from the disclosure of this specification. The invention may also be implemented or applied through other different specific embodiments, and the details in this specification may be modified or changed in various ways, based on different viewpoints and applications, without departing from the spirit of the invention. It should be noted that the drawings provided in the following embodiments illustrate the basic idea of the invention only schematically, and the following embodiments and their features may be combined with one another provided there is no conflict.

The drawings are for illustration only; they are schematic rather than physical and should not be construed as limiting the invention. To better illustrate the embodiments, some parts of the drawings may be omitted, enlarged, or reduced and do not reflect the dimensions of an actual product; those skilled in the art will understand that certain well-known structures and their descriptions may be omitted from the drawings.

The same or similar reference numerals in the drawings of the embodiments correspond to the same or similar components. In the description of the invention, it should be understood that terms such as "upper", "lower", "left", "right", "front", and "rear" indicate orientations or positional relationships based on those shown in the drawings; they are used only to facilitate and simplify the description, and do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation. The positional terms in the drawings are therefore illustrative only, are not to be construed as limiting the invention, and their specific meanings can be understood by those of ordinary skill in the art according to the specific circumstances.

Referring to Fig. 1 and Fig. 2, an optimization method for a deep NMF network with a classifier performs unsupervised layer-by-layer pre-training of the deep NMF network based on multiplicative update rules, specifically comprising the following steps:

1) Input the data X^(i-1) into the i-th NMF layer;

2) Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i). Let X^(i-1) ∈ R+^(N×n) be the set of N-dimensional data vectors; then the basis-vector matrix is W^(i) ∈ R+^(N×r) and the low-dimensional feature matrix is H^(i) ∈ R+^(r×n), where r is the dimension of the low-dimensional feature space. In general, r is much smaller than N and n and satisfies (N+n)r < Nn;

3) Update the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i); the iterative update rules (the standard multiplicative updates, as in step S33 above) are:

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)]

W^(i) ← W^(i) ⊙ [X^(i-1) (H^(i))ᵀ] ⊘ [W^(i) H^(i) (H^(i))ᵀ]

w_a^(i) ← w_a^(i) / Σ_k w_ka^(i) (normalization of each column of W^(i));

4) Compute the pre-training-stage NMF-layer objective function value C, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

5) Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the basis-vector matrix W^(i) and the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps 3) to 5);

6) Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

7) Repeat steps 1) to 6) until i > L, completing the unsupervised layer-by-layer pre-training of all NMF layers.

Supervised global optimization of the deep NMF network based on the BP algorithm proceeds as follows:

1) Input the labeled data X^(i-1) into the i-th NMF layer;

2) Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the low-dimensional feature matrix H^(i);

3) Fix the basis-vector matrix W^(i) and iteratively update the low-dimensional feature matrix H^(i) by the rule

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)];

4) Compute the NMF-layer objective function value C for the supervised global-optimization stage, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

5) Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps 3) to 5);

6) Compute the cost function of the i-th NMF layer,

J^(i) = ||X^(i-1) - W^(i)H^(i)||_F^2 + α·Σ f(w_ij),

where Σ f(w_ij) is the weight-constraint term, α is the coefficient balancing the weight-constraint term, and f(w_ij) is a per-weight penalty whose definition is given in the source only as an image;

7) Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer; the nonlinear mapping based on the Sigmoid function is:

f(x)=1/(1+exp(-x))

8) Repeat steps 1) to 7) until i > L;

9) Input the output X^(L) of the L-th NMF layer into the Softmax classifier and compute the classification-layer cost function value J_c (the Softmax misclassification cost, in the standard cross-entropy form given under step S49 above), where K is the number of classes and y_k is the class label of sample k;

10) Compute the overall cost function value J(W_DN) of the deep NMF network with classifier, which combines the classification-layer cost with the per-layer NMF costs (the formula is given in the source only as an image), where W_DN comprises the weight parameters of all NMF layers and the Softmax classification layer;

11) Treat all layers of the deep NMF network with classifier as a single model and, based on the gradient-descent algorithm, iterate repeatedly so as to minimize the network's overall cost function value: compute each layer's output and each layer's reconstruction error, correct the corresponding parameters according to the error, and further optimize the weight parameters of all NMF layers and the Softmax classification layer.

Finally, it should be noted that the above embodiments are intended only to illustrate, not to limit, the technical solution of the invention. Although the invention has been described in detail with reference to preferred embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements may be made to the technical solution of the invention without departing from its spirit and scope, and all such modifications and replacements shall fall within the scope of the claims of the invention.

Claims (5)

1. An optimization method for a deep non-negative matrix factorization network with a classifier, the deep non-negative matrix factorization network with a classifier comprising an input layer, NMF layer 1, NMF layer 2, ..., NMF layer L, and a classification layer, wherein adjacent NMF layers are connected by a mapping function, characterized in that the method comprises the following steps:

S1: Input the original data, the training set being {X_k, G_k}, k = 1, 2, ..., n, where n is the number of training samples; preprocess the model input data and denote the preprocessed data by X^(0);

S2: Set the number of NMF layers in the deep network to L, with low-dimensional feature-space dimensions r^(1), r^(2), ..., r^(L) for the respective NMF layers; cascade the NMF layers and the classification layer to construct a deep NMF network with a classifier, the decomposition result of each NMF layer serving as the input of the next NMF layer and adjacent NMF layers being connected by a mapping function;

S3: For the deep NMF network constructed in step S2, perform unsupervised pre-training of each NMF layer based on multiplicative update rules;

S4: For the pre-trained deep NMF network obtained in step S3, perform supervised global optimization of the connection weight parameters of the NMF layers and the Softmax classification layer based on the BP algorithm;

S5: Using the trained and optimized deep NMF network from step S4, analyze the input test-data samples to obtain the corresponding classification outputs.
2. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S1, the input data are preprocessed by computing their time-frequency amplitude spectrum with the short-time Fourier transform time-frequency analysis method.

3. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S2, Softmax is used as the classification-layer function to construct the deep NMF network with a classifier.

4. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S3, the unsupervised pre-training of the deep NMF network based on multiplicative update rules comprises the following steps:

S31: Input the data X^(i-1) into the i-th NMF layer;

S32: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i); let X^(i-1) ∈ R+^(N×n) be the set of N-dimensional data vectors, so that the basis-vector matrix is W^(i) ∈ R+^(N×r) and the low-dimensional feature matrix is H^(i) ∈ R+^(r×n), where r is the dimension of the low-dimensional feature space; in general, r is much smaller than N and n and satisfies (N+n)r < Nn;

S33: Update the basis-vector matrix W^(i) and the low-dimensional feature matrix H^(i); the iterative update rules (the standard multiplicative updates, as in the description) are:

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)]

W^(i) ← W^(i) ⊙ [X^(i-1) (H^(i))ᵀ] ⊘ [W^(i) H^(i) (H^(i))ᵀ]

w_a^(i) ← w_a^(i) / Σ_k w_ka^(i) (normalization of each column of W^(i));

S34: Compute the pre-training-stage NMF-layer objective function value C, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S35: Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the basis-vector matrix W^(i) and the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S33 to S35;

S36: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer, the nonlinear mapping based on the Sigmoid function being f(x) = 1/(1+exp(-x));

S37: Repeat steps S31 to S36 until i > L, completing the unsupervised layer-by-layer pre-training of all NMF layers.
5. The optimization method for a deep non-negative matrix factorization network with a classifier according to claim 1, characterized in that in step S4, the supervised global optimization of the deep NMF network based on the BP algorithm comprises the following steps:

S41: Input the labeled data X^(i-1) into the i-th NMF layer;

S42: Set the algorithm termination threshold e and the maximum number of iterations t_max, and initialize the low-dimensional feature matrix H^(i);

S43: Fix the basis-vector matrix W^(i) and iteratively update the low-dimensional feature matrix H^(i) by the rule

H^(i) ← H^(i) ⊙ [(W^(i))ᵀ X^(i-1)] ⊘ [(W^(i))ᵀ W^(i) H^(i)];

S44: Compute the NMF-layer objective function value C for the supervised global-optimization stage, defined as C = ||X^(i-1) - W^(i)H^(i)||_F^2;

S45: Compare the objective function values C(t) and C(t+1); if ||C(t+1) - C(t)|| < e holds or the maximum number of iterations t_max is reached, the algorithm terminates and yields the low-dimensional features H^(i) of the i-th NMF layer; otherwise, loop over steps S43 to S45;

S46: Compute the cost function of the i-th NMF layer, J^(i) = ||X^(i-1) - W^(i)H^(i)||_F^2 + α·Σ f(w_ij), where Σ f(w_ij) is the weight-constraint term, α is the coefficient balancing the weight-constraint term, and f(w_ij) is a per-weight penalty whose definition is given in the source only as an image;

S47: Map the low-dimensional features H^(i) to obtain the input data X^(i) of the (i+1)-th NMF layer, the nonlinear mapping based on the Sigmoid function being f(x) = 1/(1+exp(-x));

S48: Repeat steps S41 to S47 until i > L;

S49: Input the output X^(L) of the L-th NMF layer into the Softmax classifier and compute the classification-layer cost function value J_c, the Softmax misclassification cost function (the standard Softmax cross-entropy), where K is the number of classes and y_k is the class label of sample k;

S410: Compute the overall cost function value J(W_DN) of the deep NMF network with classifier, which combines the classification-layer cost with the per-layer NMF costs, where W_DN comprises the weight parameters of all NMF layers and the Softmax classification layer;

S411: Treat all layers of the deep NMF network with classifier as a single model and, based on the gradient-descent algorithm, iterate repeatedly so as to minimize the network's overall cost function value: compute each layer's output and each layer's reconstruction error, correct the corresponding parameters according to the error, and optimize the weight parameters of all NMF layers and the Softmax classification layer.
CN202010456563.7A (priority date 2020-05-26, filing date 2020-05-26) · An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers · Status: Pending · Published as CN111612084A (en)

Priority Applications (1)

Application Number: CN202010456563.7A · Priority Date: 2020-05-26 · Filing Date: 2020-05-26 · Title: An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers

Applications Claiming Priority (1)

Application Number: CN202010456563.7A · Priority Date: 2020-05-26 · Filing Date: 2020-05-26 · Title: An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers

Publications (1)

Publication Number: CN111612084A · Publication Date: 2020-09-01

Family ID: 72202352

Family Applications (1)

Application Number: CN202010456563.7A · Title: An Optimization Method for Deep Non-negative Matrix Factorization Networks with Classifiers

Country Status (1)

CN (1): CN111612084A (en)

Cited By (1)

* Cited by examiner, † Cited by third party

CN118096237A * · Priority Date: 2024-03-08 · Publication Date: 2024-05-28 · Assignee: 北京嘉华铭品牌策划有限公司广东分公司 · Title: A deep learning-driven customer behavior prediction model

Similar Documents

Wen et al. Ensemble of deep neural networks with probability-based fusion for facial expression recognition
CN111126218B (en) Human behavior recognition method based on zero sample learning
CN108805188A (en) A kind of feature based recalibration generates the image classification method of confrontation network
Schaaf et al. Enhancing decision tree based interpretation of deep neural networks through l1-orthogonal regularization
CN112818861A (en) Emotion classification method and system based on multi-mode context semantic features
CN107885853A (en) A kind of combined type file classification method based on deep learning
CN110580268A (en) A credit scoring integrated classification system and method based on deep learning
CN112199536A (en) A cross-modality-based fast multi-label image classification method and system
Li et al. Two-class 3D-CNN classifiers combination for video copy detection
CN110442802B (en) A multi-behavioral preference prediction method for social users
CN112270345A (en) Clustering Algorithm Based on Self-Supervised Dictionary Learning
CN110009108A (en) A brand new quantum extreme learning machine
CN113254675A (en) Knowledge graph construction method based on self-adaptive few-sample relation extraction
CN116110089A (en) A Facial Expression Recognition Method Based on Deep Adaptive Metric Learning
CN112861626A (en) Fine-grained expression classification method based on small sample learning
CN118799619A (en) A method for batch recognition and automatic classification and archiving of image content
Bandhu et al. Classifying multi-category images using deep learning: A convolutional neural network model
Das et al. Determining attention mechanism for visual sentiment analysis of an image using svm classifier in deep learning based architecture
US20230076290A1 (en) Rounding mechanisms for post-training quantization
CN114398935A (en) A deep learning-based multi-label classification method for medical image reports
CN112270334B (en) Few-sample image classification method and system based on abnormal point exposure
CN113627240A (en) Unmanned aerial vehicle tree species identification method based on improved SSD learning model
CN109409434A (en) The method of liver diseases data classification Rule Extraction based on random forest
CN112465226A (en) User behavior prediction method based on feature interaction and graph neural network
Tao et al. Efficient incremental training for deep convolutional neural networks

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
RJ01: Rejection of invention patent application after publication

Application publication date: 20200901