
CN108647325A - Text classification system capable of avoiding overfitting - Google Patents

Text classification system capable of avoiding overfitting

Info

Publication number
CN108647325A
CN108647325A
Authority
CN
China
Prior art keywords
training sample
weight
training
iteration
over-fitting
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810447545.5A
Other languages
Chinese (zh)
Inventor
丰小月 (Feng Xiaoyue)
丰超 (Feng Chao)
时小虎 (Shi Xiaohu)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jilin University
Original Assignee
Jilin University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jilin University filed Critical Jilin University
Priority to CN201810447545.5A priority Critical patent/CN108647325A/en
Publication of CN108647325A publication Critical patent/CN108647325A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a text classification system capable of avoiding overfitting, comprising an overfitting-rate judgment module and a text classification module. The overfitting-rate judgment module is used to judge the severity of overfitting, and the text classification module is responsible for classifying text. The method of the present invention gives a quantitative description of the overfitting problem through the parameter RO, the rate of overfitting, which describes whether the overfitting problem occurs and how severe it is; the method also proposes a new approach aimed at improving classification accuracy under the overfitting problem.

Description

Text classification system capable of avoiding overfitting
Technical field
The present invention relates to the field of machine learning, and in particular to a text classification system capable of avoiding overfitting.
Background technology
Overfitting is currently a very serious problem in machine learning. One reason it arises is that, in actual use, most text classification methods use the vector space model: each document is treated as a bag of words, and each word serves as a feature. This inevitably produces redundant feature vectors and yields a high-dimensional sparse matrix. Moreover, pursuing accuracy on the training set too aggressively increases the complexity of the model, so that instead of learning the underlying trend it merely memorizes non-predictive features of the training data. On data it has not seen, its expressive power deteriorates; this is overfitting.
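By way of illustration, the following minimal Python sketch (the library choice and example documents are ours, not the patent's) shows how the vector space model turns even a handful of documents into a high-dimensional sparse matrix:

    from sklearn.feature_extraction.text import CountVectorizer

    docs = [
        "the cat sat on the mat",
        "overfitting hurts text classification",
        "bag of words models treat each word as a feature",
    ]

    # Each document becomes a bag-of-words vector; every distinct word is a feature.
    vectorizer = CountVectorizer()
    X = vectorizer.fit_transform(docs)  # SciPy sparse matrix, shape (n_docs, n_words)

    print(X.shape)                            # dimensionality grows with the vocabulary
    print(X.nnz / (X.shape[0] * X.shape[1]))  # low density: the matrix is sparse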
Invention content
In view of this, the present invention provides a text classification system capable of avoiding overfitting that solves, or partly solves, the above problems.
To achieve the above technical effect, the technical scheme of the present invention is: a text classification system capable of avoiding overfitting, comprising:
The text classification system comprises an overfitting-rate judgment module and a text classification module;
The overfitting-rate judgment module is used to judge the severity of overfitting, using the rate of overfitting RO for the judgment.
The rate of overfitting RO is defined as follows:

RO = -∂o/∂u

where o denotes the test error, z denotes the training error, and u denotes the scale of the training set or the number of iterations. The sign of the rate of overfitting is determined by the test error, and is opposite to the change in the test error. When the rate of overfitting is less than zero, the test error o increases as the scale of the training set or the number of iterations increases; the smaller the value of the rate of overfitting RO, the more severe the overfitting. If the test error o needs to be reduced further, a compensation term is added to increase the training error z and thereby avoid overfitting. When the rate of overfitting equals 0, the test error no longer changes as the training set grows;
A rate of overfitting greater than zero indicates that the test error decreases as the training scale increases; as long as the rate of overfitting remains positive, overfitting no longer occurs. The training set here refers to the training sample set;
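By way of illustration, a minimal Python sketch of how the overfitting-rate judgment module could estimate RO numerically from measured error curves; the finite-difference form and the function name are illustrative assumptions:

    import numpy as np

    def rate_of_overfitting(test_error, u):
        """Estimate RO = -d(test error)/du by finite differences, where u is
        the training-set scale or iteration count at each measurement."""
        test_error = np.asarray(test_error, dtype=float)
        u = np.asarray(u, dtype=float)
        return -np.diff(test_error) / np.diff(u)

    # Test error falls, then rises: RO turns negative once overfitting sets in.
    u = [100, 200, 300, 400, 500]
    o = [0.30, 0.25, 0.22, 0.24, 0.27]
    print(rate_of_overfitting(o, u))  # [ 0.0005  0.0003 -0.0002 -0.0003]

A negative RO thus flags the point at which further training data or iterations start to hurt test performance.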
The text classification module is responsible for classifying text and works in two steps: the first step, a preprocessing process, and the second step, the specific processing procedure;
In the preprocessing process, the input and output of the text classification module are defined. The input of the text classification module is the training sample set D = {(x_1, y_1), ..., (x_i, y_i), ..., (x_m, y_m)}, where D is the training sample set, (x_1, y_1), ..., (x_i, y_i), ..., (x_m, y_m) are the training samples, X = {x_1, ..., x_i, ..., x_m} are the training sample points, and Y = {y_1, ..., y_i, ..., y_m} are the class labels of the training sample set; x is a training sample point and y is the class label of a training sample point, serving as the classification result for the sample point after the classifier is applied; m is the number of training sample points. A training sample point and a class label with the same index correspond to each other, i.e., the classification result of the i-th training sample point x_i is y_i, i = 1, ..., m. The output of the text classification module is the classifier

H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k)

where H(x) is the classification result of classifier H on training sample point x; β^(t) denotes the weight of the weak classifier h in the t-th iteration; t indicates being in the t-th iteration; β denotes the weight of the weak classifier h; the function I expresses a qualifying relation that restricts the function appearing before it, outputting the value of the preceding function only when the condition inside its brackets is satisfied; h^(t)(x) denotes the result of the weak classifier h classifying training sample point x in the t-th iteration; k denotes a candidate class label over which the classification results are temporarily accumulated; and T denotes the maximum number of iterations, a positive integer. The arg max adds up the weights of the classifiers over all T iterations and chooses the class k that makes this sum reach its maximum, which gives the classification result of the classifier;
The specific processing procedure is as follows:
1) The weight of each training sample is initialized to w_i^(1) = 1/m; w_i^(1) is the weight of the i-th training sample x_i in the training sample set at the 1st iteration, i is a positive integer, the superscript (1) denotes the 1st iteration, and m denotes the number of training sample points;
2) The iteration starts; the total number of iterations is set to T, T a positive integer; the iteration parameter is t, a positive integer, indicating the t-th iteration, t = 1, ..., T;
2.a) Input the parameter λ and the training sample set D, and select a subset D' of D; the number of selected training sample points is c = m × λ, where 0 < λ ≤ 1. The detailed process is: take the training sample set D = {(x_1, y_1), ..., (x_m, y_m)} as the candidate set, and select one training sample point from the candidate set at a time, until c points have been selected, c a positive integer. The selection steps are (a code sketch of this selection follows the steps below):
First, initialize n = 0; n is the selection counter, used to count the number of training samples already selected, and is a positive integer;
i. Each training sample (x_i, y_i) is assigned a selection weight q_i, i = 1, ..., m, with q_i = w_1 + w_2 + ... + w_i; the initial value of the selection weights is q_0 = 0, where q is any selection weight, j is a positive integer, and w_j is the weight of the j-th training sample point x_j;
ii. A random number p is generated, equal to rand(0, q_m), i.e., a number between 0 and q_m is generated at random and assigned to p. The selection weights q partition an interval set; the intervals contained in the set are [q_0, q_1], [q_1, q_2], [q_2, q_3], ..., [q_i, q_{i+1}], ..., [q_{m-1}, q_m], and the interval [q_i, q_{i+1}] corresponds to training sample (x_{i+1}, y_{i+1}); q_i is the i-th selection weight;
iii. Determine which interval of the interval set the random number p belongs to, and recall the selected training sample from the training sample set D into the subset D' of the training sample set. Set two integers l and r; l is initially 0 and r is initially m − 1, corresponding respectively to the intervals [q_0, q_1] and [q_{m-1}, q_m] of the interval set, and satisfying q_l ≤ p and q_r > p. Loop 1 proceeds by taking the midpoint U = ⌊(l + r)/2⌋; if q_U ≤ p then l = U + 1, otherwise r = U; check whether l equals r; if they are equal, exit loop 1, and if not, continue loop 1; q_l is the l-th selection weight and q_r is the r-th selection weight;
When loop 1 is exited, the interval [q_l, q_{l+1}] is removed from the interval set, (x_l, y_l) is added to the subset D' of training samples, n = n + 1, and the procedure returns to step ii;
When the value of n equals c, the selection of the subset D' of training sample set D is finished;
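By way of illustration, a minimal Python sketch of this weighted subset selection under the cumulative-weight reading of q_i above; the helper name select_subset is ours, and recomputing the cumulative weights after each draw is an illustrative simplification of the interval-removal step:

    import random
    from bisect import bisect_right

    def select_subset(samples, weights, lam):
        """Select c = round(m * lam) samples without replacement, each draw
        proportional to the sample weights (roulette-wheel selection)."""
        m = len(samples)
        c = max(1, round(m * lam))
        pool = list(range(m))
        subset = []
        for _ in range(c):
            # q_i are the cumulative weights of the remaining candidates.
            q, total = [], 0.0
            for idx in pool:
                total += weights[idx]
                q.append(total)
            p = random.uniform(0.0, total)              # p = rand(0, q_m)
            j = min(bisect_right(q, p), len(pool) - 1)  # find p's interval
            subset.append(samples[pool[j]])
            pool.pop(j)                                 # remove the chosen interval
        return subset

Here bisect_right performs the same job as loop 1: a binary search over the sorted cumulative weights for the interval containing p.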
2.b) The subset D' of the training sample set D and the weight w_i^(t) of the i-th training sample point x_i at the t-th iteration are taken as input parameters for classification; the iterative classification is h^(t) = L(D', w^(t)), where L denotes learning training carried out on the subset D' of the training sample set D with the weight values w_i^(t), and h^(t) is the classification result of weak classifier h at the t-th iteration. The text classification module uses the ELM model as the learning training model of the weak classifier h. The error e^(t) of the t-th iteration's classification is:

e^(t) = [ Σ_{i: h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ} w_i^(t) ] / [ Σ_{i=1}^{m} w_i^(t) ]

where h^(t)(x_i) denotes the classification result obtained by applying weak classifier h to training sample point x_i at the t-th iteration; h^(t)(x_i) ≠ y_i means that the classification result of the i-th training sample point x_i does not equal its class label y_i; h^(t)(x_i) = ŷ means that the classification result of the i-th training sample point x_i equals some result other than its class label, ŷ being a classification result other than the class label. The formula accumulates the weights w_i^(t), at the t-th iteration, of all sample points satisfying h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ, and the proportion this sum takes of the total weight is computed as the error of the t-th iteration. ŷ denotes any classification result that is not the i-th class label y_i; if h^(t)(x_i) is empty, the sample point x_i is by default not classified by classifier h during the t-th round of iteration;
2.c) The weight β^(t) of weak classifier h at the t-th iteration is calculated as follows:

β^(t) = ln((1 − e^(t)) / e^(t))

In each iteration, the weight β^(t) of weak classifier h at the t-th iteration is computed by the above formula; β denotes the weight of the weak classifier h;
2.d) The weights of the training samples are readjusted; the adjustment formula is as follows:

w_i^(t+1) = w_i^(t) · e^{β^(t) · I(h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ)}

The adjustment is constrained by the value of the weight β^(t) of the weak classifier h of the previous step at the t-th iteration: h^(t)(x_i) denotes the class label assigned to the i-th sample point x_i by weak classifier h at the t-th iteration, x_i denotes the i-th sample point, and y_i denotes the i-th class label. For sample points satisfying h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ at the t-th iteration, the weight β^(t) is used as the exponent of e, and the result is multiplied by w_i^(t) to serve as the weight of the training sample for the next iteration; w_i^(t) is the weight of the i-th training sample at the t-th iteration, and w_i^(t+1) is the weight of the i-th training sample at the (t+1)-th iteration;
2.e) The training sample weights w_i^(t+1) for the next round are normalized; the normalization formula is as follows:

w_i^(t+1) ← w_i^(t+1) / Σ_{j=1}^{m} w_j^(t+1)

Then t = t + 1; the iteration is exited when t equals T;
Finally, the classifier H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k) is calculated.
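By way of illustration, a minimal Python sketch that puts steps 1) through 2.e) together; the base learner is a stand-in for the ELM weak classifier, and the error, β, and weight-update formulas follow the AdaBoost-style reading reconstructed above, so this is a sketch under those assumptions rather than the patent's exact procedure:

    import numpy as np

    def boost(X, y, train_weak, T=10, lam=0.8):
        """AdaBoost-style loop: weighted subset -> weak learner -> error e^(t),
        classifier weight beta^(t), sample re-weighting, normalization."""
        m = len(y)
        w = np.full(m, 1.0 / m)                 # step 1): w_i^(1) = 1/m
        learners, betas = [], []
        for _ in range(T):
            c = max(1, int(m * lam))            # step 2.a): subset of size m*lam
            idx = np.random.choice(m, size=c, replace=False, p=w / w.sum())
            h = train_weak(X[idx], y[idx])      # step 2.b): h maps X -> labels
            wrong = h(X) != y
            e = w[wrong].sum() / w.sum()        # weighted error e^(t)
            if e <= 0.0 or e >= 0.5:            # degenerate weak learner: stop
                break
            beta = np.log((1.0 - e) / e)        # step 2.c): beta^(t)
            w = w * np.exp(beta * wrong)        # step 2.d): boost wrong samples
            w = w / w.sum()                     # step 2.e): normalize
            learners.append(h)
            betas.append(beta)

        def H(Xq):                              # H(x) = arg max_k sum beta*I(h=k)
            classes = np.unique(y)
            votes = np.zeros((len(Xq), len(classes)))
            for h, beta in zip(learners, betas):
                votes[np.arange(len(Xq)), np.searchsorted(classes, h(Xq))] += beta
            return classes[np.argmax(votes, axis=1)]

        return H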
The useful result of the present invention is: in the text classification system capable of avoiding overfitting provided by the invention, the method gives a quantitative description of the overfitting problem, using the parameter RO, the rate of overfitting, to describe whether the overfitting problem occurs and how severe it is; and, aiming at improving classification accuracy under the overfitting problem, the method also proposes a new approach.
Specific implementation mode
In order to make the technical problems to be solved, the technical solutions, and the advantages of the present invention clearer, the present invention is described in detail below in conjunction with embodiments. It should be noted that the specific embodiments described herein only explain the present invention and are not intended to limit it; any product that can realize the described functions is an equivalent replacement or improvement and is included within the protection scope of the present invention. The specific methods are as follows:
Embodiment 1: In general, text classification methods use the vector space model. In the vector space model there are multiple bags of words, and a bag of words comprises multiple words; a bag of words represents a document, and a word represents a feature. The vector space model leads to the generation of redundant vectors and produces a highly sparse matrix, and both the redundant vectors and the highly sparse matrix can cause overfitting;
The overfitting problem can be construed as follows: pursuing accuracy on the training set raises the complexity of the learning model, so that the learning model merely records non-predictive features of the training data; when the learning model processes new data, it performs poorly;
A deep neural network, being a complicated machine learning model, contains an excessive number of parameters and is easily trapped in overfitting;
The general solutions to the overfitting problem are feature dimensionality reduction, regularization, and randomly dropping neurons (dropout); at present, a quantitative description of the overfitting problem is lacking;
The present invention provides a quantitative description that addresses the overfitting problem, in which the parameter rate of overfitting is proposed; the rate of overfitting is used to describe whether the overfitting problem occurs and its severity. The present invention also designs a new algorithm, AdaBELM, to avoid the overfitting problem and improve classification accuracy;
The rate of overfitting is defined as follows:

RO = -∂o/∂u

where RO denotes the rate of overfitting, o denotes the test error, z denotes the training error, and u denotes the scale of the training set or the number of iterations; the sign of the rate of overfitting is determined by the test error, and is opposite to the change in the test error;
When the rate of overfitting is less than zero, the test error also increases as the scale of the training set or the number of iterations increases; the smaller the value of the rate of overfitting, the more severe the overfitting;
Under the premise of avoiding the overfitting problem, the test error o is reduced by adding a compensation term to increase the training error z;
A rate of overfitting greater than zero indicates that the test error decreases as the training scale increases;
The AdaBoost algorithm was proposed by Freund and Schapire, building on the boosting question posed by Kearns and Valiant. The detailed process is as follows:
At the beginning, every sample in the training sample set P = {(x_1, y_1), ..., (x_m, y_m)} is endowed with a weight of the same size; these weights form the initial weight distribution D_1. The weight assignment is repeated: the t-th weight distribution is D_t, a weak classifier h_t: X → Y is generated from the training sample set and the distribution D_t, and the error rate of the weak classifier h_t is also calculated. Here X = {x_1, ..., x_m} are the sample points, Y = {y_1, ..., y_m} are the class labels of the training sample set, y_1, ..., y_m ∈ {0, 1, ..., C}, and m is the number of sample points, a positive integer. The weights of the samples that were not classified correctly are increased, and the weights of the correctly classified samples are reduced; the value of D_{t+1} for the next cycle is updated accordingly, and a new base learner is generated from the training sample set and D_{t+1}. The above process repeats for T cycles in total, T a positive integer, and the final model is obtained by a majority vote of the T weak learners. The purpose of the algorithm is to learn a series of weak classifiers, or base classifiers, from the training sample set, and then combine the weak classifiers, or base classifiers, into a strong classifier. Described in steps, the process is:
Preprocessing defines the input of the algorithm and the output of the algorithm. The input of the algorithm is the training sample set D = {(x_1, y_1), ..., (x_m, y_m)} and the number of iterations T; the output of the algorithm is the classifier H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k).
1) The weight of each training sample is initialized to w_i^(1) = 1/m; w_i^(1) is the weight of a training sample in the training sample set, and i denotes the i-th training sample (x_i, y_i) in the training sample set; every training sample is assigned the weight value 1/m, and the superscript (1) denotes the initial training-sample weights;
2) T iterations are carried out; the number of the current iteration is t, t = 1, ..., T
2.a) For the weight distribution D_t and w_i^(t), obtain the weak classifier h^(t) = L(D_t, w^(t)); L denotes learning training carried out on the training sample set with the weight values w_i^(t) and the weight distribution D_t; h^(t) is the weak classifier, which classifies a training sample X as −1 or 1. The classification error e^(t) is:

e^(t) = [ Σ_{i: h^(t)(x_i) ≠ y_i} w_i^(t) ] / [ Σ_{i=1}^{m} w_i^(t) ]

where the function I denotes a restriction on w_i^(t); h^(t)(x_i) denotes the class label assigned to x_i by weak classifier h at the t-th iteration, x_i denotes the i-th sample point, and y_i denotes the i-th class label; the weights corresponding to the training samples that the weak classifier classifies incorrectly are added up;
2.b) The weight β^(t) of the weak classifier is calculated as follows:

β^(t) = ln((1 − e^(t)) / e^(t))

In the t-th round of iteration, the weight β^(t) of the weak classifier is computed by the above formula;
2.c) The weights of the training samples are readjusted; the adjustment formula is as follows:

w_i^(t+1) = w_i^(t) · e^{β^(t) · I(h^(t)(x_i) ≠ y_i)}

The adjustment is constrained by the value of the weight β^(t) of the weak classifier from the previous step: h^(t)(x_i) denotes the class label assigned to x_i by the weak classifier h at the t-th iteration, x_i denotes the i-th sample point, and y_i denotes the i-th class label. For training samples whose classification result is incorrect, the weight of the weak classifier is used as the exponent of e, and the result is multiplied by w_i^(t) to give the weight for the next iteration; w_i^(t) is the weight of the i-th training sample at the t-th iteration, and w_i^(t+1) is the weight of the i-th training sample at the (t+1)-th iteration;
Finally, the training sample weights w_i^(t+1) for the next round are normalized; the normalization formula is as follows:

w_i^(t+1) ← w_i^(t+1) / Σ_{j=1}^{m} w_j^(t+1)
The classifiers are assembled to obtain the new classifier H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k), i.e., the value of the argument at which the function attains its maximum is sought.
Embodiment 2: the mechanism of the extreme learning machine (ELM) model;
ELM is a single-hidden-layer feedforward network that trains quickly on the training set; the network has only three layers: the input layer, the hidden layer, and the output layer. Suppose there are N arbitrary samples (x_i, t_i), where x_i = [x_{i1}, x_{i2}, ..., x_{in}]^T ∈ R^n and t_i = [t_{i1}, t_{i2}, ..., t_{im}]^T ∈ R^m. A single-hidden-layer neural network with L hidden nodes can be expressed as

Σ_{i=1}^{L} β_i · g(W_i · x_j + b_i) = o_j,   j = 1, ..., N

where g = g(x) is the activation function, W_i = [w_{i,1}, w_{i,2}, ..., w_{i,n}]^T is the input weight, β_i is the output weight, and b_i is the bias of the i-th hidden unit; W_i · x_j denotes the inner product of W_i and x_j.
The goal of single-hidden-layer network learning is to minimize the output error, which can be expressed as

Σ_{j=1}^{N} ||o_j − t_j|| = 0

i.e., there exist β_i, W_i and b_i such that

Σ_{i=1}^{L} β_i · g(W_i · x_j + b_i) = t_j,   j = 1, ..., N

This can be expressed in matrix form as

Hβ = T

where H is the output of the hidden nodes, β is the output weight, and T is the desired output.
In order to train the single-hidden-layer network, we want to obtain Ŵ_i, b̂_i and β̂ such that

||H(Ŵ_i, b̂_i)β̂ − T|| = min_{W,b,β} ||H(W_i, b_i)β − T||

where i = 1, ..., L; this is equivalent to minimizing the loss function

E = Σ_{j=1}^{N} || Σ_{i=1}^{L} β_i · g(W_i · x_j + b_i) − t_j ||²

Some traditional algorithms based on gradient descent can be used to solve this problem, but a gradient-based learning algorithm needs to adjust all parameters during the iteration. In the ELM algorithm, once the input weights W_i and the hidden-layer biases b_i are determined at random, the output matrix H of the hidden layer is uniquely determined. Training the single-hidden-layer network is then converted into solving the linear system Hβ = T, and the output weight β can be determined as

β̂ = H⁺T

where H⁺ is the Moore-Penrose generalized inverse of the matrix H; it can be proved that the norm of the solution β̂ obtained in this way is minimal and that the solution is unique.
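By way of illustration, a minimal NumPy sketch of this ELM training rule; the sigmoid activation and the function names are our choices:

    import numpy as np

    def elm_train(X, T_target, L, rng=np.random.default_rng(0)):
        """Train an ELM: random input weights and biases, hidden output H,
        then output weights beta = pinv(H) @ T (the Moore-Penrose solution)."""
        n = X.shape[1]
        W = rng.standard_normal((n, L))          # random input weights W_i
        b = rng.standard_normal(L)               # random hidden biases b_i
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # hidden output, g = sigmoid
        beta = np.linalg.pinv(H) @ T_target      # beta = H^+ T
        return W, b, beta

    def elm_predict(X, W, b, beta):
        H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
        return H @ beta                          # scores; take argmax for labels

Because W and b are never revisited, training reduces to a single pseudo-inverse, which is what makes ELM fast compared with gradient-based training.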
The present invention can avoid the problem of overfitting and obtain a text classification method with good performance: the extreme learning machine is used as the base classifier, a training subset is used to train each base classifier, and the parameter λ is used to control the scale of the training set, where 0 < λ ≤ 1. Specifically, when 0 < λ < 1, a subset of the training samples is selected as the training set for each base classifier; when λ = 1, the entire training set is used. Regarding the selection of the training subset: for data not yet selected, the probability of being selected in the next round of the cycle is increased. The present invention can not only effectively handle big-data problems but can at the same time avoid the overfitting problem.
The method is divided into two steps: the first step, a preprocessing process, and the second step, the specific processing procedure;
In the preprocessing process, the input and output of the method are defined. The input is the training sample set D = {(x_1, y_1), ..., (x_m, y_m)}, where D is the training sample set, (x_1, y_1), ..., (x_m, y_m) are the training samples, X = {x_1, ..., x_m} are the sample points, Y = {y_1, ..., y_m} are the class labels of the training sample set, y_1, ..., y_m ∈ {0, 1, ..., C}, m is the number of sample points and training samples, x is a sample point, and y is a class label, serving as the classification result for the sample point after the classifier is applied; a sample point and a class label with the same index correspond to each other. The output of the algorithm is the classifier

H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k)

where H(x) is the classifier, β^(t) denotes the weight of the weak classifier in the t-th iteration, t indicates being in the t-th iteration, and β denotes the weight of the weak classifier; the function I expresses a qualifying relation that restricts the function appearing before it, outputting the value of the preceding function only when the condition inside its brackets is satisfied; for example, β^(t) · I(h^(t)(x) = k) denotes the output of the weight β^(t) for terms satisfying the formula h^(t)(x) = k; h^(t)(x) denotes the classification result of the classifier on sample point x at the t-th iteration, k denotes a candidate class label over which the classification results are temporarily accumulated, h denotes the classification handling result of the classifier on sample point x, and T denotes the maximum number of iterations, a positive integer. The arg max adds up the weights of the classifiers over all T iterations and chooses the class k that makes this sum reach its maximum, which gives the classification result of the classifier;
The specific processing procedure is as follows:
1) The weight of each training sample is initialized to w_i^(1) = 1/m; w_i^(1) is the weight of the i-th training sample in the training sample set, i is a positive integer, the superscript (1) denotes the 1st iteration, and m denotes the number of sample points;
2) The iteration starts; the total number of iterations is set to T; the iteration parameter is t, a positive integer, indicating the t-th round of iteration, t = 1, ..., T;
2.a) Input the parameter λ and the training sample set D and select a subset of D; the number of selected training sample points is c = m × λ, 0 < λ ≤ 1. The detailed process is: take the training sample set D = {(x_1, y_1), ..., (x_m, y_m)} as the candidate set and select one training sample point from the candidate set at a time, until c points have been selected, c a positive integer. The selection steps are: initialize the integer n = 0; n is the selection counter, used to count the number of training samples already selected;
i. Each training sample point (x_i, y_i) is assigned a selection weight q_i, i = 1, ..., m, with q_i = w_1 + w_2 + ... + w_i; the initial value of the selection weights is q_0 = 0;
ii. A random number p is generated, equal to rand(0, q_m), i.e., a number between 0 and q_m is generated at random. The selection weights q_i partition an interval set whose intervals are [q_0, q_1], [q_1, q_2], [q_2, q_3], ..., [q_i, q_{i+1}], ..., [q_{m-1}, q_m]; the interval [q_i, q_{i+1}] corresponds to training sample point (x_{i+1}, y_{i+1}). When the random number p falls within the interval [q_i, q_{i+1}], the corresponding training sample point is selected and the interval [q_i, q_{i+1}] is removed from the interval set, at which point n = n + 1, and the procedure returns to step ii;
iii. Determine which interval of the interval set the random number p belongs to, and recall the corresponding training sample from the training sample set D into the subset D' of the training sample set. Set two integers l and r; l is initially 0 and r is initially m − 1, corresponding respectively to the intervals [q_0, q_1] and [q_{m-1}, q_m] of the interval set, and satisfying q_l ≤ p and q_r > p. Loop 1 proceeds by taking the midpoint U = ⌊(l + r)/2⌋; if q_U ≤ p then l = U + 1, otherwise r = U; check whether l equals r; if they are equal, exit loop 1, and if not, continue loop 1;
The interval [q_l, q_{l+1}] is then removed from the interval set, (x_l, y_l) is added to the subset of training samples, n = n + 1, and the procedure returns to step ii;
When the value of n equals c, the selection of the training subset is finished;
2.b) The subset of training samples and the weights w_i^(t) are taken as input parameters for classification; the iterative classification is h^(t) = L(D', w^(t)), where L denotes learning training carried out on the training subset D' with the weight values w_i^(t), and h^(t) is the classification result of weak classifier h at the t-th iteration. The method uses the ELM model as the learning training model of the weak classifier h; the error e^(t) of the t-th iteration's classification is:

e^(t) = [ Σ_{i: h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ} w_i^(t) ] / [ Σ_{i=1}^{m} w_i^(t) ]

where h^(t)(x_i) denotes the classification result obtained by applying weak classifier h to sample point x_i at the t-th iteration; h^(t)(x_i) ≠ y_i means that the classification result of the i-th sample point x_i does not equal its class label y_i; h^(t)(x_i) = ŷ means that the classification result of the i-th sample point x_i equals some result other than its class label, ŷ being a classification result other than the class label. The formula accumulates the weights w_i^(t), at the t-th iteration, of all sample points satisfying h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ, and the proportion this sum takes of the total weight is computed as the error of the t-th iteration. ŷ denotes any classification result that is not the class label y_i; if h^(t)(x_i) is empty, the sample point x_i is by default not classified by classifier h during the t-th round of iteration;
2.c) The weight β^(t) of weak classifier h at the t-th iteration is calculated as follows:

β^(t) = ln((1 − e^(t)) / e^(t))

In each round of iteration, the weight β^(t) of weak classifier h at the t-th iteration is computed by the above formula; β denotes the weight of the weak classifier h;
2.d) The weights of the training samples are readjusted; the adjustment formula is as follows:

w_i^(t+1) = w_i^(t) · e^{β^(t) · I(h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ)}

The adjustment is constrained by the value of the weight β^(t) of the weak classifier h of the previous step at the t-th iteration: h^(t)(x_i) denotes the class label assigned to the i-th sample point x_i by weak classifier h at the t-th iteration, x_i denotes the i-th sample point, and y_i denotes the i-th class label. For sample points satisfying h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ at the t-th iteration, the weight β^(t) is used as the exponent of e, and the result is multiplied by w_i^(t) to serve as the weight of the training sample for the next iteration; w_i^(t) is the weight of the i-th training sample at the t-th iteration, and w_i^(t+1) is the weight of the i-th training sample at the (t+1)-th iteration;
2.e) The training sample weights w_i^(t+1) for the next round are normalized; the normalization formula is as follows:

w_i^(t+1) ← w_i^(t+1) / Σ_{j=1}^{m} w_j^(t+1)
Although preferred embodiments of the present invention have been described, once persons skilled in the art learn of the basic inventive concept, they may make additional changes and modifications to these embodiments. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the present invention.
Obviously, those skilled in the art can make various modifications and variations to the embodiments of the present invention without departing from the spirit and scope of the embodiments of the present invention. Thus, if these modifications and variations of the embodiments of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include these modifications and variations.

Claims (1)

1. A text classification system capable of avoiding overfitting, characterized by comprising the following:
The text classification system comprises an overfitting-rate judgment module and a text classification module;
The overfitting-rate judgment module is used to judge the severity of overfitting and can provide the text classification module with a method for judging the rate of overfitting; the overfitting-rate judgment module judges using the rate of overfitting RO, which is defined as follows:

RO = -∂o/∂u

where o denotes the test error, z denotes the training error, and u denotes the scale of the training set or the number of iterations; the sign of the rate of overfitting is determined by the test error and is opposite to the change in the test error. When the rate of overfitting is less than zero, the test error o also increases as the scale of the training set or the number of iterations increases; the smaller the value of the rate of overfitting RO, the more severe the overfitting. If the test error o needs to be reduced further, a compensation term is added to increase the training error z and thereby avoid overfitting. When the rate of overfitting equals 0 and the training set grows further, the test error no longer changes;
A rate of overfitting greater than zero indicates that the test error decreases as the training scale increases; as long as the rate of overfitting is positive, the overfitting problem no longer occurs; the training set here refers to the training sample set;
The text classification module is responsible for classifying text and works in two steps: the first step, a preprocessing process, and the second step, the specific processing procedure;
In the preprocessing process, the text classification module defines its input and output. The input of the text classification module is the training sample set D = {(x_1, y_1), ..., (x_i, y_i), ..., (x_m, y_m)}, where D is the training sample set, (x_1, y_1), ..., (x_i, y_i), ..., (x_m, y_m) are the training samples, X = {x_1, ..., x_i, ..., x_m} are the training sample points, and Y = {y_1, ..., y_i, ..., y_m} are the class labels of the training sample set; x is a training sample point and y is the class label of a training sample point, serving as the classification result for the sample point after the classifier is applied; m is the number of training sample points, calculated after the user inputs the training sample set. A training sample point and a class label with the same index correspond one to one, i.e., the classification result of the i-th training sample point x_i is y_i, i = 1, ..., m. The output of the text classification module is the classifier

H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k)

where H(x) is the classification result of classifier H on training sample point x, β^(t) denotes the weight of the weak classifier h in the t-th iteration, t indicates being in the t-th iteration, and β denotes the weight of the weak classifier h; the function I expresses a qualifying relation that restricts the function appearing immediately before it, outputting the value of that function only when the condition inside its brackets is satisfied; h^(t)(x) denotes the classification handling result of the weak classifier h on training sample point x at the t-th iteration, k denotes a candidate class label over which the classification results are temporarily accumulated, and T denotes the maximum number of iterations, a positive integer. The arg max adds up the weights of the classifiers over all T iterations and chooses the class k that makes this sum reach its maximum, which gives the classification result of the classifier;
In the specific processing procedure, the text classification module executes as follows:
1) The weight of every training sample point is initialized to w_i^(1) = 1/m; w_i^(1) is the weight of the i-th training sample point x_i in the training sample set at the 1st iteration, i is a positive integer denoting the index of the training sample point in the training sample set, the superscript (1) indicates that the training sample is at the 1st iteration, and m denotes the number of training sample points in the training sample set;
2) The iteration over the training sample set begins; the total number of iterations is set to T, T a positive integer; the iteration parameter is t, a positive integer, indicating the t-th iteration, t = 1, ..., T;
2.a) The text classification module inputs the parameter λ and the training sample set D and selects a subset D' of D; the number of selected training sample points is c = m × λ, i.e., the subset D' contains c training sample points, where 0 < λ ≤ 1. The detailed process is: take the training sample set D = {(x_1, y_1), ..., (x_m, y_m)} as the candidate set and select one training sample point from the candidate set at a time, until c points have been selected, c a positive integer. The selection steps are:
First, initialize n = 0; n is the selection counter, used to count the number of training samples already selected, and n is a positive integer;
i. The text classification module assigns each training sample (x_i, y_i) a selection weight q_i, i = 1, ..., m, where q_i = w_1 + w_2 + ... + w_i; the initial value of the selection weights is q_0 = 0, where q is any selection weight, j is a positive integer, and w_j is the weight of the j-th training sample point x_j;
ii. The text classification module generates a random number p equal to rand(0, q_m), i.e., a number between 0 and q_m is generated at random and assigned to p. Then the selection weights q partition an interval set; the intervals contained in the set are [q_0, q_1], [q_1, q_2], [q_2, q_3], ..., [q_i, q_{i+1}], ..., [q_{m-1}, q_m], and the interval [q_i, q_{i+1}] corresponds to training sample (x_{i+1}, y_{i+1}); q_i is the i-th selection weight and q_m denotes the m-th selection weight;
iii. The text classification module determines which interval of the interval set the random number p belongs to, and recalls the selected training sample from the training sample set D into the subset D' of the training sample set. Two integers l and r are set; l is initially 0 and r is initially m − 1, corresponding respectively to the intervals [q_0, q_1] and [q_{m-1}, q_m] of the interval set, and satisfying q_l ≤ p and q_r > p. The text classification module sets up loop 1: take the midpoint U = ⌊(l + r)/2⌋; if q_U ≤ p then l = U + 1, otherwise r = U; check whether l equals r; if they are equal, exit loop 1, and if not, continue loop 1; here q_l is the l-th selection weight and q_r is the r-th selection weight;
When loop 1 is exited, the interval [q_l, q_{l+1}] is removed from the interval set, (x_l, y_l) is added to the training subset D', n = n + 1, and the procedure returns to step ii;
When the value of n equals c, the selection of the subset D' of the training sample set D is entirely finished;
2.b) The text classification module takes the subset D' of the training sample set D and the weight w_i^(t) of the i-th training sample point x_i at the t-th iteration as input parameters for classification; the iterative classification is h^(t) = L(D', w^(t)), where L denotes learning training carried out on the subset D' of the training sample set D with the weight values w_i^(t), and h^(t) is the classification result of weak classifier h at the t-th iteration. The text classification module uses the ELM model as the learning training model of the weak classifier h, and the error e^(t) of the t-th iteration's classification is:

e^(t) = [ Σ_{i: h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ} w_i^(t) ] / [ Σ_{i=1}^{m} w_i^(t) ]

where h^(t)(x_i) denotes the classification result obtained by applying weak classifier h to training sample point x_i at the t-th iteration; h^(t)(x_i) ≠ y_i means that the classification result of the i-th training sample point x_i does not equal its class label y_i; h^(t)(x_i) = ŷ means that the classification result of the i-th training sample point x_i equals some result other than its class label, ŷ being a classification result other than the class label;
The calculation of e^(t) accumulates the weights w_i^(t), at the t-th iteration, of all sample points satisfying h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ, and the proportion this sum takes of the total weight is the error of the t-th iteration; ŷ denotes any classification result that is not the i-th class label y_i; if h^(t)(x_i) is empty, the sample point x_i is by default not classified by classifier h during the t-th round of iteration;
2.c) The text classification module calculates the weight β^(t) of weak classifier h at the t-th iteration as follows:

β^(t) = ln((1 − e^(t)) / e^(t))

In each iteration, the weight β^(t) of weak classifier h at the t-th iteration is computed by the above formula; β denotes the weight of the weak classifier h;
2.d) The text classification module readjusts the weights of the training samples; the adjustment formula is as follows:

w_i^(t+1) = w_i^(t) · e^{β^(t) · I(h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ)}

The value of w_i^(t+1) is constrained by the value of the weight β^(t) of the weak classifier h of the previous step at the t-th iteration: h^(t)(x_i) denotes the class label assigned to the i-th sample point x_i by weak classifier h at the t-th iteration, x_i denotes the i-th sample point, and y_i denotes the i-th class label. For sample points satisfying h^(t)(x_i) ≠ y_i or h^(t)(x_i) = ŷ at the t-th iteration, the weight β^(t) is used as the exponent of e, and the result is multiplied by w_i^(t) to serve as the weight of the training sample for the next iteration; w_i^(t) is the weight of the i-th training sample at the t-th iteration, and w_i^(t+1) is the weight of the i-th training sample at the (t+1)-th iteration;
2.e) Finally, the text classification module normalizes the training sample weights w_i^(t+1) for the next round; the normalization formula is as follows:

w_i^(t+1) ← w_i^(t+1) / Σ_{j=1}^{m} w_j^(t+1)

Then t = t + 1; the iteration is exited when t equals T;
Finally, the classifier H(x) = arg max_k Σ_{t=1}^{T} β^(t) · I(h^(t)(x) = k) is calculated according to the above steps.
CN201810447545.5A 2018-05-11 2018-05-11 Text classification system capable of avoiding overfitting Pending CN108647325A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810447545.5A CN108647325A (en) 2018-05-11 2018-05-11 Text classification system capable of avoiding overfitting

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810447545.5A CN108647325A (en) 2018-05-11 2018-05-11 Text classification system capable of avoiding overfitting

Publications (1)

Publication Number Publication Date
CN108647325A 2018-10-12

Family

ID=63754547

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810447545.5A Pending CN108647325A (en) 2018-05-11 2018-05-11 Text classification system capable of avoiding overfitting

Country Status (1)

Country Link
CN (1) CN108647325A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109597888A * 2018-11-19 2019-04-09 北京百度网讯科技有限公司 Method and apparatus for establishing a text-domain recognition model
CN112836051A (en) * 2021-02-19 2021-05-25 太极计算机股份有限公司 Online self-learning court electronic file text classification method
CN112836051B (en) * 2021-02-19 2024-03-26 太极计算机股份有限公司 Online self-learning court electronic file text classification method

Similar Documents

Publication Publication Date Title
Sharma, Deep challenges associated with deep learning
Ye, Data mining: theories, algorithms, and examples
CN107861951A Method for identifying session topics in intelligent customer service
CN110033281A Method and device for switching from intelligent customer service to human customer service
CN109118013A Neural-network-based management data prediction method, readable storage medium, and forecasting system
CN103309953B Method for labeling and searching diversified pictures based on an ensemble of multiple RBFNN classifiers
CN108829763A Attribute prediction method for film-review website users based on deep neural networks
Luo et al., Sparse group restricted Boltzmann machines
CN106845717A Energy efficiency evaluation method based on a multi-model fusion strategy
CN106650920A Prediction model based on an optimized extreme learning machine (ELM)
Sun et al., Linguistic value soft set-based approach to multiple criteria group decision-making
CN107301604A Multi-model fusion estimation system
CN103605711A Construction method and device, and classification method and device, for a support vector machine
CN103400190A Integrated framework method for optimizing an extreme learning machine with a genetic algorithm
CN109934306A Multi-label attribute-value division method and device based on random walk
CN109886755A Communication user churn prediction method and system based on an evolutionary algorithm
Babu et al., A novel artificial neural network algorithm for prediction of natural gas prices using machine learning
CN108647325A Text classification system capable of avoiding overfitting
CN108875034A Chinese text classification method based on a hierarchical long short-term memory network
CN110457470A Text classification model learning method and device
CN107368611B Short text classification method
CN107305565A Information processing apparatus, information processing method, and information processing device
CN109447767A Commodity evaluation method and system applied to e-commerce
CN109657779A DNN-based model data processing method, data processing model, and electronic device
Qiao et al., SRS-DNN: a deep neural network with strengthening response sparsity

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181012)