CN106485528A - Method and apparatus for detecting data - Google Patents
- Publication number
- CN106485528A (application CN201510552641.2A)
- Authority
- CN
- China
- Prior art keywords
- sample
- data
- mark
- weight
- parameter
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
This application discloses a method and apparatus for detecting data. The method includes: reading a confidence interval determined based on the sample weights of labeled samples, where the sample weights are obtained by training a weight model on labeled samples acquired in advance; judging whether the data to be verified falls within the confidence interval, to obtain a first judgment result; and, according to the first judgment result, judging whether the data to be verified is valid data, to obtain a second judgment result. The application addresses the technical problem that check results are inaccurate when the legitimacy of data is verified, and achieves accurate verification of data legitimacy.
Description
Technical field
The application relates to the field of data processing, and in particular to a method and apparatus for detecting data.
Background
On an Internet virtual trading platform, the virtual resource data of all kinds of virtual objects is mixed together. To better manage this data and distinguish its legitimacy, a data confidence interval can be used to decide whether the virtual resource data (such as a mobile phone price) published by an account (such as a merchant) is legitimate. On today's e-commerce websites, merchants publish prices for many categories of goods; the prices vary widely and the volume is large. Relying on manual judgment to classify this data incurs a high labor cost, and manual judgment is highly subjective, so the results are inaccurate.
The prior art provides a method for estimating a data confidence interval. After preprocessing the data, the method directly computes the mean and variance of the data and, according to a preset multiple of the variance, determines the estimated confidence interval (that is, an upper limit and a lower limit). Data is then classified by judging whether it falls within that interval. Because this method determines the confidence interval only from the mean and variance of the data values, the resulting interval is inaccurate, so both the accuracy and the stability of judging new data to be verified are low.
No effective solution has yet been proposed for the problem of inaccurate check results when verifying the legitimacy of data.
Summary of the invention
The embodiments of the present application provide a method and apparatus for detecting data, to at least solve the technical problem that check results are inaccurate when the legitimacy of data is verified.
According to one aspect of the embodiments of the present application, a method for detecting data is provided. The method includes: reading a confidence interval determined based on the sample weights of labeled samples, where the sample weights are obtained by training a weight model on labeled samples acquired in advance; judging whether the data to be verified falls within the confidence interval, to obtain a first judgment result; and, according to the first judgment result, judging whether the data to be verified is valid data, to obtain a second judgment result.
According to another aspect of the embodiments of the present application, an apparatus for detecting data is also provided. The apparatus includes: a reading module, configured to read the confidence interval determined based on the sample weights of labeled samples, where the sample weights are obtained by training a weight model on labeled samples acquired in advance; a first judging module, configured to judge whether the data to be verified falls within the confidence interval, to obtain a first judgment result; and a second judging module, configured to judge, according to the first judgment result, whether the data to be verified is valid data, to obtain a second judgment result.
In the embodiments of the present application, sample weights are obtained by training, and the confidence interval is determined based on those sample weights. When the confidence interval is determined, the sample weights distinguish the importance of the labeled samples: the positive influence of the more credible labeled samples on the confidence interval is increased, and the disturbing influence of corrupt data among the labeled samples is reduced, so that the confidence interval approaches the true confidence interval of the data. All kinds of data to be verified are then classified automatically with this interval, which ensures the accuracy, stability, and reliability of the estimate. The application thereby achieves accurate verification of data legitimacy and solves the technical problem that check results are inaccurate when the legitimacy of data is verified.
Brief description of the drawings
The accompanying drawings described here are provided for further understanding of the application and constitute a part of it. The illustrative embodiments of the application and their descriptions are used to explain the application and do not constitute an improper limitation on it. In the drawings:
Fig. 1 is a hardware block diagram of a terminal according to an embodiment of the present application;
Fig. 2 is a flowchart of a method for detecting data according to an embodiment of the present application;
Fig. 3 is a flowchart of training labeled samples with the weight model according to an embodiment of the present application;
Fig. 4 is a flowchart of an optional method for detecting data according to an embodiment of the present application;
Fig. 5 is a schematic diagram of an apparatus for detecting data according to an embodiment of the present application;
Fig. 6 is a schematic diagram of an optional apparatus for detecting data according to an embodiment of the present application;
Fig. 7 is a schematic diagram of another optional apparatus for detecting data according to an embodiment of the present application; and
Fig. 8 is a schematic diagram of the network environment of a terminal according to an embodiment of the present application.
Detailed description
To help those skilled in the art better understand the solution of the application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the application, without creative effort, fall within the scope of protection of the application.
It should be noted that the terms "first", "second", and so on in the description, the claims, and the above drawings are used to distinguish similar objects, not to describe a specific order or sequence. It should be understood that data used in this way can be interchanged where appropriate, so that the embodiments described here can be implemented in orders other than those illustrated or described. In addition, the terms "comprising" and "having" and any variants of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Embodiment 1
According to an embodiment of the present application, a method embodiment for detecting data is provided. It should be noted that the steps shown in the flowcharts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flowcharts, in some cases the steps shown or described can be executed in a different order.
The method embodiment provided by the embodiments of the present application can be executed on a mobile terminal, a computer terminal, or a similar computing device. Taking execution on a computer terminal as an example, Fig. 1 is a hardware block diagram of a terminal according to an embodiment of the present application. As shown in Fig. 1, terminal 10 can include one or more processors 102 (only one is shown in the figure; a processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission device 106 for communication. A person of ordinary skill in the art will understand that the structure shown in Fig. 1 is only illustrative and does not limit the structure of the electronic device. For example, terminal 10 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1.
The memory 104 can be used to store the software programs and modules of application software, such as the program instructions/modules corresponding to the data detection method in the embodiments of the present application. The processor 102 runs the software programs and modules stored in the memory 104 to execute various functional applications and data processing, that is, to implement the data detection method described above. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory can be connected to terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations of these.
The transmission device 106 is used to receive or send data over a network. A specific example of such a network is a wireless network provided by the communication provider of terminal 10. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, NIC), which can be connected to other network devices through a base station so as to communicate with the Internet. In another example, the transmission device 106 can be a radio frequency (Radio Frequency, RF) module, which communicates with the Internet wirelessly.
In the running environment described above, the application provides the method for detecting data shown in Fig. 2. Fig. 2 is a flowchart of a method for detecting data according to an embodiment of the present application.
As shown in Fig. 2, the method for detecting data includes the following steps:
Step S202: Read the confidence interval determined based on the sample weights of labeled samples, where the sample weights are obtained by training a weight model on labeled samples acquired in advance.
Step S204: Judge whether the data to be verified falls within the confidence interval, to obtain a first judgment result.
Step S206: According to the first judgment result, judge whether the data to be verified is valid data, to obtain a second judgment result.
Optionally, if the first judgment result indicates that the data to be verified is within the confidence interval, the data to be verified is judged to be valid data; if the first judgment result indicates that the data to be verified is not within the confidence interval, the data to be verified is judged to be invalid data.
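Steps S204 and S206 can be sketched in a few lines of Python. This is only an illustration, not the patent's implementation: the function names are hypothetical, and treating the interval bounds as inclusive is an assumption the text does not settle.

```python
from typing import Tuple

Interval = Tuple[float, float]  # (lower limit, upper limit)


def first_judgment(value: float, interval: Interval) -> bool:
    """Step S204: is the data to be verified inside the confidence interval?

    Assumption: both bounds are treated as inclusive.
    """
    lower, upper = interval
    return lower <= value <= upper


def second_judgment(in_interval: bool) -> str:
    """Step S206: map the first judgment result to valid / invalid."""
    return "valid" if in_interval else "invalid"


def detect(value: float, interval: Interval) -> str:
    """Run S204 followed by S206 for one piece of data to be verified."""
    return second_judgment(first_judgment(value, interval))
```

The interval itself (step S202) is passed in as an argument here; in the described method it would be read from wherever the previously computed interval is stored.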
In the above embodiment of the present application, sample weights can be obtained by training the labeled samples with the model that has been built, and the confidence interval is determined based on these sample weights. When the confidence interval is determined, the sample weights distinguish the importance (for example, the data quality) of the labeled samples: the positive influence of the more credible labeled samples on the confidence interval is increased, and the disturbing influence of corrupt data among the labeled samples is reduced, so that the confidence interval approaches the true confidence interval of the data. All kinds of data to be verified are then classified automatically with this interval, which ensures the accuracy, stability, and reliability of the estimate. In other words, the weight model distinguishes good data from bad, the confidence interval of the data is redetermined accordingly, and the quality of the data to be verified is then distinguished automatically based on this interval, so that the final estimate (the second judgment result) is accurate and reliable. The application thus solves the problem of inaccurate check results when verifying data legitimacy and achieves accurate verification of data legitimacy.
Specifically, after the terminal collects virtual resource information, it extracts the data to be verified from that information, reads the confidence interval corresponding to the object described by the virtual resource information, and judges whether the data to be verified falls within that interval. If it does, the data to be verified is judged to be valid data, that is, the virtual resource information is determined to be legitimate information. If it does not, the data to be verified is judged to be invalid data, that is, the virtual resource information is illegitimate information. After it has been determined whether each piece of data to be verified is valid data (and/or whether the virtual resource information is legitimate information), the judgment result is recorded and saved in the memory.
Here, virtual resource information is information that describes a virtual resource. "Virtual resource" is a term used in contrast to real-world assets: it can be any of the material elements that circulate on the Internet. Specifically, information resources compiled with databases and programs are virtual resources, for example the information about trading objects in an online library or an online shopping mall. The information of a virtual resource can be used to describe an attribute value of the virtual resource, such as the value of a trading object.
Optionally, after the data to be verified is determined to be invalid data and/or the virtual resource information is determined to be illegitimate information, the virtual resource information can be marked to indicate that it is not credible.
Optionally, when virtual resource information is marked, invalid data (or information) and valid data (or information) can be marked in different colors, or with different identifiers, to indicate that the virtual resource information is not credible.
Optionally, after the data to be verified is determined to be invalid data and/or the virtual resource information is determined to be illegitimate information, the invalid data or information can be deleted from the corresponding page.
Optionally, the method of the application can be used to judge the legitimacy of virtual resource parameters on a virtual trading platform. For example, a virtual trading platform can hold the virtual resource transaction information of all kinds of trading objects (such as mobile phones and household goods) published by a large number of accounts (such as virtual merchants).
The above embodiment is described in detail below, taking the virtual transaction information of mobile phone merchants on a virtual trading platform as an example:
The terminal (for example, a server) collects virtual resource transaction information from the virtual trading platform, for example: the virtual resource parameter of trading object B (such as a mobile phone) of account A (such as a mobile phone merchant) is 5000. The virtual resource parameter (the mobile phone price 5000) is extracted from this transaction information to obtain the data to be verified. The confidence interval corresponding to trading object B is then read, for example (1000, 6000) for mobile phone transactions, and each piece of data to be verified is checked against it; that is, it is judged whether the mobile phone price 5000 lies within the interval (1000, 6000). In this embodiment the price 5000 lies within (1000, 6000), so the data to be verified (the price 5000) is judged to be valid data, and this result (5000 is valid data) is recorded. If the data to be verified (for example, a mobile phone price of 50) does not lie within (1000, 6000), it is judged to be invalid data, and this result (50 is invalid data) is recorded.
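The mobile phone example can be traced with a small sketch that looks up the interval for a trading object, judges each price, and records the result. The dictionary `CONFIDENCE_INTERVALS`, the `CheckRecord` type, and the function `check_listing` are hypothetical names introduced only for illustration; inclusive bounds are again an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical store of previously computed confidence intervals,
# keyed by trading object. The (1000, 6000) interval is the example
# value used in the text.
CONFIDENCE_INTERVALS: Dict[str, Tuple[float, float]] = {
    "mobile phone": (1000.0, 6000.0),
}


@dataclass
class CheckRecord:
    account: str
    trading_object: str
    value: float
    valid: bool


def check_listing(account: str, trading_object: str, value: float,
                  records: List[CheckRecord]) -> CheckRecord:
    """Judge one virtual resource parameter and record the result."""
    lower, upper = CONFIDENCE_INTERVALS[trading_object]
    record = CheckRecord(account, trading_object, value, lower <= value <= upper)
    records.append(record)  # stands in for "recorded and saved in the memory"
    return record


records: List[CheckRecord] = []
check_listing("A", "mobile phone", 5000.0, records)  # judged valid
check_listing("A", "mobile phone", 50.0, records)    # judged invalid
```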
In the above embodiment, accurately estimating the confidence interval of the labeled samples makes it possible to judge the legitimacy of the data to be verified (such as mobile phone price data) effectively. The embodiment thereby solves the technical problem that the judgment result is inaccurate when the legitimacy of data to be verified is judged.
Optionally, before the confidence interval determined based on the sample weights of labeled samples is read, the method can include: obtaining multiple labeled samples, where each labeled sample has a sample value; extracting the attribute data of each labeled sample and building the weight model based on that attribute data; training the labeled samples with the weight model to obtain the sample weight of each labeled sample; extracting the sample value of each labeled sample, where the sample value characterizes the virtual resource parameter of the object described by the labeled sample; and determining the confidence interval of the multiple labeled samples based on the sample value and the sample weight of each labeled sample.
In the scheme disclosed in the above embodiment, before the confidence interval is read, the weight model can be built from the attribute data of each labeled sample that has been obtained. The attribute data of a labeled sample (such as the quality of the sample itself) can serve as a measure of its credibility within the distribution of the data. Building the weight model therefore makes use of the characteristics of each sample's own attribute data; that is, the credibility of the labeled samples is taken into account when the sample weights are determined, which improves the accuracy and reliability of the weight model. Once an accurate weight model has been built, the labeled samples can be trained with it to obtain accurate sample weights, and an accurate confidence interval is determined from these sample weights and the extracted sample values, which improves the accuracy and reliability of the confidence interval of the labeled samples. The confidence interval is then read, and each piece of data to be verified is checked against it: data within the interval is judged to be valid data, and data outside it is judged to be invalid data. The judgment result for each piece of data to be verified is recorded. The above embodiment thus improves the accuracy and reliability of the weight model.
Specifically, after the terminal obtains multiple labeled samples with sample values, it extracts the attribute data from the labeled samples and builds the weight model of the labeled samples based on that attribute data. It trains the labeled samples with the weight model to obtain the sample weight of each labeled sample, then extracts the sample value of each labeled sample, that is, the virtual resource parameter of the object described by the labeled sample, and determines the confidence interval of the multiple labeled samples based on the sample values and sample weights. The terminal then extracts the data to be verified from the virtual resource information, reads the confidence interval corresponding to the object described by the virtual resource information, and judges whether the data to be verified falls within that interval. If it does, the data to be verified is judged to be valid data, that is, the virtual resource information is determined to be legitimate information. If it does not, the data to be verified is judged to be invalid data, that is, the virtual resource information is illegitimate information. After it has been determined whether each piece of data to be verified is valid data (and/or whether the virtual resource information is legitimate information), the judgment result is recorded and saved in the memory.
Optionally, the method of the application can be used to extract virtual resource parameters on a virtual trading platform. For example, a virtual trading platform can hold the virtual resource transaction information of all kinds of trading objects (such as mobile phones and household goods) published by a large number of accounts (such as virtual merchants). The above embodiment is described in detail below, taking the virtual transaction information of mobile phone merchants on a virtual trading platform as an example:
The terminal (for example, a server) obtains multiple labeled samples from the virtual trading platform, where each labeled sample has a sample value. For example, one labeled sample is the virtual resource parameter C of trading object B of account A (such as the price 5000 of commodity B of merchant A), and this labeled sample has a sample value (the mobile phone price 5000). The attribute data of each labeled sample (such as the brand, price, and performance parameters of the mobile phone) is extracted, and the weight model is built based on that attribute data. The labeled samples are trained with the weight model to obtain the sample weight of each labeled sample. The sample value of each labeled sample is extracted, where the sample value is the virtual resource parameter (such as the mobile phone price 5000) of the object described by the labeled sample (trading object B, such as a mobile phone). The terminal (for example, a server) then collects virtual resource transaction information from the virtual trading platform and extracts the virtual resource parameter (the mobile phone price 5000) from it to obtain the data to be verified. The confidence interval corresponding to trading object B is read, for example (1000, 6000) for mobile phone transactions, and each piece of data to be verified is checked against it; that is, it is judged whether the mobile phone price 5000 lies within the interval (1000, 6000). In this embodiment the price 5000 lies within (1000, 6000), so the data to be verified (the price 5000) is judged to be valid data, and this result (5000 is valid data) is recorded. If the data to be verified (for example, a mobile phone price of 50) does not lie within (1000, 6000), it is judged to be invalid data, and this result (50 is invalid data) is recorded.
Optionally, extracting the attribute data of each labeled sample and building the weight model based on that attribute data includes: extracting the weight parameters of each labeled sample, where the weight parameters include a bonus parameter, a scoring weight, a deduction parameter, and a deduction weight. The bonus parameter describes the positive-review score obtained by the account of the labeled sample; the scoring weight describes the ratio of the account's positive-review score to the account's credit; the deduction parameter describes the penalty score received by the account; and the deduction weight describes the ratio of the number of penalized accounts on the terminal where the account is located to the number of all accounts on that terminal. The weight model is obtained from the bonus parameter, the scoring weight, the deduction parameter, and the deduction weight, where the weight model is:
$f_w(g, p) = w_1 \cdot \{ g - a \cdot w_2 \cdot p \}$, where [expression for $a$ missing in source]
$g$ is the bonus parameter; $p$ is the deduction parameter; $w_1$ is the scoring weight; $w_2$ is the deduction weight; and $\beta_1$ and $\beta_2$ are the learning parameters of the weight model.
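The weight model can be written as a one-line function. The sketch below is only an illustration: because the expression that defines the coefficient a (presumably in terms of the learning parameters β1 and β2) is not reproduced in this text, `a` is simply passed in as a number, which is an assumption. The function name `weight_model` is also hypothetical.

```python
def weight_model(g: float, p: float, w1: float, w2: float, a: float) -> float:
    """Evaluate f_w(g, p) = w1 * (g - a * w2 * p).

    g  : bonus parameter (positive-review score of the account)
    p  : deduction parameter (penalty score of the account)
    w1 : scoring weight
    w2 : deduction weight
    a  : coefficient whose defining expression is missing from the text;
         supplied directly here as an assumption
    """
    return w1 * (g - a * w2 * p)


# Illustrative numbers only: a high review score with a small penalty
# yields a larger raw weight than the same score with a large penalty.
low_penalty = weight_model(g=10.0, p=2.0, w1=0.5, w2=0.1, a=1.0)
high_penalty = weight_model(g=10.0, p=50.0, w1=0.5, w2=0.1, a=1.0)
```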
In the above embodiment, the weight model is built from the attribute data extracted from each labeled sample, and the labeled samples can be trained with this model to obtain the sample weights. Learning the weight model automatically is also called training the model; as with other machine learning models, an iterative optimization method based on gradient descent can be used. The above embodiment uses a supervised machine learning algorithm, the BP (backpropagation) algorithm, which learns from the labeled samples and adjusts each parameter to fit the sample results. The whole training process comprises two passes: a forward pass and a backward pass of the error. To minimize the residual between the output of the weight model and the labeled values of the samples, the gradient of the residual with respect to the parameters of the whole weight model can be computed, and the model parameters are then optimized iteratively by stochastic gradient descent. While preserving the accuracy of learning the virtual resource parameters, this optimization method improves the efficiency of training. The training steps are shown in Fig. 3:
Step S301: Initialize the weight parameters of the weight model.
Specifically, the weight parameters of each labeled sample are extracted, where the weight parameters include a bonus parameter, a scoring weight, a deduction parameter, and a deduction weight. The bonus parameter describes the positive-review score obtained by the account of the labeled sample; the scoring weight describes the ratio of the account's positive-review score to the account's credit; the deduction parameter describes the penalty score received by the account; and the deduction weight describes the ratio of the number of penalized accounts on the terminal where the account is located to the number of all accounts on that terminal.
Step S302: Update the parameters of the weight model.
Step S303: Calculate the sample weights of the labeled samples.
The weight model is obtained from the bonus parameter, the scoring weight, the deduction parameter, and the deduction weight, where the weight model can be:
$f_w(g, p) = w_1 \cdot \{ g - a \cdot w_2 \cdot p \}$, where [expression for $a$ missing in source]
$g$ is the bonus parameter; $p$ is the deduction parameter; $w_1$ is the scoring weight; $w_2$ is the deduction weight; and $\beta_1$ and $\beta_2$ are the learning parameters of the weight model.
Step S304: Determine the data confidence interval.
Specifically, the labeled samples are trained with the weight model to obtain the sample weight of each labeled sample, and the confidence interval of the multiple labeled samples is determined based on the sample value and the sample weight of each labeled sample.
Step S305: Calculate the residual.
Specifically, the difference between the weight parameters in the weight model and the sample values (the residual) is calculated, so that the gradient of the residual with respect to the parameters of the whole weight model can be computed.
Step S306: Judge whether the weight model has converged. If it has converged, the training process ends. If it has not converged, step S307 is executed and the process returns to step S302.
Step S307: Calculate the gradient.
The iterative optimization method of gradient descent is applied, and the process returns to the parameter update of the weight model.
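The loop of steps S301 to S307 can be sketched as follows. This is a heavily simplified illustration under several assumptions that are not in the text: the only trainable quantity is the coefficient a of the weight model (the expression relating it to β1 and β2 is missing), each labeled sample carries a hypothetical target value that the model output should match, convergence is judged by the size of the gradient, and plain batch gradient descent is used in place of backpropagation with stochastic gradient descent. The learning rate 0.01 is the value mentioned in the text.

```python
from typing import List, Tuple

# (g, p, w1, w2, target): attribute data plus a hypothetical labeled target.
Sample = Tuple[float, float, float, float, float]


def model(g: float, p: float, w1: float, w2: float, a: float) -> float:
    """Weight model f_w(g, p) = w1 * (g - a * w2 * p)."""
    return w1 * (g - a * w2 * p)


def train(samples: List[Sample], learning_rate: float = 0.01,
          tol: float = 1e-6, max_iter: int = 10_000) -> float:
    a = 0.0  # S301: initialise the trainable parameter
    for _ in range(max_iter):
        # S303 / S305: model outputs and residuals against the labels
        residuals = [model(g, p, w1, w2, a) - t for g, p, w1, w2, t in samples]
        # S307: gradient of the mean squared residual with respect to a,
        # using d(model)/da = -w1 * w2 * p
        grad = sum(2.0 * r * (-w1 * w2 * p)
                   for r, (g, p, w1, w2, t) in zip(residuals, samples)) / len(samples)
        if abs(grad) < tol:  # S306: convergence check
            break
        a -= learning_rate * grad  # S302: parameter update
    return a
```

With targets generated from a known coefficient, the loop recovers that coefficient, which is a quick way to check that the update direction and the gradient agree.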
During training, when the weight parameters are initialized, the gradient descent learning rate can be 0.01, and the training parameters $\beta_1$ and $\beta_2$ can take values in [0, 100].
The initialization of the virtual resource parameters is important in training. Before the sample weights output by the sample weight model are used to train the data confidence interval estimation model, the sample weights need to be normalized, that is, scaled to a unified interval.
Optionally, training the labeled samples with the weight model to obtain the sample weight of each labeled sample includes: training the labeled samples with the weight model $f_w(g, p)$ to determine the learning parameters $\beta_1$ and $\beta_2$ of the weight model; calculating the weight of each labeled sample with the weight model $f_w(g, p)$ whose learning parameters have been determined; and normalizing the weight of each labeled sample to obtain the sample weights of the labeled samples.
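The normalization step can be sketched as below. The text only says that the weights are scaled to a unified interval, so the choice of min-max scaling to [0, 1] is an assumption, as is the function name `normalize_weights`.

```python
from typing import List


def normalize_weights(raw_weights: List[float]) -> List[float]:
    """Scale raw model outputs to [0, 1] by min-max normalization.

    Assumption: min-max scaling is one possible reading of "scaled to a
    unified interval"; the text does not name the scheme.
    """
    lowest, highest = min(raw_weights), max(raw_weights)
    if highest == lowest:
        # All samples equally credible: give every sample full weight.
        return [1.0 for _ in raw_weights]
    return [(w - lowest) / (highest - lowest) for w in raw_weights]
```

One consequence of min-max scaling is that the least credible sample receives weight 0 and drops out of any weighted statistic. Dividing each weight by the sum of all weights is an alternative that keeps every sample in play.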
Optionally, determining the confidence interval of the multiple labeled samples based on the sample value and the sample weight of each labeled sample includes: building an interval determination model for the labeled samples using a Gaussian distribution; and obtaining the confidence interval corresponding to the interval determination model.
Specifically, a Gaussian distribution can be used to build the interval determination model $f(x)$ for the labeled samples:
$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$,
where $x$ is the sample value of a labeled sample; $\mu$ is the mean of $f(x)$ [expression missing in source], in which $w_i$ is the sample weight of the $i$-th labeled sample, $x_i$ is the sample value of the $i$-th labeled sample, $n$ is the total number of labeled samples, and $i$ is a natural number; and $\sigma$ is the standard deviation of $f(x)$ [expression missing in source]. The confidence interval region corresponding to the interval determination model $f(x)$ is then obtained as $\mathrm{region} = [\mu - k \cdot \sigma,\ \mu + k \cdot \sigma]$, where $k$ is a constant.
In the above embodiment, analysis of the data shows that the distribution of the labeled samples follows a Gaussian distribution, so a Gaussian distribution can be used to build the confidence interval estimation model for the labeled samples and obtain the corresponding confidence interval. According to statistical theory, different kinds of data follow different distributions and call for different data models, and only a suitable data model fits the regularity of the data well. The above embodiment therefore uses a model that matches the distribution of the labeled samples, so that a more accurate confidence interval can be obtained from it.
Specifically, the parameter k in the above embodiment specifies the actual range of the confidence interval: data to be verified that falls within the confidence interval is legal (or normal) data of the category, and data to be verified that falls outside it is illegal (abnormal) data. The constant k is generally about 3.0.
Optionally, obtaining the multiple marked samples includes: obtaining sample data and, if the sample data carries no positive/negative example marks, marking the sample data as positive or negative examples to obtain marked sample data; and performing data filtering and normalization on the marked sample data to obtain the marked samples.
In the above embodiment, sample data is obtained and only the data that carries no positive/negative example marks is marked. This avoids marking already-marked data again, which speeds up the marking of the sample data and improves data processing efficiency. The marked sample data still contains a large amount of noise (abnormal data). Such abnormal data can prevent the parameters from converging during training and so harms the accuracy and robustness of the weight model, which is why the data needs to be cleaned according to its own distribution. Data filtering is performed on the marked sample data to reject abnormal data points and reduce noise. This keeps abnormal data from disturbing the weight model of the samples, lets the sample data satisfy the convergence condition and follow its own distribution, and improves accuracy and generalization ability. The filtered sample data is then normalized to obtain the marked samples, so that the weight model of the sample data converges faster and the resulting marked samples have higher accuracy.
Specifically, the quality of the collection and marking of the sample data directly determines the accuracy and reliability of the sample weights obtained by training, so this is a very important basic step. The sample data consists of two parts: a real-time data part and a historical audit data part. The historical audit data is data that has already been judged manually and managed by class; it includes data of the normal category (positive example sample data) and data not of the category (negative example sample data). The real-time data part (data extracted in real time) is marked as needed: data of an abnormal category is marked as non-category data (negative example sample data), and normal data is marked as normal data (positive example sample data). During training, the ratio of negative example sample data to positive example sample data is kept at 1:7.
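Keeping the 1:7 ratio can be sketched as a simple subsampling step. The sampling strategy (uniform random sampling without replacement, with a fixed seed) is an assumption, since the text only states the target ratio; the function name is hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def balance_samples(
    positives: Sequence[T],
    negatives: Sequence[T],
    pos_per_neg: int = 7,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Subsample so that negatives : positives is exactly 1 : pos_per_neg."""
    rng = random.Random(seed)
    # Use as many negatives as the available positives can support.
    n_neg = min(len(negatives), len(positives) // pos_per_neg)
    n_pos = n_neg * pos_per_neg
    return rng.sample(list(positives), n_pos), rng.sample(list(negatives), n_neg)
```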
During data filtering and preprocessing, the collected sample data contains a large amount of noise, and such abnormal sample data can prevent the parameters from converging during training, which harms the accuracy and robustness of the model. Abnormal data therefore needs to be cleaned out according to the distribution of the sample data itself. Whether a sample datum is data noise is judged from the value of the rejection region computed by the RejectionRegion formula, in which tα/2 is a critical value of the Student's t-distribution and n is the sample size.
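The RejectionRegion formula itself is not reproduced in this text. One well-known threshold that is built from exactly these two quantities, a Student's t critical value and the sample size n, is the modified Thompson tau. The sketch below shows that threshold purely as an assumed illustration, not as the formula of this application. The critical value is supplied by the caller (for example from a t-table), and the function name is hypothetical.

```python
import math


def rejection_region(t_crit: float, n: int) -> float:
    """Modified Thompson tau threshold (assumed stand-in for RejectionRegion).

    t_crit: critical value t_{alpha/2} of the Student's t-distribution,
            looked up by the caller for the chosen alpha and degrees of freedom.
    n:      sample size, at least 3.
    """
    if n < 3:
        raise ValueError("n must be at least 3")
    return t_crit * (n - 1) / (math.sqrt(n) * math.sqrt(n - 2 + t_crit ** 2))
```

Under this assumption the threshold is recomputed whenever a point is rejected, because n shrinks by one.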
When the judgment value of a sample datum is greater than the value of the rejection region, the sample datum is regarded as data noise and the sample data is cleaned. The cleaning steps are as follows:
First, the sample data is filtered. The sample data (such as commodity prices) is sorted by size, and the mean and standard deviation of the data are calculated. Each data point is then checked for abnormality. If its judgment value is greater than the rejection region, the data point is rejected, the process returns to the sorting step, and the next data point is judged; if its judgment value is not greater than the rejection region, the filtering process stops.
Specifically, before the confidence interval estimation model that learns the weights (sample weights) automatically is trained, the data needs to be normalized so that the model converges faster and reaches higher accuracy. Normalization mainly includes two steps: mean removal and standard-deviation normalization. Mean removal uses the following formula:
$x_{new} = x - \frac{1}{n}\sum_{i=1}^{n} x_i$
where xnew is the newly obtained sample datum after normalization, n is the total number of data x, xi is the i-th sample datum, and x is the sample value of the marked sample datum.
Standard-deviation normalization is then applied to the data using the standard deviation s.
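The two normalization steps can be sketched as follows. The sketch assumes that standard-deviation normalization means dividing the mean-removed data by s, and that s is the sample standard deviation (divisor n - 1); neither detail is fixed by the text. The function name is hypothetical.

```python
import math
from typing import List, Sequence


def normalize(values: Sequence[float]) -> List[float]:
    """Remove the mean, then divide by the standard deviation s."""
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are required")
    mean = sum(values) / n
    centered = [x - mean for x in values]  # mean removal
    s = math.sqrt(sum(c * c for c in centered) / (n - 1))  # assumed sample std
    if s == 0.0:
        return centered  # all values identical: nothing to scale
    return [c / s for c in centered]
```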
Optionally, performing data filtering on the marked sample data may include: obtaining a sequence by sorting the marked sample data by sample value, and calculating the mean and standard deviation of the marked sample data in the sequence; obtaining, in order, the judgment value δ of the marked sample data in the sequence, where the judgment value is determined from the standard deviation of the marked sample data, the sample value of the marked sample datum, and the average of the sample values of the multiple marked sample data; and, if the judgment value is greater than the rejection region obtained in advance, rejecting the marked sample datum corresponding to the judgment value and returning to the step of obtaining the sorted sequence and calculating the mean and standard deviation of the marked sample data in the sequence, until the judgment value is not greater than the rejection region.
Specifically, performing data filtering on the marked sample data may include: obtaining a sequence by sorting the marked sample data by sample value, and calculating the mean and standard deviation of the marked sample data in the sequence; obtaining, in order, the judgment value δ of the marked sample data in the sequence, where δ is determined from s, x and mean(X), in which s is the standard deviation of the multiple marked sample data, x is the sample value of the marked sample datum, and mean(X) is the average of the sample values of the multiple marked sample data; and, if the judgment value is greater than the rejection region obtained in advance, rejecting the marked sample datum corresponding to the judgment value and returning to the step of obtaining the sorted sequence and calculating the mean and standard deviation of the marked sample data in the sequence, until the judgment value is not greater than the rejection region.
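The filtering loop described above can be sketched as follows. The sketch assumes δ = |x - mean(X)| / s (the formula itself is not reproduced in this text), tests whichever end of the sorted sequence lies furthest from the mean, uses the sample standard deviation, and takes the rejection region as a fixed number supplied by the caller. The function name is hypothetical.

```python
import math
from typing import List, Sequence


def filter_outliers(values: Sequence[float], rejection_region: float) -> List[float]:
    """Iteratively reject the most extreme point while its judgment value is too large."""
    data = sorted(values)  # sort the sample values
    while len(data) > 2:
        n = len(data)
        mean = sum(data) / n
        s = math.sqrt(sum((x - mean) ** 2 for x in data) / (n - 1))
        if s == 0.0:
            break  # all remaining values are identical
        # Candidate: the end of the sorted sequence furthest from the mean.
        candidate = max((data[0], data[-1]), key=lambda x: abs(x - mean))
        delta = abs(candidate - mean) / s  # assumed form of the judgment value
        if delta > rejection_region:
            data.remove(candidate)  # reject and recompute mean and std
        else:
            break  # nothing left to reject: stop filtering
    return data
```

For example, with prices `[10, 11, 9, 10, 12, 100]` and a rejection region of 1.5, the value 100 is rejected in the first pass and the loop stops in the second.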
Optionally, after the second judgment result is obtained, the method includes: performing an accuracy check on the second judgment result to obtain the accuracy of the second judgment result; if the accuracy is lower than a preset threshold, adjusting the sample weights of the marked samples; and re-determining the confidence interval based on the adjusted sample weights.
Specifically, normal data (positive example sample data) and abnormal data (negative example sample data) are mixed together in the data. If normal data must not be misjudged as abnormal data, the misjudgment rate has to be kept within a low range. The algorithm proposed in this application cannot fully meet such a strict accuracy requirement on its own, so manual evaluation is introduced to fine-tune the model parameters so that the model works in its optimal state. This can be done with the receiver operating characteristic (ROC) curve output by the model: the false positive rate (FP) is kept below 1%, so that the accuracy of the model approaches 100%. In this way, manual evaluation and fine-tuning are needed only for the data of a small number of categories.
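One way to picture this fine-tuning is a sweep over the interval constant k on data already known to be normal, keeping the smallest k whose misjudgment rate stays within 1%. The sketch below is a simplified stand-in for the ROC-based adjustment and not the procedure of this application; the function names, the candidate values of k and the use of k as the tuned parameter are all assumptions.

```python
from typing import Iterable, Optional, Sequence


def false_positive_rate(
    normal_values: Sequence[float], mu: float, sigma: float, k: float
) -> float:
    """Share of known-normal values that fall outside [mu - k*sigma, mu + k*sigma]."""
    if not normal_values:
        raise ValueError("normal_values must not be empty")
    outside = sum(1 for x in normal_values if abs(x - mu) > k * sigma)
    return outside / len(normal_values)


def smallest_k(
    normal_values: Sequence[float],
    mu: float,
    sigma: float,
    candidates: Iterable[float],
    max_fp: float = 0.01,
) -> Optional[float]:
    """Return the smallest candidate k whose false positive rate is at most max_fp."""
    for k in sorted(candidates):
        if false_positive_rate(normal_values, mu, sigma, k) <= max_fp:
            return k
    return None  # no candidate meets the target
```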
In a real data environment, the confidence interval of the data fluctuates continually with the environment because of external factors such as the season, so a model update mechanism needs to be set up. The model is updated every seven days, which keeps the estimated data confidence interval realistic and timely in that environment, and the online discrimination model is then updated.
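The seven-day update rule can be sketched as a simple staleness check. How the update is actually scheduled is not described in the text, so the function below and its name are assumptions.

```python
from datetime import datetime, timedelta

UPDATE_PERIOD = timedelta(days=7)  # update period stated in the text


def needs_update(
    last_trained: datetime, now: datetime, period: timedelta = UPDATE_PERIOD
) -> bool:
    """Return True when the model is at least one update period old."""
    return now - last_trained >= period
```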
During model training, the quality of the marked data and of the manual evaluation is particularly important for the accuracy and recall of the final discrimination model. To show the advantage of this invention in discriminating normal data, the method of this embodiment is compared with the traditional method, and the results are shown in Table 1 below.
Table 1
Algorithm | Accuracy | Recall | Manual review volume |
Traditional method | 79% | 7.6% | 100% |
The method of the present embodiment | 96.5% | 7.5% | 1.1% |
The data in Table 1 was computed on commodities from multiple Taobao categories. As Table 1 shows, the method of this application is clearly better than the traditional method: with almost no loss of recall, accuracy is markedly improved and the manual review volume is greatly reduced.
In the embodiment shown in Fig. 4, data can be detected with the data confidence interval estimation method based on the weight model. The steps of this flow are as follows:
Step S401: Collect the sample data of each category and mark it as positive or negative examples. The positive/negative example marking divides the data to be verified of each category into two classes: data of the category (positive example sample data) and data not of the category (negative example sample data).
Step S402: Filter and preprocess the marked samples and reject abnormal data points.
Specifically, this step reduces data noise, keeps abnormal data from interfering with model learning, and improves accuracy and generalization ability. It includes rejecting abnormal data and normalizing the data.
Step S403: Extract the weight parameters of each sample datum and build the weight model based on the sample data. The weight parameters (i.e. the credibility features) include: the bonus-point parameter (such as the bonus score and the number of positive ratings), the scoring weight (such as the ratio of the positive-rating score to the credit score), the deduction parameter (such as the deduction score, the number of negative ratings, the penalty score and the warning score), and the deduction weight (such as the ratio of the number of punished accounts to the number of all accounts).
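The weight parameters of step S403 can be pictured as a small record computed per account. The field names below are hypothetical, and each parameter is reduced to a single representative quantity (for instance, the bonus-point parameter is taken to be the positive-rating score alone), which is a simplifying assumption.

```python
from dataclasses import dataclass


@dataclass
class AccountStats:
    positive_score: float   # positive-rating score earned by the account
    credit_score: float     # credit score of the account
    deduction_score: float  # deduction score received by the account
    punished_accounts: int  # number of punished accounts
    total_accounts: int     # number of all accounts


@dataclass
class WeightParams:
    g: float   # bonus-point parameter
    p: float   # deduction parameter
    w1: float  # scoring weight: positive-rating score / credit score
    w2: float  # deduction weight: punished accounts / all accounts


def extract_weight_params(stats: AccountStats) -> WeightParams:
    """Derive the four weight parameters; zero denominators yield a weight of 0."""
    w1 = stats.positive_score / stats.credit_score if stats.credit_score else 0.0
    w2 = stats.punished_accounts / stats.total_accounts if stats.total_accounts else 0.0
    return WeightParams(g=stats.positive_score, p=stats.deduction_score, w1=w1, w2=w2)
```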
Step S404: Train the marked samples with the weight model to obtain the sample weights, and read the confidence interval determined from the sample weights of the marked samples. For example, the data output by the weight model is fed into the data confidence interval model, which outputs the confidence interval of the sample data.
Step S405: Perform an accuracy check on the judgment result to obtain its accuracy; if the accuracy is lower than the preset threshold, adjust the sample weights.
Specifically, managing data of all categories requires handling the data not of the category (negative example sample data), but normal data (valid data) must not be treated as abnormal data (invalid data). The parameters are therefore fine-tuned through manual evaluation to bring the model to its optimum. An ROC curve can be obtained from the trained weight model, and the misjudgment rate (false positive, FP) can be kept below 1%, so that the accuracy of the model approaches 100% and less than 1% of the data is left for manual review. This improves the accuracy of the model and reduces the review workload.
Step S406: Re-determine the confidence interval based on the adjusted sample weights.
Specifically, a scheduled model update mechanism can be set up to update the data confidence interval estimation model periodically.
In the scheme of the above embodiment, massive data (the data to be verified) is processed with big data mining techniques, and a model (the weight model) is learned from the big data with machine learning techniques and used to build the data confidence interval estimation model (the confidence interval). The weight-model-based learning mechanism from the machine learning field is introduced into the data confidence interval estimation model to learn the sample weights automatically from the data (the marked samples), and the confidence interval based on the weight model (i.e. the confidence interval estimation model) is obtained through training. Against the background of Internet big data, big data learning mechanisms have been applied to data warehousing, speech processing and natural language processing with remarkable results. Compared with traditional data confidence interval estimation methods, the above embodiment of this application introduces credibility features (the weight parameters of the sample data) and lets the weight-model-based mechanism automatically learn the experience of screening high-quality training data. It mines the credibility of the data in depth, reduces the adverse effect of untrustworthy data (negative example sample data) on the confidence interval estimation model, and retains the positive effect of highly credible data (positive example sample data) on that model. The model is not affected by changes in low-credibility sample data, which ensures the accuracy, reliability and stability of the data confidence interval estimation model. In addition, because a learning mechanism is used, the method proposed in this application can provide personalized service and avoids a large amount of repetitive parameter tuning.
In the method of detecting data proposed in this application, manual data screening can be combined with the traditional method to screen and adjust the sample data. Specifically, the data is screened manually and high-quality data is retained; the variance and mean of the data are calculated; a confidence interval range is set and the fluctuation of the data is observed; and if a deviation is found, the data set is adjusted repeatedly to ensure its validity.
Optionally, to select data of higher quality, strict data screening conditions are set in the data preprocessing stage and the accuracy is evaluated; the screening conditions are then adjusted and the evaluation is repeated. This process is repeated until the evaluation result is below the set threshold. Strict data preprocessing is thus combined with the traditional method to improve the quality of the sample data.
The parameters of the data model are initialized with the method of detecting data provided by this application. When new data is added, its quality is evaluated; if it qualifies, it is added to the data set and the estimated confidence interval is output. The effect is then evaluated, and the data is rejected if it does not qualify. This process is repeated until the evaluation result is below the set threshold. The traditional method is then used to estimate the confidence interval of the data.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. Those skilled in the art should understand, however, that this application is not limited by the described order of actions, because according to this application some steps may be performed in another order or simultaneously. Those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by this application.
From the description of the above embodiments, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although the former is the better implementation in many cases. Based on this understanding, the technical solution of this application, in essence or in the part that contributes over the prior art, can be embodied as a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to execute the methods described in the embodiments of this application.
Embodiment 2
According to an embodiment of this application, a device for implementing the above method of detecting data is also provided. As shown in Fig. 5, the device includes: a reading module 30, a first judgment module 40 and a second judgment module 50.
The reading module 30 is configured to read the confidence interval determined based on the sample weights of the marked samples, where the sample weights are obtained by training the previously obtained marked samples with the weight model.
The first judgment module 40 is configured to judge whether the data to be verified is within the confidence interval, obtaining a first judgment result.
The second judgment module 50 is configured to judge, according to the first judgment result, whether the data to be verified is valid data, obtaining a second judgment result.
The second judgment module 50 may be configured to judge that the data to be verified is valid data if it is within the confidence interval, and that it is invalid data if it is not within the confidence interval.
Optionally, the device may further include a recording module, configured to record, after the second judgment module 50 obtains the second judgment result, the second judgment result indicating whether each datum to be verified is valid data or invalid data.
In the device disclosed in the above embodiment of this application, the reading module reads the confidence interval that is determined from the sample weights obtained by training. When the confidence interval is determined, the sample weights distinguish the importance of the marked samples; that is, the positive effect of the more credible data in the marked samples on the confidence interval is increased, and the disturbance caused by the untrustworthy data in the marked samples is reduced, so that the confidence interval is close to the true confidence interval of the data. The first judgment module and the second judgment module then use the confidence interval to classify all kinds of data to be verified automatically, which ensures the accuracy, stability and reliability of the estimate. This application thus solves the problem of low accuracy of check results when the legitimacy of data is verified, and achieves the effect of verifying the legitimacy of data accurately.
Specifically, after the terminal on which the reading module resides collects virtual resource information, the data to be verified is extracted from the virtual resource information. The reading module reads the confidence interval corresponding to the object described by the virtual resource information, and the first judgment module judges whether the data to be verified is within that confidence interval. If the second judgment module finds that the data to be verified is within the confidence interval, it judges the data to be valid data, i.e. the virtual resource information is legitimate information; if the data to be verified is not within the confidence interval, it judges the data to be invalid data, i.e. the virtual resource information is illegitimate information. After it has been determined whether each datum to be verified is valid data (and/or whether the virtual resource information is legitimate information), the recording module records the judgment result and saves it in the memory.
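The cooperation of the reading, judgment and recording modules can be sketched in a few lines of Python. The dictionary of per-category intervals, the record layout and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Sequence, Tuple

Interval = Tuple[float, float]
Record = Tuple[str, float, bool]


def is_valid(value: float, category: str, intervals: Dict[str, Interval]) -> bool:
    """Read the interval of the category and judge whether the value lies inside it."""
    low, high = intervals[category]  # reading module
    inside = low <= value <= high    # first judgment: inside the interval?
    return inside                    # second judgment: valid exactly when inside


def detect_all(
    items: Sequence[Tuple[str, float]], intervals: Dict[str, Interval]
) -> List[Record]:
    """Judge every (category, value) pair and record the result (recording module)."""
    return [(cat, val, is_valid(val, cat, intervals)) for cat, val in items]
```

For instance, with an interval of (500.0, 6000.0) stored for a hypothetical category "phone", a price of 2999.0 would be recorded as valid and a price of 1.0 as invalid.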
Optionally, as shown in Fig. 6, the device may include: a sample acquisition module 21, a first extraction module 22, a training module 23, a second extraction module 24 and a first determination module 25.
The sample acquisition module 21 is configured to obtain multiple marked samples before the confidence interval determined based on the sample weights of the marked samples is read, where each marked sample has a sample value.
The first extraction module 22 is configured to extract the attribute data of each marked sample and build the weight model based on the attribute data of each marked sample.
The training module 23 is configured to train the marked samples with the weight model to obtain the sample weight of each marked sample.
The second extraction module 24 is configured to extract the sample value of each marked sample, where the sample value characterizes the virtual resource parameter corresponding to the object described by the marked sample.
The first determination module 25 is configured to determine the confidence interval of the multiple marked samples based on the sample value and the sample weight of each marked sample.
In the device described in the above embodiment, before the reading module reads the confidence interval determined based on the sample weights of the marked samples, the first extraction module can build the weight model from the attribute data of each acquired marked sample. Because the attribute data of a marked sample (such as the quality of the marked sample itself) can serve as a measure of its credibility within the data distribution, building the weight model on it makes fundamental use of this property of the marked samples and so improves the accuracy and reliability of the weight model. Once the first extraction module has built an accurate weight model, the training module can train the marked samples with it to obtain highly accurate sample weights, and an accurate confidence interval can be determined from these sample weights together with the sample value of each sample extracted by the second extraction module, which improves the accuracy and reliability of the confidence interval of the marked samples. The confidence interval is then read, and the first judgment module and the second judgment module judge whether each datum to be verified is within it: if so, the datum is judged to be valid data; if not, it is judged to be invalid data. The recording module records the judgment result indicating whether each datum to be verified is valid data or invalid data. The above embodiment of this application thus improves the accuracy and reliability of the weight model.
Optionally, as shown in Fig. 7, the first extraction module 22 includes: a first extraction submodule 221 and a model acquisition module 223.
The first extraction submodule 221 is configured to extract the weight parameters of each marked sample, where the weight parameters include a bonus-point parameter, a scoring weight, a deduction parameter and a deduction weight. The bonus-point parameter describes the positive-rating score obtained by the account of the marked sample; the scoring weight describes the ratio of the account's positive-rating score to the account's credit score; the deduction parameter describes the deduction score obtained by the account; and the deduction weight describes the ratio of the number of punished accounts on the terminal where the account resides to the number of all accounts on that terminal.
The model acquisition module 223 is configured to obtain the weight model based on the bonus-point parameter, the scoring weight, the deduction parameter and the deduction weight, where the weight model is fw(g, p) = w1 * (g - a * w2 * p), in which g is the bonus-point parameter; p is the deduction parameter; w1 is the scoring weight; w2 is the deduction weight; and β1 and β2 are the learning parameters of the weight model.
Optionally, the training module includes: a second determination module, a third determination module and a processing module.
The second determination module is configured to train the marked samples with the weight model fw(g, p) to determine the learning parameters β1 and β2 of the weight model; the third determination module is configured to calculate the weight of each marked sample with the weight model fw(g, p) whose learning parameters have been determined; and the processing module is configured to normalize the weight of each marked sample to obtain the sample weights of the marked samples.
Optionally, the first determination module includes: a building module and an interval acquisition module.
The building module is configured to build an interval determination model for the marked samples using a Gaussian distribution.
The interval acquisition module is configured to obtain the confidence interval corresponding to the interval determination model.
Specifically, the building module may be configured to build the interval determination model f(x) for the marked samples using a Gaussian distribution, $f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, where x is the sample value of a marked sample; μ is the mean of f(x), computed from the sample weights and sample values, in which wi is the sample weight of the i-th marked sample, xi is the sample value of the i-th marked sample, n is the total number of marked samples and i is a natural number; and σ is the standard deviation of f(x).
The interval acquisition module may be configured to obtain the confidence interval region corresponding to the interval determination model f(x), region = [μ - k*σ, μ + k*σ], where k is a constant.
The above embodiment adopts a weight model that matches the regularity of the marked samples, so that a more accurate confidence interval can be obtained from this model.
Optionally, the sample acquisition module for obtaining the multiple marked samples includes: a data acquisition module and a preprocessing module.
The data acquisition module is configured to obtain sample data and, if the sample data carries no positive/negative example marks, mark the sample data as positive or negative examples to obtain marked sample data; the preprocessing module is configured to perform data filtering and normalization on the marked sample data to obtain the marked samples.
In the above embodiment, sample data is obtained and only the data that carries no positive/negative example marks is marked. This avoids marking already-marked data again, which speeds up the marking of the sample data and improves data processing efficiency. The marked sample data obtained by the data acquisition module still contains a large amount of noise (abnormal data). Such abnormal data can prevent the parameters from converging during training and so harms the accuracy and robustness of the weight model, which is why the preprocessing module needs to clean the data according to its own distribution. It performs data filtering on the marked sample data to reject abnormal data points and reduce noise. This keeps abnormal data from disturbing the weight model of the samples, lets the sample data satisfy the convergence condition and follow its own distribution, and improves accuracy and generalization ability. The preprocessing module then normalizes the filtered sample data to obtain the marked samples, so that the weight model of the sample data converges faster and the resulting marked samples have higher accuracy.
Optionally, the preprocessing module includes: a first processing module, a judgment value acquisition module and a second processing module.
The first processing module is configured to obtain a sequence by sorting the marked sample data by sample value, and to calculate the mean and standard deviation of the marked sample data in the sequence.
The judgment value acquisition module is configured to obtain, in order, the judgment value δ of the marked sample data in the sequence, where the judgment value is determined from the standard deviation of the marked sample data, the sample value of the marked sample datum, and the average of the sample values of the multiple marked sample data.
The second processing module is configured to, if the judgment value is greater than the rejection region obtained in advance, reject the marked sample datum corresponding to the judgment value and return to the step of obtaining the sorted sequence and calculating the mean and standard deviation of the marked sample data in the sequence, until the judgment value is not greater than the rejection region.
Specifically, the first processing module in the above embodiment may be configured to obtain a sequence by sorting the marked sample data by sample value, and to calculate the mean and standard deviation of the marked sample data in the sequence. The judgment value acquisition module is configured to obtain, in order, the judgment value δ of the marked sample data in the sequence, where δ is determined from s, x and mean(X), in which s is the standard deviation of the multiple marked sample data, x is the sample value of the marked sample datum, and mean(X) is the average of the sample values of the multiple marked sample data. The second processing module is configured to, if the judgment value is greater than the rejection region obtained in advance, reject the marked sample datum corresponding to the judgment value and return to the step of obtaining the sorted sequence and calculating the mean and standard deviation of the marked sample data in the sequence, until the judgment value is not greater than the rejection region.
Optionally, the device includes: a check acquisition module, an adjustment module and a re-determination module.
The check acquisition module is configured to perform, after the second judgment result is obtained, an accuracy check on the second judgment result to obtain the accuracy of the second judgment result.
The adjustment module is configured to adjust the sample weights of the marked samples if the accuracy is lower than the preset threshold.
The re-determination module is configured to re-determine the confidence interval based on the adjusted sample weights.
Specifically, the check acquisition module is configured to perform an accuracy check on the judgment result to obtain the accuracy of the judgment result.
The device in the above embodiment processes massive data (the data to be verified) with big data mining techniques, and learns a model (the weight model) from the big data with machine learning techniques in order to build the data confidence interval estimation model (the confidence interval). The weight-model-based learning mechanism from the machine learning field is introduced into the data confidence interval estimation model to learn the sample weights automatically from the data (the marked samples), and the confidence interval based on the weight model (i.e. the confidence interval estimation model) is obtained through training. Against the background of Internet big data, big data learning mechanisms have been applied to data warehousing, speech processing and natural language processing with remarkable results. Compared with traditional data confidence interval estimation methods, the above embodiment of this application introduces credibility features (the weight parameters of the sample data) and lets the weight-model-based mechanism automatically learn the experience of screening high-quality training data. It mines the credibility of the data in depth, reduces the adverse effect of untrustworthy data (negative example sample data) on the confidence interval estimation model, and retains the positive effect of highly credible data (positive example sample data) on that model. The model is not affected by changes in low-credibility sample data, which ensures the accuracy, reliability and stability of the data confidence interval estimation model. In addition, because a learning mechanism is used, the method proposed in this application can provide personalized service and avoids a large amount of repetitive parameter tuning.
Embodiment 3
Embodiments of the present application may provide a computer terminal, which may be any computer
terminal in a computer terminal group. Optionally, in this embodiment, the computer terminal may also
be replaced with a terminal device such as a mobile terminal.
Optionally, in this embodiment, as shown in Fig. 8, the computer terminal may be located in at least
one network device 101 among multiple network devices of a computer network, and the network device
101 may be connected to other network devices 103 via the network.
Optionally, the computer terminal A in the embodiment shown in Fig. 8 may include one or more
processors (only one is shown in the figure), a memory, and a transmission device.
The memory may be used to store software programs and modules, such as the program
instructions/modules corresponding to the method and device for detecting data in the embodiments of
the present application. The processor runs the software programs and modules stored in the memory to
execute various functional applications and data processing, that is, to implement the above method
for detecting data. The memory may include high-speed random access memory and may also include
non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile
solid-state memory. In some examples, the memory may further include memory located remotely from the
processor, and such remote memory may be connected to terminal A via a network. Examples of such
networks include, but are not limited to, the Internet, intranets, local area networks, mobile
communication networks, and combinations thereof.
The processor may call, via the transmission device, the information and application programs stored
in the memory to perform the following steps: reading a confidence interval determined based on sample
weights of labeled samples, wherein the sample weights are obtained by training pre-acquired labeled
samples with a weight model; judging whether the data to be verified is within the confidence interval
to obtain a first judgment result; and judging, according to the first judgment result, whether the
data to be verified is valid data to obtain a second judgment result.
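The three steps above can be illustrated with a minimal Python sketch. The names `ConfidenceInterval`
and `check_data` are hypothetical, and the rule that a value is valid exactly when it lies inside the
interval is an assumption made for illustration; the text only states that the second judgment is
derived from the first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfidenceInterval:
    """Hypothetical container for a previously determined confidence interval."""
    lower: float
    upper: float

    def contains(self, value: float) -> bool:
        return self.lower <= value <= self.upper


def check_data(value: float, interval: ConfidenceInterval) -> bool:
    # First judgment: is the data to be verified inside the confidence interval?
    inside = interval.contains(value)
    # Second judgment (assumption): the data is treated as valid exactly when it is inside.
    return inside


# Example: a phone price checked against an interval read from storage.
interval = ConfidenceInterval(lower=1000.0, upper=6000.0)
print(check_data(3500.0, interval))
print(check_data(9.9, interval))
```

In practice the mapping from the first judgment to the second could include further rules; the sketch
keeps only the simplest case.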
Optionally, the processor may also perform the following steps: before reading the confidence interval
determined based on the sample weights of the labeled samples, obtaining multiple labeled samples,
wherein each labeled sample has a sample value; extracting the attribute data of each labeled sample
and establishing a weight model based on the attribute data of each labeled sample; training the
labeled samples with the weight model to obtain the sample weight of each labeled sample; extracting
the sample value of each labeled sample, wherein the sample value characterizes the virtual resource
parameter corresponding to the object described by the labeled sample; and determining the confidence
interval of the multiple labeled samples based on the sample value and the sample weight of each
labeled sample.
Optionally, the processor may also perform the following steps: extracting the weight parameters of
each labeled sample, wherein the weight parameters include a bonus parameter, a scoring weight, a
deduction parameter, and a deduction weight. The bonus parameter describes the positive-review score
obtained by the account of the labeled sample; the scoring weight describes the ratio of the
positive-review score of the account to the credit of the account; the deduction parameter describes
the deduction score obtained by the account; and the deduction weight describes the ratio of the number
of penalized accounts on the terminal where the account is located to the number of all accounts on
that terminal. The weight model is obtained based on the bonus parameter, the scoring weight, the
deduction parameter, and the deduction weight.
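A minimal Python sketch of such a weight model follows. It uses the form stated in claim 3,
fw(g, p) = w1 * (g - a * w2 * p). The expression for the coefficient `a` is not available in this
text, so the sketch takes `a` as a plain input. The normalization rule (clipping negative raw weights
to zero and dividing by the sum) is an assumption for illustration, not the method of the application.

```python
from typing import List


def raw_weight(g: float, p: float, w1: float, w2: float, a: float) -> float:
    """Weight model in the form given in claim 3: w1 * (g - a * w2 * p).

    g: bonus parameter, p: deduction parameter,
    w1: scoring weight, w2: deduction weight,
    a: coefficient supplied by the caller (its defining formula is not reproduced here).
    """
    return w1 * (g - a * w2 * p)


def normalize(weights: List[float]) -> List[float]:
    # Assumption: negative raw weights are clipped to zero before normalizing.
    clipped = [max(w, 0.0) for w in weights]
    total = sum(clipped)
    if total == 0.0:
        raise ValueError("all weights are zero after clipping")
    return [w / total for w in clipped]


# Two hypothetical accounts with different positive-review and deduction scores.
raws = [
    raw_weight(g=10.0, p=2.0, w1=0.5, w2=0.5, a=1.0),
    raw_weight(g=4.0, p=2.0, w1=0.5, w2=0.5, a=1.0),
]
print(normalize(raws))
```

The normalized values sum to one, so they can be used directly as sample weights when the confidence
interval is determined.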
Optionally, the processor may also perform the following steps: establishing an interval determination
model for the labeled samples using a Gaussian distribution; and obtaining the confidence interval
corresponding to the interval determination model.
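One way to picture a Gaussian interval determination model that takes sample weights into account is
the following Python sketch. It assumes a weighted mean and weighted variance and a two-sided interval
at a chosen confidence level; the function name, the 0.95 default, and the weighting formula are
illustrative assumptions rather than details given in the text.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def weighted_confidence_interval(
    values: Sequence[float],
    weights: Sequence[float],
    confidence: float = 0.95,
) -> Tuple[float, float]:
    """Fit a Gaussian with weighted mean/variance and return a two-sided interval."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    mean = sum(w * v for w, v in zip(weights, values)) / total
    variance = sum(w * (v - mean) ** 2 for w, v in zip(weights, values)) / total
    std = math.sqrt(variance)
    # Quantile of the standard normal distribution for the two-sided level.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return mean - z * std, mean + z * std


# A low-credibility sample (weight 0) does not widen the interval.
low, high = weighted_confidence_interval([100.0, 102.0, 98.0, 1000.0], [1.0, 1.0, 1.0, 0.0])
print(low, high)
```

Because the outlying value carries zero weight in this example, the interval stays centered on 100,
which mirrors the stated goal of reducing the influence of unreliable samples.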
Optionally, the processor may also perform the following steps: obtaining sample data and, if the
sample data carries no positive/negative example label, labeling the sample data as positive or
negative examples to obtain labeled sample data; and performing data filtering and normalization on
the labeled sample data to obtain the labeled samples.
Optionally, the processor may also perform the following steps: obtaining a sequence sorted by the
sample values of the labeled sample data, and calculating the mean and standard deviation of the
labeled sample data in the sequence; sequentially obtaining a judgment value for each labeled sample
datum in the sequence, wherein the judgment value is determined from the standard deviation of the
labeled sample data, the sample value of the labeled sample datum, and the mean of the sample values of
the multiple labeled sample data; and, if the judgment value is greater than a pre-obtained rejection
region, rejecting the labeled sample datum corresponding to the judgment value and returning to the
step of obtaining the sequence sorted by the sample values of the labeled sample data and calculating
the mean and standard deviation of the labeled sample data in the sequence, until the judgment value is
not greater than the rejection region.
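The iterative filtering described above can be sketched in Python as follows. The judgment value is
assumed here to be |x - mean(X)| / std(X), which uses the three quantities named in the text; the exact
formula, the use of the population standard deviation, and the threshold value are assumptions for
illustration.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def reject_outliers(values: Sequence[float], rejection_threshold: float) -> List[float]:
    """Repeatedly drop the first sample whose judgment value exceeds the threshold."""
    kept = sorted(values)  # sequence sorted by sample value
    while len(kept) > 1:
        mu = mean(kept)
        sigma = pstdev(kept)
        if sigma == 0:
            break  # all remaining values are identical; nothing to reject
        for i, x in enumerate(kept):
            judgment = abs(x - mu) / sigma  # assumed form of the judgment value
            if judgment > rejection_threshold:
                del kept[i]  # reject this sample, then recompute mean and std
                break
        else:
            break  # no judgment value exceeded the threshold
    return kept


print(reject_outliers([10, 11, 9, 10, 12, 500], rejection_threshold=2.0))
```

Recomputing the mean and standard deviation after each rejection matters because a single extreme
value inflates the standard deviation and can mask other outliers.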
Optionally, the processor may also perform the following steps: after the second judgment result is
obtained, performing an accuracy check on the second judgment result to obtain the accuracy of the
second judgment result; if the accuracy is below a preset threshold, adjusting the sample weights of
the labeled samples; and re-determining the confidence interval based on the adjusted sample weights.
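The feedback loop above can be sketched in Python as follows. The text does not say how the weights
are adjusted or how the interval is rebuilt, so both are passed in as caller-supplied functions; the
names `refine_interval`, `build_interval`, and `adjust_weights`, and the `max_rounds` limit, are
hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def refine_interval(
    check_values: Sequence[float],
    check_labels: Sequence[bool],
    weights: List[float],
    build_interval: Callable[[List[float]], Tuple[float, float]],
    adjust_weights: Callable[[List[float]], List[float]],
    min_accuracy: float,
    max_rounds: int = 5,
) -> Tuple[Tuple[float, float], List[float]]:
    """Re-determine the interval until the judgment accuracy reaches the threshold."""
    interval = build_interval(weights)
    for _ in range(max_rounds):
        # Second judgment for each checked value (assumed: valid iff inside the interval).
        predicted = [interval[0] <= v <= interval[1] for v in check_values]
        accuracy = sum(p == y for p, y in zip(predicted, check_labels)) / len(check_labels)
        if accuracy >= min_accuracy:
            break
        weights = adjust_weights(weights)    # adjust the sample weights
        interval = build_interval(weights)   # re-determine the confidence interval
    return interval, weights


# Toy stand-ins: the interval width scales with the single weight; adjusting halves it.
result = refine_interval(
    check_values=[5.0, 15.0],
    check_labels=[True, False],
    weights=[4.0],
    build_interval=lambda w: (0.0, 10.0 * w[0]),
    adjust_weights=lambda w: [x / 2.0 for x in w],
    min_accuracy=1.0,
)
print(result)
```

The round limit is a practical safeguard in the sketch so that the loop ends even if the accuracy
threshold is never reached.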
The embodiments of the present application provide a scheme for detecting data in which sample weights
are obtained by training and the confidence interval is determined based on those sample weights. In
this scheme, the sample weights distinguish the importance of the labeled samples when the confidence
interval is determined. That is, the positive influence of more credible data among the labeled samples
on the confidence interval is increased, and the disturbing influence of unreliable data among the
labeled samples is reduced, so that the confidence interval is close to the true confidence interval
of the data. Using this confidence interval to classify all kinds of data to be verified automatically
ensures the accuracy, stability, and reliability of the estimate. The present application thus solves
the problem of low accuracy of check results when verifying the legitimacy of data and achieves the
effect of verifying data legitimacy accurately.
Those of ordinary skill in the art will understand that the structure shown in Fig. 8 is only
illustrative. The computer terminal shown in Fig. 8 may also be a terminal device such as a smartphone
(e.g., an Android phone or an iOS phone), a tablet computer, a palmtop computer, a mobile Internet
device (Mobile Internet Device, MID), or a PAD.
Those of ordinary skill in the art will also understand that all or part of the steps of the methods in
the above embodiments may be completed by a program instructing hardware related to the terminal
device. The program may be stored in a computer-readable storage medium, which may include a flash
disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a
magnetic disk, an optical disc, or the like.
Embodiment 4
Embodiments of the present application also provide a storage medium. Optionally, in this embodiment,
the storage medium may be used to store the program code executed by the method for detecting data
provided in Embodiment 1.
Optionally, in this embodiment, the storage medium may be located in any computer terminal of a
computer terminal group in a computer network, or in any mobile terminal of a mobile terminal group.
Optionally, in this embodiment, the storage medium is configured to store program code for performing
the following steps: reading a confidence interval determined based on sample weights of labeled
samples, wherein the sample weights are obtained by training pre-acquired labeled samples with a weight
model; judging whether the data to be verified is within the confidence interval to obtain a first
judgment result; and judging, according to the first judgment result, whether the data to be verified
is valid data to obtain a second judgment result.
Optionally, in this embodiment, the storage medium is configured to store program code that is further
used to perform the following steps: before reading the confidence interval determined based on the
sample weights of the labeled samples, obtaining multiple labeled samples, wherein each labeled sample
has a sample value; extracting the attribute data of each labeled sample and establishing a weight
model based on the attribute data of each labeled sample; training the labeled samples with the weight
model to obtain the sample weight of each labeled sample; extracting the sample value of each labeled
sample, wherein the sample value characterizes the virtual resource parameter corresponding to the
object described by the labeled sample; and determining the confidence interval of the multiple labeled
samples based on the sample value and the sample weight of each labeled sample.
Optionally, in this embodiment, the storage medium is configured to store program code that is further
used to perform the following steps: establishing an interval determination model for the labeled
samples using a Gaussian distribution; and obtaining the confidence interval corresponding to the
interval determination model.
Optionally, in this embodiment, the storage medium is configured to store program code that is further
used to perform the following steps: obtaining sample data and, if the sample data carries no
positive/negative example label, labeling the sample data as positive or negative examples to obtain
labeled sample data; and performing data filtering and normalization on the labeled sample data to
obtain the labeled samples.
Optionally, in this embodiment, the storage medium is configured to store program code that is further
used to perform the following steps: obtaining a sequence sorted by the sample values of the labeled
sample data, and calculating the mean and standard deviation of the labeled sample data in the
sequence; sequentially obtaining a judgment value for each labeled sample datum in the sequence,
wherein the judgment value is determined from the standard deviation of the labeled sample data, the
sample value of the labeled sample datum, and the mean of the sample values of the multiple labeled
sample data; and, if the judgment value is greater than a pre-obtained rejection region, rejecting the
labeled sample datum corresponding to the judgment value and returning to the step of obtaining the
sequence sorted by the sample values of the labeled sample data and calculating the mean and standard
deviation of the labeled sample data in the sequence, until the judgment value is not greater than the
rejection region.
Optionally, in this embodiment, the storage medium is configured to store program code that is further
used to perform the following steps: after the second judgment result is obtained, performing an
accuracy check on the second judgment result to obtain the accuracy of the second judgment result; if
the accuracy is below a preset threshold, adjusting the sample weights of the labeled samples; and
re-determining the confidence interval based on the adjusted sample weights.
The embodiments of the present application provide a scheme for detecting data in which sample weights
are obtained by training and the confidence interval is determined based on those sample weights. In
this scheme, the sample weights distinguish the importance of the labeled samples when the confidence
interval is determined. That is, the positive influence of more credible data among the labeled samples
on the confidence interval is increased, and the disturbing influence of unreliable data among the
labeled samples is reduced, so that the confidence interval is close to the true confidence interval
of the data. Using this confidence interval to classify all kinds of data to be verified automatically
ensures the accuracy, stability, and reliability of the estimate. The present application thus solves
the problem of low accuracy of check results when verifying the legitimacy of data and achieves the
effect of verifying data legitimacy accurately.
The serial numbers of the above embodiments of the present application are for description only and do
not indicate the relative merits of the embodiments.
In the above embodiments of the present application, the description of each embodiment has its own
emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of
the other embodiments.
It should be understood that the technical content disclosed in the several embodiments provided in the
present application may be implemented in other ways. The device embodiments described above are only
illustrative. For example, the division into units is only a division by logical function, and other
divisions are possible in an actual implementation: multiple units or components may be combined or
integrated into another system, or some features may be omitted or not executed. In addition, the
mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect
coupling or communication connection through some interfaces, units, or modules, and may be electrical
or take other forms.
The units described as separate components may or may not be physically separate, and the components
shown as units may or may not be physical units; that is, they may be located in one place or
distributed over multiple network units. Some or all of the units may be selected according to actual
needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into
one processing unit, each unit may exist physically on its own, or two or more units may be integrated
into one unit. The integrated unit may be implemented in the form of hardware or in the form of a
software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an
independent product, it may be stored in a computer-readable storage medium. Based on this
understanding, the technical solution of the present application in essence, or the part of it that
contributes to the prior art, or all or part of the technical solution, may be embodied in the form of
a software product. The computer software product is stored in a storage medium and includes several
instructions that cause a computer device (which may be a personal computer, a server, a network
device, or the like) to execute all or part of the steps of the methods described in the embodiments of
the present application. The aforementioned storage medium includes various media that can store
program code, such as a USB flash drive, read-only memory (ROM, Read-Only Memory), random access memory
(RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disc.
The above are only preferred implementations of the present application. It should be noted that those
of ordinary skill in the art may make several improvements and refinements without departing from the
principle of the present application, and these improvements and refinements shall also be regarded as
falling within the scope of protection of the present application.
Claims (16)
1. A method for detecting data, characterized by comprising:
reading a confidence interval determined based on sample weights of labeled samples, wherein the
sample weights are obtained by training the pre-acquired labeled samples with a weight model;
judging whether data to be verified is within the confidence interval to obtain a first judgment
result; and
judging, according to the first judgment result, whether the data to be verified is valid data to
obtain a second judgment result.
2. The method according to claim 1, characterized in that, before reading the confidence interval
determined based on the sample weights of the labeled samples, the method comprises:
obtaining multiple labeled samples, wherein each labeled sample has a sample value;
extracting attribute data of each labeled sample, and establishing the weight model based on the
attribute data of each labeled sample;
training the labeled samples with the weight model to obtain the sample weight of each labeled sample;
extracting the sample value of each labeled sample, wherein the sample value characterizes the virtual
resource parameter corresponding to the object described by the labeled sample; and
determining the confidence interval of the multiple labeled samples based on the sample value and the
sample weight of each labeled sample.
3. The method according to claim 2, characterized in that extracting the attribute data of each labeled
sample and establishing the weight model based on the attribute data of each labeled sample comprises:
extracting weight parameters of each labeled sample, wherein the weight parameters include a bonus
parameter, a scoring weight, a deduction parameter, and a deduction weight; the bonus parameter
describes the positive-review score obtained by the account of the labeled sample; the scoring weight
describes the ratio of the positive-review score of the account to the credit of the account; the
deduction parameter describes the deduction score obtained by the account; and the deduction weight
describes the ratio of the number of penalized accounts on the terminal where the account is located
to the number of all accounts on the terminal; and
obtaining the weight model based on the bonus parameter, the scoring weight, the deduction parameter,
and the deduction weight, wherein the weight model is:
$$f_w(g, p) = w_1 \cdot \left(g - a \cdot w_2 \cdot p\right),$$
wherein a = [expression not recoverable from the source text]; g is the bonus parameter; p is the
deduction parameter; $w_1$ is the scoring weight; $w_2$ is the deduction weight; and $\beta_1$ and
$\beta_2$ are the learning parameters of the weight model.
4. The method according to claim 2 or 3, characterized in that training the labeled samples with the
weight model to obtain the sample weight of each labeled sample comprises:
training the labeled samples with the weight model $f_w(g, p)$ to determine the learning parameters
$\beta_1$ and $\beta_2$ of the weight model;
calculating the weight of each labeled sample with the weight model $f_w(g, p)$ whose learning
parameters have been determined; and
normalizing the weight of each labeled sample to obtain the sample weights of the labeled samples.
5. The method according to claim 2, characterized in that determining the confidence interval of the
multiple labeled samples based on the sample value and the sample weight of each labeled sample
comprises:
establishing an interval determination model for the labeled samples using a Gaussian distribution; and
obtaining the confidence interval corresponding to the interval determination model.
6. The method according to claim 2, characterized in that obtaining the multiple labeled samples
comprises:
obtaining sample data and, if the sample data carries no positive/negative example label, labeling the
sample data as positive or negative examples to obtain labeled sample data; and
performing data filtering and normalization on the labeled sample data to obtain the labeled samples.
7. The method according to claim 6, characterized in that performing data filtering on the labeled
sample data comprises:
obtaining a sequence sorted by the sample values of the labeled sample data, and calculating the mean
and standard deviation of the labeled sample data in the sequence;
sequentially obtaining a judgment value δ for each labeled sample datum in the sequence, wherein the
judgment value is determined from the standard deviation of the labeled sample data, the sample value
of the labeled sample datum, and the mean of the sample values of the multiple labeled sample data; and
if the judgment value is greater than a pre-obtained rejection region, rejecting the labeled sample
datum corresponding to the judgment value and returning to the step of obtaining the sequence sorted
by the sample values of the labeled sample data and calculating the mean and standard deviation of the
labeled sample data in the sequence, until the judgment value is not greater than the rejection region.
8. The method according to claim 1, characterized in that, after the second judgment result is
obtained, the method comprises:
performing an accuracy check on the second judgment result to obtain the accuracy of the second
judgment result;
if the accuracy is below a preset threshold, adjusting the sample weights of the labeled samples; and
re-determining the confidence interval based on the adjusted sample weights.
9. A device for detecting data, characterized by comprising:
a reading module, configured to read a confidence interval determined based on sample weights of
labeled samples, wherein the sample weights are obtained by training the pre-acquired labeled samples
with a weight model;
a first judgment module, configured to judge whether data to be verified is within the confidence
interval to obtain a first judgment result; and
a second judgment module, configured to judge, according to the first judgment result, whether the
data to be verified is valid data to obtain a second judgment result.
10. The device according to claim 9, characterized in that the device comprises:
a sample acquisition module, configured to obtain multiple labeled samples before the confidence
interval determined based on the sample weights of the labeled samples is read, wherein each labeled
sample has a sample value;
a first extraction module, configured to extract attribute data of each labeled sample and establish
the weight model based on the attribute data of each labeled sample;
a training module, configured to train the labeled samples with the weight model to obtain the sample
weight of each labeled sample;
a second extraction module, configured to extract the sample value of each labeled sample, wherein the
sample value characterizes the virtual resource parameter corresponding to the object described by the
labeled sample; and
a first determination module, configured to determine the confidence interval of the multiple labeled
samples based on the sample value and the sample weight of each labeled sample.
11. The device according to claim 10, characterized in that the first extraction module comprises:
a first extraction submodule, configured to extract weight parameters of each labeled sample, wherein
the weight parameters include a bonus parameter, a scoring weight, a deduction parameter, and a
deduction weight; the bonus parameter describes the positive-review score obtained by the account of
the labeled sample; the scoring weight describes the ratio of the positive-review score of the account
to the credit of the account; the deduction parameter describes the deduction score obtained by the
account; and the deduction weight describes the ratio of the number of penalized accounts on the
terminal where the account is located to the number of all accounts on the terminal; and
a model acquisition module, configured to obtain the weight model based on the bonus parameter, the
scoring weight, the deduction parameter, and the deduction weight, wherein the weight model is:
$$f_w(g, p) = w_1 \cdot \left(g - a \cdot w_2 \cdot p\right),$$
wherein a = [expression not recoverable from the source text]; g is the bonus parameter; p is the
deduction parameter; $w_1$ is the scoring weight; $w_2$ is the deduction weight; and $\beta_1$ and
$\beta_2$ are the learning parameters of the weight model.
12. The device according to claim 10 or 11, characterized in that the training module comprises:
a second determination module, configured to train the labeled samples with the weight model
$f_w(g, p)$ to determine the learning parameters $\beta_1$ and $\beta_2$ of the weight model;
a third determination module, configured to calculate the weight of each labeled sample with the
weight model $f_w(g, p)$ whose learning parameters have been determined; and
a processing module, configured to normalize the weight of each labeled sample to obtain the sample
weights of the labeled samples.
13. The device according to claim 10, characterized in that the first determination module comprises:
an establishing module, configured to establish an interval determination model for the labeled
samples using a Gaussian distribution; and
an interval acquisition module, configured to obtain the confidence interval corresponding to the
interval determination model.
14. The device according to claim 10, characterized in that the sample acquisition module comprises:
a data acquisition module, configured to obtain sample data and, if the sample data carries no
positive/negative example label, label the sample data as positive or negative examples to obtain
labeled sample data; and
a preprocessing module, configured to perform data filtering and normalization on the labeled sample
data to obtain the labeled samples.
15. The device according to claim 14, characterized in that the preprocessing module comprises:
a first processing module, configured to obtain a sequence sorted by the sample values of the labeled
sample data and calculate the mean and standard deviation of the labeled sample data in the sequence;
a judgment value acquisition module, configured to sequentially obtain a judgment value δ for each
labeled sample datum in the sequence, wherein the judgment value is determined from the standard
deviation of the labeled sample data, the sample value of the labeled sample datum, and the mean of the
sample values of the multiple labeled sample data; and
a second processing module, configured to, if the judgment value is greater than a pre-obtained
rejection region, reject the labeled sample datum corresponding to the judgment value and return to
the step of obtaining the sequence sorted by the sample values of the labeled sample data and
calculating the mean and standard deviation of the labeled sample data in the sequence, until the
judgment value is not greater than the rejection region.
16. The device according to claim 9, characterized in that the device further comprises:
a verification acquisition module, configured to, after the second judgment result is obtained,
perform an accuracy check on the second judgment result to obtain the accuracy of the second judgment
result;
an adjustment module, configured to adjust the sample weights of the labeled samples if the accuracy
is below a preset threshold; and
a re-determination module, configured to re-determine the confidence interval based on the adjusted
sample weights.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510552641.2A CN106485528A (en) | 2015-09-01 | 2015-09-01 | The method and apparatus of detection data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510552641.2A CN106485528A (en) | 2015-09-01 | 2015-09-01 | The method and apparatus of detection data |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106485528A true CN106485528A (en) | 2017-03-08 |
Family
ID=58238010
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510552641.2A Pending CN106485528A (en) | 2015-09-01 | 2015-09-01 | The method and apparatus of detection data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106485528A (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102243744A (en) * | 2010-05-11 | 2011-11-16 | 腾讯科技(深圳)有限公司 | Commodity auditing method and device |
Cited By (21)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109426834A (en) * | 2017-08-31 | 2019-03-05 | 佳能株式会社 | Information processing unit, information processing method and information processing system |
US11636378B2 (en) | 2017-08-31 | 2023-04-25 | Canon Kabushiki Kaisha | Information processing apparatus, information processing method, and information processing system |
CN109426834B (en) * | 2017-08-31 | 2022-05-31 | 佳能株式会社 | Information processing apparatus, information processing method, and information processing system |
WO2019056496A1 (en) * | 2017-09-25 | 2019-03-28 | 平安科技(深圳)有限公司 | Method for generating picture review probability interval and method for picture review determination |
CN109211564B (en) * | 2018-10-24 | 2020-05-01 | 哈工大机器人(山东)智能装备研究院 | Self-adaptive threshold detection method for health assessment of ball screw pair |
CN109446046A (en) * | 2018-10-24 | 2019-03-08 | 哈工大机器人(山东)智能装备研究院 | It is a kind of based on very poor adaptive threshold method and system |
CN109211564A (en) * | 2018-10-24 | 2019-01-15 | 哈工大机器人(山东)智能装备研究院 | A kind of adaptive thresholding value detection method for ball screw assembly, health evaluating |
CN109446046B (en) * | 2018-10-24 | 2021-07-20 | 哈工大机器人(山东)智能装备研究院 | Self-adaptive threshold value method and system based on range difference |
CN109406950A (en) * | 2018-12-21 | 2019-03-01 | 云南电网有限责任公司电力科学研究院 | A kind of evaluation method of power distribution network concentrated feed line automatization system |
CN110060247A (en) * | 2019-04-18 | 2019-07-26 | 深圳市深视创新科技有限公司 | Cope with the robust deep neural network learning method of sample marking error |
CN110222244A (en) * | 2019-05-29 | 2019-09-10 | 第四范式(北京)技术有限公司 | A kind of the audit method for pushing and device of labeled data |
CN110414688A (en) * | 2019-07-29 | 2019-11-05 | 卓尔智联(武汉)研究院有限公司 | Information analysis method, device, server and storage medium |
CN110838040A (en) * | 2019-10-11 | 2020-02-25 | 苏宁云计算有限公司 | Price data monitoring method and device, computer equipment and storage medium |
CN110717028A (en) * | 2019-10-18 | 2020-01-21 | 支付宝(杭州)信息技术有限公司 | Method and system for eliminating interference problem pairs |
CN110717028B (en) * | 2019-10-18 | 2022-02-15 | 支付宝(杭州)信息技术有限公司 | Method and system for eliminating interference problem pairs |
CN111865720A (en) * | 2020-07-20 | 2020-10-30 | 北京百度网讯科技有限公司 | Method, apparatus, device and storage medium for processing request |
CN114818326A (en) * | 2022-04-26 | 2022-07-29 | 山东交控科技有限公司 | Verification method and device for urban rail transit electronic map |
CN115314412A (en) * | 2022-06-22 | 2022-11-08 | 北京邮电大学 | Operation and maintenance-oriented type-adaptive index prediction early warning method and device |
CN115314412B (en) * | 2022-06-22 | 2023-09-05 | 北京邮电大学 | Operation-and-maintenance-oriented type self-adaptive index prediction and early warning method and device |
CN117150576A (en) * | 2023-10-30 | 2023-12-01 | 易签链(深圳)科技有限公司 | Intelligent verification system and method for block chain electronic seal |
CN117150576B (en) * | 2023-10-30 | 2024-02-09 | 易签链(深圳)科技有限公司 | Intelligent verification system and method for block chain electronic seal |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106485528A (en) | The method and apparatus of detection data | |
CN110222170B (en) | Method, device, storage medium and computer equipment for identifying sensitive data | |
TWI739798B (en) | Method and device for establishing data recognition model | |
CN108173708A (en) | Anomalous traffic detection method, device and storage medium based on incremental learning | |
CN109492945A (en) | Business risk identifies monitoring method, device, equipment and storage medium | |
CN110309840A (en) | Risk trade recognition methods, device, server and storage medium | |
CN108229580A (en) | Sugared net ranking of features device in a kind of eyeground figure based on attention mechanism and Fusion Features | |
CN110246031A (en) | Appraisal procedure, system, equipment and the storage medium of business standing | |
CN111028016A (en) | Sales data prediction method and device and related equipment | |
CN111309822B (en) | User identity recognition method and device | |
CN107633030A (en) | Credit estimation method and device based on data model | |
CN114386856B (en) | Method, device and equipment for identifying empty shell enterprises and computer storage medium | |
CN111476296A (en) | Sample generation method, classification model training method, identification method and corresponding devices | |
CN107633455A (en) | Credit estimation method and device based on data model | |
CN110443350A (en) | Model quality detection method, device, terminal and medium based on data analysis | |
CN107346515A (en) | A kind of credit card Forecasting Methodology and device by stages | |
CN109461068A (en) | Judgment method, device, equipment and the computer readable storage medium of fraud | |
CN114638688A (en) | Interception strategy derivation method and system for credit anti-fraud | |
CN110009012A (en) | A kind of risk specimen discerning method, apparatus and electronic equipment | |
CN108629508A (en) | Credit risk sorting technique, device, computer equipment and storage medium | |
CN112598225A (en) | Evaluation index determination method and apparatus, storage medium, and electronic apparatus | |
CN108492049A (en) | A kind of system for the P2P platform operation risk assessment that logic-based returns | |
CN108229011A (en) | A kind of shale lithofacies development Dominated Factors judgment method, equipment and storage device | |
CN110046951A (en) | A kind of trading activity judgment method and system | |
CN106897743A (en) | The anti-cheating big data detection method of movable attendance checking based on Bayesian model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170308 |