CN117473717B - A data quality analysis method based on Bernoulli-Gaussian model and EM algorithm - Google Patents
- Publication number
- CN117473717B
- Authority
- CN
- China
- Prior art keywords
- bernoulli
- data quality
- gaussian model
- algorithm
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F30/00—Computer-aided design [CAD]
- G06F30/20—Design optimisation, verification or simulation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/11—Complex mathematical operations for solving equations, e.g. nonlinear equations, general mathematical optimization problems
- G06F17/12—Simultaneous equations, e.g. systems of linear equations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/16—Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/10—Complex mathematical operations
- G06F17/18—Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2111/00—Details relating to CAD techniques
- G06F2111/08—Probabilistic or stochastic CAD
Abstract
The invention discloses a data quality analysis method based on a Bernoulli-Gaussian model and the EM algorithm, relating to the technical field of spatial geography. The method comprises preprocessing collected geodetic data and constructing a Bernoulli-Gaussian model from the statistics of the gross errors, then calculating the Bernoulli-Gaussian model parameters based on a linear observation equation and analyzing the data quality of the geodetic measurements. By calculating the Bernoulli-Gaussian model parameters in a linear observation equation, the invention obtains the precision of the observations as well as the gross error rate, the gross error size and other factors. No threshold needs to be introduced to distinguish abnormal values from normal values, which avoids human intervention in the data quality and makes the data quality analysis result more scientific and reliable.
Description
Technical Field
The invention relates to the technical field of spatial geography, and in particular to a data quality analysis method based on a Bernoulli-Gaussian model and the EM algorithm.
Background
In modern geodetic practice, a large number of observations have been recorded or sampled in recent years. Such data sets are almost never free of outliers, so outlier handling has become part of the geodetic operator's daily work. From a statistical point of view, outliers are considered to arise because the observation model fails to provide an adequate fit or statistical interpretation. Beckman and Cook (1983) divide the inadequacy of observation models into two categories: local model defects and global model defects. Local model defects are causes that concern only the outlying observations rather than the entire model; they may require outliers to be handled separately, because the alternative model containing the outliers is generally unknown. Global model defects are causes that lead to the existing model being replaced by a new or revised model for the entire sample; they treat outliers as a known property of the data and may lead to the existing model being replaced by a mixture model.
Observation models can therefore be divided into two categories according to the cause of the outliers. In general, a local model defect indicates that outliers should be handled separately from normal observations. In this case, in addition to the existing normal model that describes the normal observations, there is an anomaly model that explains the outliers. Many different normal-anomaly models exist; the most common are the two models proposed by Dixon (1950). One is the mean shift model, which regards an outlier as the result of a shift in the mean; the other is the variance inflation model, which regards an outlier as the result of an inflated variance (Lehmann et al. 2020). Conversely, a global model defect requires the existing model to be replaced by a new or revised model for the whole sample, in which case a mixture model is usually used to describe normal and abnormal observations jointly. Hawkins (1980) describes two common mixture models, the location contamination model and the scale contamination model (Lehmann 2013): the former consists of a normal distribution and a location-contaminated distribution, and the latter of a normal distribution and a scale-contaminated distribution.
However, the solutions in common use today have a drawback. In geodesy, outliers are usually caused by gross errors, so it is natural to explain the cause of outliers in geodetic data processing through the nature of gross errors. In this case, a gross error model plays an important role in establishing the distinction and the connection between the different observation models.
Disclosure of Invention
The present invention has been made in view of the following problem in the prior art: when outliers are processed in a data set obtained in geodetic practice, a threshold is introduced to distinguish outliers from normal values, which amounts to human intervention in the data quality.
The problem to be solved by the invention is therefore how to distinguish abnormal values from normal values without introducing any threshold, thereby avoiding human intervention in the data quality and making the data quality analysis result more scientific and reliable.
In order to solve the technical problems, the invention provides the following technical scheme:
In a first aspect, an embodiment of the present invention provides a data quality analysis method based on a Bernoulli-Gaussian model and the EM algorithm, which comprises: preprocessing collected geodetic data and constructing the Bernoulli-Gaussian model from the statistics of the gross errors; constructing a mixture model of the observations in a linear observation equation based on the Expectation-Maximization (EM) algorithm; and calculating the Bernoulli-Gaussian model parameters based on the linear observation equation and analyzing the data quality of the geodetic measurements.
As a preferred scheme of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm, the gross error comprises a Bernoulli variable and a Gaussian variable, and its decomposition is calculated as follows:
where e_g is the gross error, Z is the pattern matrix, and the accompanying vector is the size vector.
As a preferred scheme of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm, the pattern matrix Z is a diagonal matrix that can take 2^m possible values, with the specific formula as follows:
Z = Z_i, i ∈ {0, …, 2^m − 1}
where the j-th diagonal element of the pattern matrix Z obeys a Bernoulli distribution with parameter ε_j. The probability distribution of the pattern matrix Z is as follows:
where Z_ij is the j-th diagonal element of Z_i and ε_j is the gross error rate of the j-th observation. The size vector obeys a multidimensional Gaussian distribution, with the specific formula as follows:
where the two parameters are the mean vector of the gross error size and the variance-covariance matrix of the gross error size.
As a preferred scheme of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm, an observation comprises the true value, the random error and the gross error, and the specific formula of the observation is as follows:
y = Ax + e + e_g
where y is the observation vector, A is a non-random design matrix, x is the parameter vector to be estimated, and e is a random error vector with zero mean and variance-covariance matrix Σ.
As a preferred scheme of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm, constructing the mixture model of the observations in the linear observation equation comprises the following steps:
According to the additivity of the Gaussian distribution, the probability distribution of the observation y is as follows:
where the left-hand side is the probability distribution of the observation y, and x, Σ, ε and the parameters of the size vector are the parameters of the mixture model to be estimated. These parameters are specified as follows. If the gross error rates of different observations within one type of observation equation are the same, then:
ε_j = ε, j ∈ {1, ..., m}
Σ and the variance-covariance matrix of the size vector are each separated into a known cofactor matrix and an unknown factor, with the specific formula as follows:
Σ = σ^2 Q
where Q and its counterpart for the size vector are known cofactor vectors or matrices. The probability distribution in terms of the converted parameters to be estimated is as follows:
where the left-hand side is the probability distribution of y given x, σ^2, ε and the remaining converted parameters.
As a preferred scheme of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm, the Bernoulli-Gaussian model parameters are calculated from the linear observation equation using the Expectation-Maximization algorithm: initial values of x, σ^2, ε and the remaining parameters are given, and each iteration of the algorithm is divided into an E step and an M step.
As a preferred scheme of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm, the calculation formula of the E step is as follows:
where γ(Z_i) is the posterior probability of Z_i. The calculation formula of the M step is as follows:
The calculation formulas of M_i and N_i are as follows:
In the M step, the parameter values on the right-hand side are taken from the estimates of the previous iteration, and the values on the left-hand side become the new parameter estimates. This is repeated until the iteration converges.
In a second aspect, to further address the problems existing in geodetic measurement, an embodiment provides a system for data quality analysis based on the Bernoulli-Gaussian model and the EM algorithm. The system comprises a Bernoulli-Gaussian model module, which decomposes the gross error and obtains the probability mass function of whether a gross error occurs together with the associated probability density function; a mixture model construction module, which constructs the mixture model of the observations in a linear observation equation; and a parameter evaluation module, which calculates the values of the Bernoulli-Gaussian model parameters in the linear observation equation and analyzes the data quality of the geodetic measurements.
In a third aspect, an embodiment of the present invention provides a computer device, comprising a memory and a processor, the memory storing a computer program, wherein the computer program when executed by the processor implements any step of the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm according to the first aspect of the present invention.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium having a computer program stored thereon, wherein the computer program when executed by a processor implements any of the steps of the data quality analysis method according to the first aspect of the present invention based on Bernoulli-Gaussian model and EM algorithm.
The beneficial effects of the invention are as follows. The invention provides a Bernoulli-Gaussian (BG) statistical model of gross errors and, based on the EM (Expectation-Maximization) algorithm, a method for estimating the BG model parameters in a linear observation equation. The BG model parameters can be estimated in a single observation equation or in several observation equations. The invention obtains not only the precision of the observations but also the gross error rate, the gross error size and other factors, providing a comprehensive means of analyzing geodetic data quality. No threshold needs to be introduced to distinguish abnormal values from normal values, which avoids human intervention in the data quality and makes the data quality analysis result more scientific and reliable.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art. Wherein:
Fig. 1 shows the PDFs estimated during the EM algorithm iterations in Example 1.
Fig. 2 shows the histogram and the estimated PDFs of the Gaussian model and the mixture model for Case 1 in Example 2.
Fig. 3 shows the histogram and the estimated PDFs of the Gaussian model and the mixture model for Case 2 in Example 2.
Fig. 4 shows the histogram and the estimated PDFs of the Gaussian model and the mixture model for Case 3 in Example 2.
Detailed Description
In order that the above-recited objects, features and advantages of the present invention will become more readily apparent, a more particular description of the invention will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, but the present invention may be practiced in other ways other than those described herein, and persons skilled in the art will readily appreciate that the present invention is not limited to the specific embodiments disclosed below.
Further, reference herein to "one embodiment" or "an embodiment" means that a particular feature, structure, or characteristic can be included in at least one implementation of the invention. The appearances of the phrase "in one embodiment" in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments.
Example 1
Referring to fig. 1, a first embodiment of the present invention provides a data quality analysis method based on a Bernoulli-Gaussian model and an EM algorithm, comprising the steps of:
S1, preprocessing collected geodetic data and constructing a Bernoulli-Gaussian model from the statistics of the gross errors.
Preferably, the geodetic measurement data are collected and preprocessed, and statistics of the gross errors are compiled.
Further, the gross error comprises a Bernoulli variable and a Gaussian variable, and its decomposition is calculated as follows:
where e_g is the gross error, Z is the pattern matrix, and the accompanying vector is the size vector.
Further, the pattern matrix Z is a diagonal matrix that can take 2^m possible values, with the specific formula as follows:
Z = Z_i, i ∈ {0, …, 2^m − 1}
where the j-th diagonal element of the pattern matrix Z obeys a Bernoulli distribution with parameter ε_j. The probability distribution of the pattern matrix Z is as follows:
where Z_ij is the j-th diagonal element of Z_i, and ε_j is the gross error rate of the j-th observation.
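As a minimal sketch of the pattern probabilities described above, the following enumerates all 2^m diagonal patterns of Z for a small m. It assumes that the diagonal elements are independent, so that the probability of a pattern is the product of ε_j over the flagged observations and (1 − ε_j) over the others. The function name and the example rates are illustrative and are not taken from the patent.

```python
from itertools import product


def pattern_probabilities(eps):
    """Map every diagonal pattern of Z to its probability.

    eps[j] is the assumed gross error rate of observation j. The diagonal
    elements are treated as independent Bernoulli variables (an assumption).
    """
    probs = {}
    for pattern in product((0, 1), repeat=len(eps)):
        p = 1.0
        for z, e in zip(pattern, eps):
            # z == 1 means observation j carries a gross error
            p *= e if z == 1 else 1.0 - e
        probs[pattern] = p
    return probs


# Hypothetical rates for m = 3 observations
probs = pattern_probabilities([0.1, 0.2, 0.05])
print("number of patterns:", len(probs))
print("total probability:", sum(probs.values()))
```

Because the number of patterns grows as 2^m, explicit enumeration like this is only practical for small m.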
Preferably, the size vector obeys a multidimensional Gaussian distribution, with the specific formula as follows:
where the two parameters are the mean vector of the gross error size and the variance-covariance matrix of the gross error size.
S2, constructing a mixture model of the observations in a linear observation equation based on the Expectation-Maximization algorithm.
Preferably, an observation comprises the true value, the random error and the gross error, and the specific formula of the observation is as follows:
y = Ax + e + e_g
where y is the observation vector, A is a non-random design matrix, x is the parameter vector to be estimated, and e is a random error vector with zero mean and variance-covariance matrix Σ.
Further, constructing the mixture model of the observations in the linear observation equation comprises the following steps. According to the additivity of the Gaussian distribution, the probability distribution of the observation y is as follows:
where the left-hand side is the probability distribution of the observation y, and x, Σ, ε and the parameters of the size vector are the parameters of the mixture model to be estimated.
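The role of Gaussian additivity can be illustrated for a single scalar observation. The sketch below assumes independent observations, so that each one follows a two-component mixture: with probability 1 − ε it is Gaussian with mean μ and variance σ^2, and with probability ε the gross error adds its own mean and variance. The names `mu_s` and `var_s` for the mean and variance of the gross error size are notation introduced here, and the parameter values are hypothetical.

```python
import math


def normal_pdf(y, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def mixture_pdf(y, mu, var, eps, mu_s, var_s):
    """Two-component mixture density of one observation (scalar case)."""
    clean = normal_pdf(y, mu, var)
    # Sum of two independent Gaussians: means and variances add
    contaminated = normal_pdf(y, mu + mu_s, var + var_s)
    return (1.0 - eps) * clean + eps * contaminated


# Hypothetical parameter values, not taken from the patent tables
params = dict(mu=0.0, var=1.0, eps=0.1, mu_s=5.0, var_s=4.0)
step = 0.01
grid = [-30.0 + step * k for k in range(7001)]  # covers -30 to 40
area = sum(mixture_pdf(y, **params) for y in grid) * step
print(f"numerical integral of the mixture PDF: {area:.6f}")
```

The numerical integral is a quick sanity check that the mixture is a proper density.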
The parameters of the mixture model to be estimated are specified as follows. If the gross error rates of different observations within one type of observation equation are the same, then:
ε_j = ε, j ∈ {1, ..., m}
Σ and the variance-covariance matrix of the size vector are each separated into a known cofactor matrix and an unknown factor, with the specific formula as follows:
Σ = σ^2 Q
where Q and its counterpart for the size vector are known cofactor vectors or matrices. The probability distribution in terms of the converted parameters to be estimated is as follows:
where the left-hand side is the probability distribution of y given x, σ^2, ε and the remaining converted parameters. Specifically, consider the linear observation equation y = Ax + e + e_g, where A = [1, ..., 1]^T and x = [μ]. Numerical calculation is performed by Monte Carlo simulation (MCS). First, the true values of the observations are simulated from Ax, and random errors and gross errors are then added to them. The random errors in e are independent and identically Gaussian distributed, i.e. e ~ N(0, σ^2). The gross errors in e_g are simulated according to the BG model. The observations in y are therefore samples drawn independently from the mixture distribution, and parameter estimation by the EM algorithm acts as a PDF fitting process on these samples.
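The simulation described above can be sketched as follows for the case A = [1, ..., 1]^T and x = [μ]. This is an illustration under stated assumptions rather than the patent's own code: each observation receives an independent Gaussian random error and, with probability ε, an independent Gaussian gross error. The function name, the seed and all numerical values are hypothetical, since the values of Table 1 are not reproduced in this text.

```python
import random


def simulate_observations(n, mu, sigma, eps, mu_s, sigma_s, seed=0):
    """Simulate n observations y_j = mu + e_j + z_j * s_j."""
    rng = random.Random(seed)
    y = []
    for _ in range(n):
        e = rng.gauss(0.0, sigma)              # random error
        has_gross_error = rng.random() < eps   # Bernoulli indicator z_j
        e_g = rng.gauss(mu_s, sigma_s) if has_gross_error else 0.0
        y.append(mu + e + e_g)
    return y


# Hypothetical true values
y = simulate_observations(n=20000, mu=10.0, sigma=1.0, eps=0.1, mu_s=5.0, sigma_s=2.0)
sample_mean = sum(y) / len(y)
print(f"sample mean: {sample_mean:.3f} (uncontaminated mean would be 10.0)")
```

The sample mean is pulled away from μ by roughly ε times the mean gross error size, which is the contamination that the mixture model is meant to account for.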
S3, calculating the Bernoulli-Gaussian model parameters based on the linear observation equation, and analyzing the data quality of the geodetic measurements.
Preferably, the Bernoulli-Gaussian model parameters are calculated from the linear observation equation using the Expectation-Maximization algorithm: initial values of x, σ^2, ε and the remaining parameters are given, and each iteration of the algorithm is divided into an E step and an M step.
Further, the calculation formula of the E step is as follows:
where γ(Z_i) is the posterior probability of Z_i. The calculation formula of the M step is as follows:
where the calculation formulas of M_i and N_i are as follows:
In the M step, the parameter values on the right-hand side are taken from the estimates of the previous iteration, and the values on the left-hand side become the new parameter estimates. This is repeated until the iteration converges.
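The patent's own E-step and M-step formulas (including M_i and N_i) are not reproduced in this text, so the sketch below substitutes the textbook EM updates for a two-component univariate Gaussian mixture and maps the result back to BG-style parameters. It assumes independent scalar observations with A = [1, ..., 1]^T; the names `mu_s` and `var_s`, the starting values, and the clamping of `var_s` to a small positive number are choices made here for illustration.

```python
import math
import random


def normal_pdf(y, mean, var):
    return math.exp(-0.5 * (y - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def em_fit(y, mu, var, eps, mu_s, var_s, iterations=150):
    """Standard two-component Gaussian mixture EM (a stand-in, see lead-in)."""
    n = len(y)
    for _ in range(iterations):
        # E step: posterior probability that each sample carries a gross error
        m1, v1 = mu + mu_s, var + var_s
        gamma = []
        for yi in y:
            p0 = (1.0 - eps) * normal_pdf(yi, mu, var)
            p1 = eps * normal_pdf(yi, m1, v1)
            gamma.append(p1 / (p0 + p1))
        # M step: weighted moments of the clean and contaminated components
        n1 = sum(gamma)
        n0 = n - n1
        mu = sum((1.0 - g) * yi for g, yi in zip(gamma, y)) / n0
        var = sum((1.0 - g) * (yi - mu) ** 2 for g, yi in zip(gamma, y)) / n0
        m1 = sum(g * yi for g, yi in zip(gamma, y)) / n1
        v1 = sum(g * (yi - m1) ** 2 for g, yi in zip(gamma, y)) / n1
        # Map mixture parameters back to BG-style parameters
        eps = n1 / n
        mu_s = m1 - mu
        var_s = max(v1 - var, 1e-9)  # keep the size variance positive
    return mu, var, eps, mu_s, var_s


# Hypothetical truth: mu=10, var=1, eps=0.1, mu_s=5, var_s=4
rng = random.Random(1)
y = [10.0 + rng.gauss(0.0, 1.0) + (rng.gauss(5.0, 2.0) if rng.random() < 0.1 else 0.0)
     for _ in range(2000)]
est = em_fit(y, mu=9.0, var=2.0, eps=0.3, mu_s=3.0, var_s=2.0)
print("mu, var, eps, mu_s, var_s =", [round(v, 3) for v in est])
```

No threshold appears anywhere in the loop: each sample contributes to both components in proportion to its posterior probability, which is the property the method relies on.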
Specifically, n samples of the mixture model are simulated and the model parameters are calculated with the EM algorithm. The true values and initial values of the parameters of the mixture model are shown in Table 1:
Table 1. True values and initial values of the parameters in the mixture model
The parameters estimated during the iterations of the EM algorithm are shown in Table 2:
Table 2. Parameters estimated during the iterations of the EM algorithm
The results in Tables 1 and 2 show that the parameters calculated by the invention change from iteration to iteration and eventually converge to the true values.
Further, as shown in Fig. 1, the estimated PDF gradually changes shape during the EM iterations and finally approaches the true PDF.
This embodiment also provides a system for data quality analysis based on the Bernoulli-Gaussian model and the EM algorithm, comprising a Bernoulli-Gaussian model module, a mixture model construction module and a parameter evaluation module. The Bernoulli-Gaussian model module decomposes the gross error and obtains the probability mass function of whether a gross error occurs together with the associated probability density function. The mixture model construction module constructs the mixture model of the observations in a linear observation equation. The parameter evaluation module calculates the values of the Bernoulli-Gaussian model parameters in the linear observation equation and analyzes the data quality of the geodetic measurements.
This embodiment also provides a computer device suitable for carrying out the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm. It comprises a memory and a processor: the memory stores computer-executable instructions, and the processor executes them to implement the data quality analysis method proposed in the above embodiment.
The computer device may be a terminal comprising a processor, a memory, a communication interface, a display screen and input means connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless mode can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, can also be keys, a track ball or a touch pad arranged on the shell of the computer equipment, and can also be an external keyboard, a touch pad or a mouse and the like.
This embodiment also provides a storage medium storing a computer program which, when executed by a processor, implements the data quality analysis method based on the Bernoulli-Gaussian model and the EM algorithm proposed in the above embodiment. The storage medium may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.
In summary, the invention provides a Bernoulli-Gaussian (BG) statistical model of gross errors and, based on the EM (Expectation-Maximization) algorithm, a method for estimating the BG model parameters in a linear observation equation. The BG model parameters can be estimated in a single observation equation or in several observation equations. The invention obtains the precision of the observations as well as the gross error rate, the gross error size and other factors, providing a comprehensive means of analyzing geodetic data quality. No threshold needs to be introduced to distinguish abnormal values from normal values, which avoids human intervention in the data quality and makes the data quality analysis result more scientific and reliable.
Example 2
Referring to fig. 2 to 4, a second embodiment of the present invention is different from the first embodiment in that experimental comparison data of the present invention and the prior art are provided for verifying the beneficial effects thereof.
The comparison is carried out as follows. Samples are simulated from the mixture model with three different sets of parameter values, and the corresponding model parameters are estimated both by LS under a conventional Gaussian model and by EM under the mixture model. The true values and estimates of the LS and EM parameters for the different cases are given in Table 3.
Table 3. True values and estimates of the LS and EM parameters for the different cases
The results in Table 3 show that when the gross error rate and gross error size are relatively small (see Case 1), neither the LS nor the EM estimates differ significantly from the true values. As the gross error rate and gross error size increase (see Case 2 and Case 3), the accuracy of the LS estimates decreases significantly because LS lacks robustness. In contrast, the EM estimates are less affected by outliers and remain stable throughout, and the precision of the gross error parameters even improves when the proportion of outliers is large.
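The loss of LS accuracy can be made concrete with a short calculation. With A = [1, ..., 1]^T the LS estimate of μ is the sample mean, and under the scalar mixture assumed here its expectation is μ + ε·μ_s, while the variance of a single observation becomes σ^2 + ε·σ_s^2 + ε(1 − ε)·μ_s^2. The sketch below evaluates these two moments for three hypothetical cases; the values are illustrative and are not those of Table 3, and `mu_s` and `var_s` are notation introduced here.

```python
def ls_moments(mu, var, eps, mu_s, var_s):
    """Mean and variance of one observation under the scalar BG mixture.

    A Gaussian fit by LS targets these two moments, so they show how far
    the LS estimates drift from mu and var as contamination grows.
    """
    mean = mu + eps * mu_s
    variance = var + eps * var_s + eps * (1.0 - eps) * mu_s ** 2
    return mean, variance


# Hypothetical cases with increasing gross error rate and size
cases = {
    "Case 1": dict(mu=10.0, var=1.0, eps=0.01, mu_s=1.0, var_s=1.0),
    "Case 2": dict(mu=10.0, var=1.0, eps=0.10, mu_s=5.0, var_s=4.0),
    "Case 3": dict(mu=10.0, var=1.0, eps=0.30, mu_s=10.0, var_s=9.0),
}
for name, p in cases.items():
    mean, variance = ls_moments(**p)
    print(f"{name}: LS mean target {mean:.3f}, LS variance target {variance:.3f}")
```

In the first case both moments stay close to μ and σ^2, whereas in the later cases the mean shifts and the variance inflates, which matches the qualitative behaviour described for the LS estimates.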
As shown in Figs. 2 to 4, when the gross error rate and gross error size are relatively small, the PDFs estimated by both the Gaussian model and the mixture model are close to the histogram and fit the samples satisfactorily. As the gross error rate and gross error size increase (see Case 2 and Case 3), the PDF estimated by LS under the Gaussian model deviates visibly from the histogram: the Gaussian model loses the ability to fit samples contaminated by gross errors of larger proportion and magnitude. In contrast, the PDF of the mixture model stays close to the histogram throughout and keeps a good fit.
It should be noted that the above embodiments are only for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that the technical solution of the present invention may be modified or substituted without departing from the spirit and scope of the technical solution of the present invention, which is intended to be covered in the scope of the claims of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311354617.9A CN117473717B (en) | 2023-10-19 | 2023-10-19 | A data quality analysis method based on Bernoulli-Gaussian model and EM algorithm |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311354617.9A CN117473717B (en) | 2023-10-19 | 2023-10-19 | A data quality analysis method based on Bernoulli-Gaussian model and EM algorithm |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117473717A (en) | 2024-01-30 |
CN117473717B (en) | 2024-12-06 |
Family
ID=89632272
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311354617.9A Active CN117473717B (en) | 2023-10-19 | 2023-10-19 | A data quality analysis method based on Bernoulli-Gaussian model and EM algorithm |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117473717B (en) |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114117802A (en) * | 2021-11-29 | 2022-03-01 | 同济大学 | A method, device and medium for multiple gross error detection based on maximum a posteriori estimation |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
NO337304B1 (en) * | 2014-06-03 | 2016-03-07 | Q Free Asa | Detection of a charge object in a GNSS system with particle filter |
CN104776827B (en) * | 2015-04-03 | 2017-04-05 | 东南大学 | The Detection of Gross Errors method of GPS height anomaly data |
GB2555375B (en) * | 2016-09-30 | 2020-01-22 | Equinor Energy As | Improved methods relating to quality control |
CN109270560B (en) * | 2018-10-12 | 2022-04-26 | 东南大学 | Multi-dimensional gross error positioning and value fixing method for area elevation abnormal data |
Non-Patent Citations (1)
Title |
---|
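"Research on Array Antenna Fault Diagnosis Methods Based on Bayesian Learning"; Xu Yuhui; China Master's Theses Full-text Database (Information Science and Technology); 20230215; main text pp. 24-28 * |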
Also Published As
Publication number | Publication date |
---|---|
CN117473717A (en) | 2024-01-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Hasan et al. | Priority ranking of critical uncertainties affecting small-disturbance stability using sensitivity analysis techniques | |
Bi | A review of statistical methods for determination of relative importance of correlated predictors and identification of drivers of consumer liking | |
Andersen et al. | Extensions to the Gaussian copula: Random recovery and random factor loadings | |
Wan | Simulating survival data with predefined censoring rates for proportional hazards models | |
Polson et al. | Bayesian ℓ0-regularized least squares | |
TW202013104A (en) | Data processing method, data processing device, and computer-readable recording medium | |
CN112016826A (en) | Method and device for determining corrosion degree of transformer substation equipment and computer equipment | |
White et al. | An evaluation of point and interval estimates in population pharmacokinetics using NONMEM analysis | |
Tekwa et al. | Theory and application of an improved species richness estimator | |
Struben et al. | Parameter estimation through maximum likelihood and bootstrapping methods | |
CN117473717B (en) | A data quality analysis method based on Bernoulli-Gaussian model and EM algorithm | |
Skinner et al. | Weibull regression for lifetimes measured with error | |
Li et al. | Quantile association for bivariate survival data | |
Kelly | A review of software packages for analyzing correlated survival data | |
Sasaki et al. | Estimating sexual size dimorphism in fossil species from posterior probability densities | |
Duewer | A comparison of location estimators for interlaboratory data contaminated with value and uncertainty outliers | |
Bai et al. | Calibrating input parameters via eligibility sets | |
Elliott et al. | Weighted Dirichlet process mixture models to accommodate complex sample designs for linear and quantile regression | |
CN113095963A (en) | Real estate cost data processing method, real estate cost data processing device, computer equipment and storage medium | |
Atamanyuk et al. | Management of an agricultural enterprise on the basis of its economic state forecasting | |
Stevens et al. | Augmented measurement system assessment | |
CN114492003B (en) | Gravity modeling method and device based on inverse distance weighting method and quadric surface method | |
Guolo | Measurement errors in control risk regression: A comparison of correction techniques | |
Ruiz et al. | Generalized Functional Mixed Models for Accelerated Degradation-Based Reliability Analysis | |
CN110097265A (en) | Acquisition methods, device and the storage medium of the ready degree of Project Technical |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||