CN117555892A - Post-processing method for an atmospheric pollutant multi-mode fusion accounting model
- Publication number
- CN117555892A (application CN202410032744.5A)
- Authority
- CN
- China
- Prior art keywords
- data
- accounting
- emission
- max
- threshold
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2413—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
- G06F18/24147—Distances to closest patterns, e.g. nearest neighbour classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/243—Classification techniques relating to the number of classes
- G06F18/2433—Single-class perspective, e.g. one-against-all classification; Novelty detection; Outlier detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/27—Regression, e.g. linear or logistic regression
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/70—Machine learning, data mining or chemometrics
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16C—COMPUTATIONAL CHEMISTRY; CHEMOINFORMATICS; COMPUTATIONAL MATERIALS SCIENCE
- G16C20/00—Chemoinformatics, i.e. ICT specially adapted for the handling of physicochemical or structural data of chemical particles, elements, compounds or mixtures
- G16C20/90—Programming languages; Computing architectures; Database systems; Data warehousing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2123/00—Data types
- G06F2123/02—Data types in the time domain, e.g. time-series data
Abstract
The invention provides a post-processing method for an atmospheric pollutant multi-mode fusion accounting model. The post-processing method comprises the following steps: S1, acquiring accounting result data for an accounting interval of the atmospheric pollutant multi-mode fusion accounting model, the accounting result data comprising the emission time, the corresponding emission amount, and the accounting relative error; S2, identifying and screening out abnormal data in the accounting result data to obtain an actually retained emission data set I_g; S3, performing interpolation filling based on the actually retained emission data set I_g by a K-nearest-neighbor algorithm or linear regression to obtain filled complete emission data. By applying several kinds of outlier identification to the accounting result data, the invention effectively removes outliers; by filling the missing data through interpolation and extrapolation, it recovers the missing data relatively accurately and improves the accuracy and stability of the accounting result.
Description
Technical Field
The invention relates to the technical field of environmental monitoring analysis, in particular to a multi-mode fusion accounting model post-processing method for atmospheric pollutants.
Background
Industrial parks concentrate industrial production. They play a vital role in local economic development and are also major sources of pollutant emissions.
In the construction of ecological civilization, controlling the intensity and total amount of atmospheric pollutant emissions from industrial parks is key to steadily improving local air quality. Accurate accounting of a park's pollutant emissions is the basis for setting reasonable emission targets and drawing up scientifically sound emission-reduction plans for the park.
At present, although most organized emission outlets of key enterprises in industrial parks are already equipped with online monitoring instruments, emission forms such as unorganized (fugitive) emissions are difficult to capture by online monitoring. The monitoring process therefore suffers from problems such as missing data and instrument error.
Although the multi-mode fusion accounting model developed for atmospheric pollutants can account for a park's actual emissions online, it is difficult to avoid invalid accounting results or excessive errors caused by factors such as monitoring instrument error, missing data, and error in the accounting model itself. This directly affects the emission targets and emission-reduction plans established for the park.
In view of this, there is an urgent need to design and develop a post-processing method for the atmospheric pollutant multi-mode fusion accounting model that identifies outliers and then interpolates and extrapolates the data.
Disclosure of Invention
In view of the above, it is necessary to provide a post-processing method for the atmospheric pollutant multi-mode fusion accounting model, to address the problem that the accounting results of existing atmospheric pollutant multi-mode fusion accounting models are invalid or carry excessive error owing to factors such as monitoring instrument error, missing data, and accounting model error.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
A post-processing method for an atmospheric pollutant multi-mode fusion accounting model comprises the following steps:
S1, acquiring accounting result data for an accounting interval of the atmospheric pollutant multi-mode fusion accounting model; the accounting result data comprise the emission time, the corresponding emission amount, and the accounting relative error;
S2, identifying and screening out abnormal data in the accounting result data to obtain the actually retained emission data set I_g; the specific steps of screening out the abnormal data are as follows:
S21, selecting the data in the accounting result data whose accounting relative error is smaller than a preset error value c, to obtain preliminary screening data;
S22, calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the Tukey test method and/or the 3σ principle iteration method, and selecting from the preliminary screening data the data whose emission amount lies in the interval between the two, to obtain secondary screening data;
S23, following the emission time order of the secondary screening data, calculating the neighbor difference sequence d_i of adjacent emission amounts by the maximum neighbor difference method, calculating the maximum neighbor difference threshold d_max by the Tukey test method and the 3σ principle iteration method, and then obtaining the actual threshold d'_max as the smaller of d_max and y_allow: d'_max = min(d_max, y_allow);
S24, calculating the actually retained emission data set I_g:
[formula not reproduced in the source]; wherein i denotes the data index, y_i denotes the sequence of absolute emission values in the accounting result data, and relerr_i denotes the sequence of accounting relative errors in the accounting result data;
S3, performing interpolation filling based on the actually retained emission data set I_g by the K-nearest-neighbor algorithm or linear regression to obtain the filled complete emission data.
Further, the specific steps of calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the Tukey test method are as follows:
S201, calculating the maximum value threshold y_max: [formula not reproduced in the source]; wherein Q_3 is the third quartile of the emission amount in the preliminary screening data, and Q_1 is the first quartile of the emission amount in the preliminary screening data;
S202, performing one iteration on the maximum value threshold y_max to obtain the abnormally high value threshold new_thres after one iteration: [formula not reproduced in the source]; wherein new_Q_3 is the third quartile of the emission amount in the preliminary screening data that is less than or equal to the maximum value threshold y_max, and new_Q_1 is the first quartile of the emission amount in the preliminary screening data that is less than or equal to the maximum value threshold y_max;
S203, repeating step S202 until the mean tolerance of the emission amount (the change in the mean emission between successive iterations) is less than or equal to 0.001, or the number of emission data below the maximum value threshold y_max is less than or equal to 30% of the total number of data; the abnormally high value threshold after iteration is the total allowable value y_allow.
Further, when the Tukey test method does not converge, the 3σ principle iteration method is used instead; the specific steps of calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the 3σ principle iteration method are as follows:
S211, calculating the maximum value threshold y_max: y_max = mean(y) + 3·Std(y); wherein mean(y) is the mean of the emission amount and Std(y) is the standard deviation of the emission amount;
S212, performing one iteration on the maximum value threshold y_max to obtain the abnormally high value threshold new_thres after one iteration: new_thres = mean(y') + 3·Std(y'); wherein y' denotes the emission amounts less than or equal to the maximum value threshold y_max, mean(y') is their mean, and Std(y') is their standard deviation;
S213, repeating step S212 until the mean tolerance of the emission amount (the change in the mean emission between successive iterations) is less than or equal to 0.001, or the number of emission data below the maximum value threshold y_max is less than or equal to 30% of the total number of data; the abnormally high value threshold after iteration is the total allowable value y_allow.
Further, before interpolation filling is performed on the complement of the actually retained emission data set I_g, the number of elements |I_g| of the actually retained emission data set I_g is compared with a preset value b, and the following decision is made according to the comparison result:
(1) If |I_g| is sufficient relative to b [inequality not reproduced in the source], the K-nearest-neighbor algorithm is used to interpolate the complement of the actually retained emission data set I_g;
(2) Otherwise [inequality not reproduced in the source], linear regression is used to interpolate the complement of the actually retained emission data set I_g.
Further, before abnormal data are identified and screened out of the accounting result data, it is judged whether the number N of emission data in the accounting interval satisfies N > a; if so, abnormal data identification and screening are carried out; where a is a preset constant.
Further, the emission amount in the accounting result data is the pollutant emission per hour; the estimated total annual pollutant emission annual_discharge is calculated from the filled complete emission data: [formula not reproduced in the source].
Further, the atmospheric pollutants include NO₂, PM₂.₅, VOCs, and SO₂.
Further, when the atmospheric pollutant NO₂, PM₂.₅, or VOCs is taken as the accounting factor, the total allowable value y_allow is calculated by combining the Tukey test method with the 3σ principle iteration method;
when the atmospheric pollutant SO₂ is taken as the accounting factor, the total allowable value y_allow is calculated by the 3σ principle iteration method alone.
Compared with the prior art, the invention has the beneficial effects that:
1. By applying several kinds of outlier identification to the accounting result data of the atmospheric pollutant multi-mode fusion accounting model, the invention effectively removes outliers; by filling the missing data through interpolation and extrapolation, it recovers the missing data relatively accurately and improves the accuracy and stability of the accounting result;
2. For different atmospheric pollutants, that is, different accounting factors, the invention adopts the outlier identification mode with the higher accuracy for that factor, so that outliers are removed more effectively and the stability and accuracy of the accounting result are further improved.
Drawings
The disclosure of the present invention is described with reference to the accompanying drawings. It is to be understood that the drawings are designed solely for the purposes of illustration and not as a definition of the limits of the invention. Wherein:
FIG. 1 is a flow chart of the atmospheric pollutant multi-mode fusion accounting model post-processing method introduced in Example 1 of the present invention;
FIG. 2 is a flow chart of the abnormal data identification, screening, and filling process based on FIG. 1;
FIG. 3 is a schematic diagram of the results of actual post-processed data based on FIG. 1.
Detailed Description
It is to be understood that, according to the technical solution of the present invention, those skilled in the art may propose various alternative structural modes and implementation modes without changing the true spirit of the present invention. Accordingly, the following detailed description and drawings are merely illustrative of the invention and are not intended to be exhaustive or to limit the invention to the precise form disclosed.
Example 1
Referring to FIG. 1, this embodiment describes a post-processing method for an atmospheric pollutant multi-mode fusion accounting model. The method screens abnormal data out of the accounting result data of the atmospheric pollutant multi-mode fusion accounting model so that pollutant emissions can be estimated more accurately.
The method for post-processing the atmospheric pollutant multi-mode fusion accounting model comprises the following specific steps:
S1, acquiring accounting result data for an accounting interval of the atmospheric pollutant multi-mode fusion accounting model; the accounting result data comprise the emission time, the corresponding emission amount, and the accounting relative error.
The accounting result data include not only the emission time (hours), the emission amount (tons), and the accounting relative error (%), but also the number of data N in the accounting interval and the accounting start time x_0. The accounting relative error is the relative error of the emission accounting at that time.
Before abnormal data are identified and screened out, it must first be confirmed whether the number of data N in the accounting interval is sufficient, that is, whether N > a, where a is an adjustable preset constant. This condition determines whether post-processing is performed. If N satisfies the condition, abnormal data identification and screening of the accounting result data begins.
S2, identifying and screening out abnormal data in the accounting result data to obtain the actually retained emission data set I_g. Referring to FIG. 2, the specific steps for screening out abnormal data are as follows:
S21, selecting the data in the accounting result data whose accounting relative error is smaller than a preset error value c, to obtain preliminary screening data:
x_c = {x_i | relerr_i < c};
y_c = {y_i | relerr_i < c};
where x_i denotes the emission time sequence in the accounting result data, y_i denotes the sequence of absolute emission values in the accounting result data, relerr_i denotes the sequence of accounting relative errors in the accounting result data, and x_c and y_c are the corresponding subsets that make up the preliminary screening data. c is an adjustable preset error value, typically 50%. This step removes data with relatively high accounting error.
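Step S21 can be sketched in a few lines of Python. This is a minimal illustration rather than the patent's implementation: the function name `prefilter_by_relative_error` is hypothetical, and the relative error is assumed to be stored as a fraction, so that c = 0.5 corresponds to the typical 50%.

```python
from typing import List, Tuple


def prefilter_by_relative_error(
    x: List[float], y: List[float], relerr: List[float], c: float = 0.5
) -> Tuple[List[float], List[float]]:
    """Keep the (x_i, y_i) pairs whose accounting relative error is below c.

    Assumption: relerr is a fraction (0.5 means 50%), not a percentage.
    """
    kept = [(xi, yi) for xi, yi, ri in zip(x, y, relerr) if ri < c]
    x_c = [xi for xi, _ in kept]
    y_c = [yi for _, yi in kept]
    return x_c, y_c


# Hypothetical hourly data: the third point has an 80% relative error.
x = [0, 1, 2, 3]
y = [1.2, 1.3, 9.0, 1.1]
relerr = [0.10, 0.20, 0.80, 0.49]
x_c, y_c = prefilter_by_relative_error(x, y, relerr)
print(x_c, y_c)
```

The strict comparison `ri < c` follows the wording "smaller than a preset error value c".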
S22, calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the Tukey test method and/or the 3σ principle iteration method, and selecting from the preliminary screening data the data whose emission amount lies in the interval between the two, to obtain secondary screening data.
The specific steps of calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the Tukey test method are as follows:
calculating the maximum value threshold y_max: [formula not reproduced in the source]; where Q_3 is the third quartile of the emission amount in the preliminary screening data and Q_1 is the first quartile of the emission amount in the preliminary screening data. Applying the Tukey test to all data that remain after the high-accounting-error data have been removed yields the maximum value up to which valid data may be retained, that is, the abnormally high value threshold, which bounds the upper range of valid emission amounts.
To obtain the total allowable value y_allow, one iteration is performed with the following formula:
new_thres = [formula not reproduced in the source] ...(1)
In formula (1), new_thres is the abnormally high value threshold after one iteration, new_Q_3 is the third quartile of the emission amount in the preliminary screening data that is less than or equal to the maximum value threshold y_max, and new_Q_1 is the first quartile of the emission amount in the preliminary screening data that is less than or equal to the maximum value threshold y_max.
Formula (1) is iterated until the mean tolerance of the emission amount is less than or equal to 0.001, or the number of emission data below the threshold is less than or equal to 30% of the total number of data; the abnormally high value threshold obtained by the iteration is the total allowable value y_allow.
Each iteration updates the abnormally high value threshold and screens out some data. The mean of the remaining emission data is then calculated, and its difference from the mean after the previous iteration is the mean tolerance of the emission amount. If this tolerance is nearly 0, the mean is essentially unchanged by the iteration, the procedure is considered to have converged, and the iteration stops.
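The iteration described above can be sketched as follows. The source does not reproduce the threshold formula, so this sketch assumes the conventional Tukey upper fence Q_3 + k·(Q_3 − Q_1) with k = 1.5; the multiplier, the quartile interpolation method, and the function names `tukey_upper_fence` and `tukey_iteration` are assumptions, not taken from the patent.

```python
import statistics
from typing import List, Tuple


def tukey_upper_fence(values: List[float], k: float = 1.5) -> float:
    """Upper Tukey fence Q3 + k*(Q3 - Q1); k = 1.5 is an assumed default."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q3 + k * (q3 - q1)


def tukey_iteration(
    y: List[float], tol: float = 0.001, min_fraction: float = 0.30,
    max_iter: int = 100,
) -> Tuple[float, float]:
    """Return (y_max, y_allow) for the emission amounts y."""
    y_max = tukey_upper_fence(y)
    thres = y_max
    kept = [v for v in y if v <= thres]
    for _ in range(max_iter):
        # Stop when too few data remain below the current threshold.
        if len(kept) < 2 or len(kept) <= min_fraction * len(y):
            break
        new_thres = tukey_upper_fence(kept)
        new_kept = [v for v in kept if v <= new_thres]
        # Mean tolerance: change of the mean emission across one iteration.
        mean_shift = abs(statistics.mean(new_kept) - statistics.mean(kept))
        thres, kept = new_thres, new_kept
        if mean_shift <= tol:
            break
    return y_max, thres


# Hypothetical emission amounts with two abnormally high values.
y = [1.0, 1.1, 0.9, 1.2, 1.0, 1.05, 0.95, 1.1, 5.0, 20.0]
y_max, y_allow = tukey_iteration(y)
print(y_max, y_allow)
```

The `max_iter` cap is a safeguard added for the sketch; the text notes that the Tukey iteration occasionally fails to converge, in which case the 3σ iteration is used instead.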
In a few cases the Tukey test iteration does not converge, and the 3σ principle is used instead, as follows: y_max = mean(y) + 3·Std(y),
where mean(y) is the mean of the emission amount and Std(y) is the standard deviation of the emission amount.
To obtain the total allowable value y_allow, one iteration is performed with the following formula:
new_thres = mean(y') + 3·Std(y') ...(2)
In formula (2), new_thres is the abnormally high value threshold after one iteration, y' denotes the emission amounts less than or equal to the maximum value threshold y_max, mean(y') is their mean, and Std(y') is their standard deviation.
Formula (2) is iterated until the mean tolerance of the emission amount is less than or equal to 0.001, or the number of emission data below the threshold is less than or equal to 30% of the total number of data; the abnormally high value threshold obtained by the iteration is the total allowable value y_allow.
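A minimal sketch of the 3σ iteration, under stated assumptions: the threshold is taken as mean plus three standard deviations, the population standard deviation is used (the source does not say which estimator applies), and the function names are hypothetical.

```python
import statistics
from typing import List, Tuple


def three_sigma_threshold(values: List[float]) -> float:
    """Mean plus three standard deviations (population std, an assumption)."""
    return statistics.mean(values) + 3 * statistics.pstdev(values)


def three_sigma_iteration(
    y: List[float], tol: float = 0.001, min_fraction: float = 0.30,
    max_iter: int = 100,
) -> Tuple[float, float]:
    """Return (y_max, y_allow) for the emission amounts y."""
    y_max = three_sigma_threshold(y)
    thres = y_max
    kept = [v for v in y if v <= thres]
    for _ in range(max_iter):
        # Stop when too few data remain below the current threshold.
        if len(kept) < 2 or len(kept) <= min_fraction * len(y):
            break
        new_thres = three_sigma_threshold(kept)
        new_kept = [v for v in kept if v <= new_thres]
        # Mean tolerance: change of the mean emission across one iteration.
        mean_shift = abs(statistics.mean(new_kept) - statistics.mean(kept))
        thres, kept = new_thres, new_kept
        if mean_shift <= tol:
            break
    return y_max, thres


# Hypothetical data: twenty normal hours and one abnormal spike.
y = [1.0] * 20 + [50.0]
y_max, y_allow = three_sigma_iteration(y)
print(y_max, y_allow)
```

In this example the spike lies above y_max and is excluded in the first pass; the remaining values are constant, so the iteration converges immediately.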
This embodiment is directed to the accounting of atmospheric pollutant emissions. The atmospheric pollutants include NO₂, PM₂.₅, VOCs, SO₂, and the like. The accounting factor in this embodiment is NO₂, for which the Tukey test iteration and the 3σ principle iteration are used in combination. When the accounting factor is NO₂, PM₂.₅, or VOCs, the total allowable value is calculated by combining the Tukey test iteration with the 3σ principle iteration; when the accounting factor is SO₂, the total allowable value is calculated by the 3σ principle iteration alone.
S23, following the emission time order of the secondary screening data, calculating the neighbor difference sequence d_i of adjacent emission amounts by the maximum neighbor difference method, calculating the maximum neighbor difference threshold d_max by the Tukey test method and the 3σ principle iteration method, and then obtaining the actual threshold d'_max.
In the interval below the abnormally high value threshold and above the total allowable value, the maximum neighbor difference method is used to retain emission data that rise gradually and plausibly.
Under the assumption that the emission amount varies stably over time, the time-sequence difference (that is, the first derivative) of the emission amount must also be screened for maxima. The difference between adjacent emission data is first calculated by the maximum neighbor difference method: [formula not reproduced in the source].
The iterative method described above is then applied to the d_i sequence to obtain the converged maximum neighbor difference threshold d_max. Because the range over which the absolute data values fluctuate may differ between accounting objects, the final threshold must take into account both the absolute value sequence y_i and the neighbor difference sequence d_i. The actual threshold d'_max is therefore obtained by comparing d_max and y_allow and taking the smaller value:
d'_max = min(d_max, y_allow).
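The neighbor-difference step can be illustrated as below. The source does not reproduce the definition of d_i, so this sketch assumes d_i = |y_i − y_(i−1)| with d_0 = 0; that definition and both function names are assumptions. The threshold d_max would come from applying the iterative thresholding to the d_i sequence and is passed in here as a plain number.

```python
from typing import List


def neighbor_differences(y: List[float]) -> List[float]:
    """Assumed definition: d_i = |y_i - y_(i-1)|, with d_0 = 0."""
    return [0.0] + [abs(b - a) for a, b in zip(y, y[1:])]


def actual_neighbor_threshold(d_max: float, y_allow: float) -> float:
    """d'_max is the smaller of d_max and y_allow, as stated in the text."""
    return min(d_max, y_allow)


# Hypothetical time-ordered emission amounts.
d = neighbor_differences([1.0, 1.5, 1.0, 4.0])
d_actual = actual_neighbor_threshold(d_max=0.8, y_allow=1.3)
print(d, d_actual)
```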
S24, calculating the actually retained emission data set I_g:
[formula not reproduced in the source]; where i denotes the data index.
Thus, emission data in the preliminary screening data that are less than or equal to the total allowable value y_allow are retained in full, while emission data between the total allowable value y_allow and the maximum value threshold y_max are retained only if they satisfy the maximum neighbor difference requirement.
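The retention rule stated above can be sketched as an index filter. The exact set formula is not reproduced in the source, so the choice of strict versus non-strict inequalities here is an assumption, as is the function name `retained_indices`.

```python
from typing import List


def retained_indices(
    y: List[float], relerr: List[float], d: List[float],
    c: float, y_allow: float, y_max: float, d_actual: float,
) -> List[int]:
    """Indices i kept in I_g (inequality strictness is an assumption)."""
    kept = []
    for i, (yi, ri, di) in enumerate(zip(y, relerr, d)):
        if ri >= c:
            continue  # removed in S21: accounting relative error too high
        if yi <= y_allow:
            kept.append(i)  # at or below the total allowable value: keep
        elif yi <= y_max and di <= d_actual:
            kept.append(i)  # between thresholds with a small neighbor jump
    return kept


# Hypothetical inputs; d holds the neighbor differences for each point.
y = [1.0, 1.4, 1.4, 3.0, 1.1]
relerr = [0.1, 0.1, 0.1, 0.1, 0.9]
d = [0.0, 0.4, 0.0, 1.6, 1.9]
I_g = retained_indices(y, relerr, d, c=0.5, y_allow=1.2, y_max=1.5,
                       d_actual=0.3)
print(I_g)
```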
S3, performing interpolation filling based on the actually retained emission data set I_g by the K-nearest-neighbor algorithm or linear regression to obtain the filled complete emission data.
Before the K-nearest-neighbor algorithm is used to interpolate and extrapolate from the actually retained emission data set I_g, it must be confirmed that the amount of data meets the requirements of the algorithm, that is, whether the number of elements in I_g meets a preset requirement. The condition is as follows:
[condition not reproduced in the source]; where b is a preset value, an adjustable constant generally taken as 10, which determines whether the K-nearest-neighbor algorithm is used, and |·| denotes the number of elements in a set. If the condition is not met, linear regression is used to fill in the missing data.
For the actually retained emission data set I_g, the corresponding emission times have been converted to an index I starting from 0, which makes it convenient for the regression algorithm to interpolate and extrapolate. The missing data in the time sequence are simulated by the K-nearest-neighbor algorithm or linear regression to fill the gaps left by the screened-out data.
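The method choice and the linear-regression fallback can be sketched as follows. The inequality in the decision rule is not reproduced in the source, so `choose_fill_method` assumes that K nearest neighbors is used when |I_g| ≥ b; all function names are hypothetical, and the line is fitted by ordinary least squares on the 0-based time index.

```python
from typing import List, Tuple


def choose_fill_method(n_kept: int, b: int = 10) -> str:
    """Assumed rule: KNN when |I_g| >= b, otherwise linear regression."""
    return "knn" if n_kept >= b else "linear"


def fit_line(idx: List[int], vals: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares slope and intercept."""
    n = len(idx)
    mx, my = sum(idx) / n, sum(vals) / n
    sxx = sum((i - mx) ** 2 for i in idx)
    sxy = sum((i - mx) * (v - my) for i, v in zip(idx, vals))
    slope = sxy / sxx if sxx else 0.0
    return slope, my - slope * mx


def fill_linear(n_total: int, kept_idx: List[int],
                kept_vals: List[float]) -> List[float]:
    """Keep retained values; fill every other index from the fitted line."""
    slope, intercept = fit_line(kept_idx, kept_vals)
    known = dict(zip(kept_idx, kept_vals))
    return [known.get(i, slope * i + intercept) for i in range(n_total)]


# Hypothetical case: indices 1 and 3 were screened out.
filled = fill_linear(5, [0, 2, 4], [1.0, 2.0, 3.0])
print(choose_fill_method(3), filled)
```

Because the fit covers the whole index range, the same call also extrapolates beyond the first and last retained points.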
The K-nearest-neighbor (KNN) algorithm performs regression by measuring the distance between feature values. The weight of each neighboring point is inversely proportional to its distance, so the closer the point, the larger its weight. For a value to be regressed:
[formula not reproduced in the source]
where the elements used are those among the K nearest neighbors in the set, and the weight factor is:
[formula not reproduced in the source]
I_j denotes the sequence index of the j-th neighbor of the emission value to be regressed, and I_i denotes the time-sequence index of the value to be regressed.
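A minimal sketch of inverse-distance-weighted K-nearest-neighbor filling on the time index. The weight 1/|I_j − I_i| and the normalization by the sum of weights are assumptions consistent with "inversely proportional to the distance"; the value of K and the function name `knn_fill` are hypothetical.

```python
from typing import List


def knn_fill(n_total: int, kept_idx: List[int], kept_vals: List[float],
             k: int = 3) -> List[float]:
    """Fill missing indices with a distance-weighted mean of k neighbors."""
    known = dict(zip(kept_idx, kept_vals))
    filled = []
    for i in range(n_total):
        if i in known:
            filled.append(known[i])  # retained value is kept as is
            continue
        # The k retained indices closest to i in time order.
        nearest = sorted(kept_idx, key=lambda j: abs(j - i))[:k]
        # Assumed weight: inverse of the index distance (never zero here,
        # because i is not a retained index).
        weights = [1.0 / abs(j - i) for j in nearest]
        total = sum(w * known[j] for w, j in zip(weights, nearest))
        filled.append(total / sum(weights))
    return filled


# Hypothetical case: index 2 was screened out and is re-estimated.
filled = knn_fill(4, [0, 1, 3], [1.0, 2.0, 4.0], k=2)
print(filled)
```

The neighbor search here is a full sort per missing point, which is adequate for hourly series of modest length.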
The estimated total annual pollutant emission annual_discharge is calculated from the filled complete emission data: [formula not reproduced in the source].
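Since the source formula is not reproduced, the following is only one plausible reading: the mean hourly emission over the accounting interval is scaled to a 365-day year. The scaling rule, the constant `HOURS_PER_YEAR`, and the function name are all assumptions.

```python
from typing import List

HOURS_PER_YEAR = 24 * 365  # assumed: non-leap year of hourly emissions


def estimate_annual_discharge(hourly: List[float]) -> float:
    """Assumed rule: mean hourly emission scaled to a full year."""
    return sum(hourly) / len(hourly) * HOURS_PER_YEAR


# Hypothetical filled series: 0.5 tons per hour throughout the interval.
annual_discharge = estimate_annual_discharge([0.5, 0.5, 0.5, 0.5])
print(annual_discharge)
```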
In this embodiment, the accounting result data of the atmospheric pollutant multi-mode fusion accounting model are acquired, abnormal data in them are identified and removed, the retained valid data are interpolated and extrapolated to recover the missing data in the time sequence, and the total pollutant emission of the park area over the accounting interval and the estimated annual pollutant emission are calculated from the filled complete data.
FIG. 3 is a schematic diagram of the results of actual post-processed data, in which the abscissa represents the emission time and the ordinate represents the emission amount. In the figure, the light-gray dotted line is the accounting result of the atmospheric pollutant multi-mode fusion accounting model, and the dark-gray dotted line is the actually retained emission data after outlier identification and screening. The upper dark horizontal line is the maximum value threshold of the emission amount, and the lower light horizontal line is the total allowable value.
Therefore, by applying several kinds of outlier identification to the accounting result of the atmospheric pollutant multi-mode fusion accounting model, this embodiment effectively removes outliers, and by filling the missing data through interpolation and extrapolation it recovers the missing data relatively accurately, improving the accuracy and stability of the accounting result. In addition, each accounting factor uses the outlier identification mode with the higher accuracy for that factor, which removes outliers more effectively and further improves the stability and accuracy of the accounting result.
Example 2
This embodiment describes a computer terminal comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when the processor executes the program, the steps of the above atmospheric pollutant multi-mode fusion accounting model post-processing method are implemented. In application, the method may be deployed as software, for example as a standalone program installed on a computer terminal, which may be a computer, a smartphone, or the like. It may also be designed as an embedded program installed on a computer terminal such as a single-chip microcomputer.
Example 3
This embodiment describes a computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above atmospheric pollutant multimode fusion accounting model post-processing method. In application, the method may be implemented as software, for example as a standalone program stored on the computer-readable storage medium. The storage medium may be a USB flash drive designed as a USB key (U-shield), through which the program starts the whole method upon an external trigger.
The technical scope of the present invention is not limited to the above description, and those skilled in the art may make various changes and modifications to the above-described embodiments without departing from the technical spirit of the present invention, and these changes and modifications should be included in the scope of the present invention.
Claims (10)
1. A post-processing method for an atmospheric pollutant multimode fusion accounting model, used for processing abnormal accounting data of the atmospheric pollutant multimode fusion accounting model, characterized by comprising the following steps:
S1, acquiring accounting result data of the atmospheric pollutant multimode fusion accounting model for an accounting interval; the accounting result data comprise the emission time, the corresponding emission amount, and the accounting relative error;
S2, performing abnormal data identification and screening on the accounting result data to obtain the actual reserved emission data set I_g; the specific steps of screening the abnormal data are as follows:
S21, selecting from the accounting result data the data whose accounting relative error is smaller than a preset error value c to obtain preliminary screening data;
S22, calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the graph-based test method and/or the 3σ principle iteration method, and selecting from the preliminary screening data the data whose emission amount lies in the interval between the two to obtain secondary screening data;
S23, calculating, by the maximum neighbor difference method and following the emission time order in the secondary screening data, the neighbor difference sequence d_i of adjacent emission amounts; calculating the maximum neighbor difference threshold d_max by the graph-based test method and the 3σ principle iteration method; and further obtaining the actual threshold d'_max: [formula not reproduced];
S24, calculating the actual reserved emission data set I_g: [formula not reproduced]; where i is the data index, y_i is the sequence of absolute emission values in the accounting result data, and relerr_i is the sequence of accounting relative errors in the accounting result data;
S3, performing data interpolation filling on the actual reserved emission data set I_g by the K-nearest-neighbor algorithm or linear regression to obtain the filled complete emission data.
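The relative-error filter of step S21 and the neighbor difference sequence of step S23 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, and the form d_i = |y_(i+1) - y_i| for the neighbor difference is an assumption, since the claim does not spell it out.

```python
import numpy as np

def prefilter_by_relative_error(times, emissions, rel_err, c):
    """S21: keep records whose accounting relative error is below c."""
    times, emissions, rel_err = map(np.asarray, (times, emissions, rel_err))
    keep = rel_err < c
    return times[keep], emissions[keep]

def neighbor_differences(times, emissions):
    """S23 (assumed form): absolute differences between time-adjacent emissions."""
    order = np.argsort(times)                      # sort by emission time
    y = np.asarray(emissions, dtype=float)[order]
    return np.abs(np.diff(y))

# Example: the second record has a large relative error and is dropped.
t, y = prefilter_by_relative_error(
    times=[0, 1, 2, 3],
    emissions=[1.0, 5.0, 2.0, 2.5],
    rel_err=[0.1, 0.9, 0.2, 0.1],
    c=0.5,
)
d = neighbor_differences(t, y)
```

The differences in `d` would then be compared with a threshold derived as in steps S22 and S23; that comparison is left out here because the formula for d'_max is not available in the text.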
2. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 1, characterized in that the specific steps of calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the graph-based test method are as follows:
S201, calculating the maximum value threshold y_max: [formula not reproduced]; where Q_3 is the upper quartile (three-quarter quantile) of the emission amounts in the preliminary screening data, and Q_1 is the lower quartile (one-quarter quantile) of the emission amounts in the preliminary screening data;
S202, performing one iteration on the maximum value threshold y_max to obtain the abnormal high value threshold new_thres after one iteration: [formula not reproduced]; where new_Q_3 is the upper quartile of the emission amounts in the preliminary screening data that are less than or equal to the maximum value threshold y_max, and new_Q_1 is the lower quartile of the emission amounts in the preliminary screening data that are less than or equal to the maximum value threshold y_max;
S203, repeating step S202 until the average allowance of the emission amount is less than or equal to 0.001, or the number of emission data points below the maximum value threshold y_max is less than or equal to 30% of the total number of data points; the abnormal high value threshold after iteration is the total allowable value y_allow.
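A hedged sketch of the iteration in steps S201 to S203 is given below. The threshold formulas themselves are not available in the text, so the sketch assumes the conventional quartile fence Q_3 + k(Q_3 - Q_1) with k = 1.5 (as in Tukey's box-plot rule), and it reads the "average allowance" criterion as the change in the mean of the retained data between two iterations. Both readings, the function name, and the parameter names are assumptions.

```python
import numpy as np

def iterative_quartile_threshold(y, k=1.5, tol=1e-3, min_fraction=0.30, max_iter=100):
    """Return (y_max, y_allow) from an iterated quartile fence (assumed form)."""
    y = np.asarray(y, dtype=float)

    def fence(values):
        q1, q3 = np.percentile(values, [25, 75])
        return q3 + k * (q3 - q1)

    y_max = fence(y)                    # S201: initial maximum value threshold
    thres = y_max
    for _ in range(max_iter):
        kept = y[y <= thres]
        # Stop when too few points remain below the current threshold.
        if kept.size == 0 or kept.size <= min_fraction * y.size:
            break
        new_thres = fence(kept)         # S202: one iteration on the retained data
        new_kept = y[y <= new_thres]
        if new_kept.size == 0:
            break
        converged = abs(kept.mean() - new_kept.mean()) <= tol
        thres = new_thres
        if converged:                   # S203: mean of retained data has stabilized
            break
    return y_max, thres

y_max, y_allow = iterative_quartile_threshold([1, 2, 3, 4, 5, 6, 7, 8, 9, 100])
```

With this input the single outlier 100 lies above both thresholds, and the iterated threshold is lower than the initial one, which matches the ordering of the two horizontal lines described for Fig. 3.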
3. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 2, characterized in that when the graph-based test method does not converge, the 3σ principle iteration method is used instead; the specific steps of calculating the maximum value threshold y_max and the total allowable value y_allow of the emission amount in the preliminary screening data by the 3σ principle iteration method are as follows:
S211, calculating the maximum value threshold y_max: [formula not reproduced]; where the two quantities involved are the mean of the emission amounts and Std(y), the standard deviation of the emission amounts;
S212, performing one iteration on the maximum value threshold y_max to obtain the abnormal high value threshold new_thres after one iteration: [formula not reproduced]; where the mean and the standard deviation are taken over the emission amounts less than or equal to the maximum value threshold y_max;
S213, repeating step S212 until the average allowance of the emission amount is less than or equal to 0.001, or the number of emission data points below the maximum value threshold y_max is less than or equal to 30% of the total number of data points; the abnormal high value threshold after iteration is the total allowable value y_allow.
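For orientation only, one plausible reading of the two 3σ thresholds in claim 3 is the conventional "mean plus three standard deviations" form. The overbar for the mean and the subscript marking the subset of emissions not exceeding the current threshold are notation introduced here for illustration and are not taken from the claim:

```latex
y_{\max} = \bar{y} + 3\,\mathrm{Std}(y), \qquad
\mathrm{new\_thres} = \bar{y}_{\le y_{\max}} + 3\,\mathrm{Std}\!\left(y_{\le y_{\max}}\right)
```

Under this reading, each iteration recomputes the mean and standard deviation on the data that passed the previous threshold, so the threshold can only tighten as high outliers are excluded.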
4. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 1, characterized in that, before data interpolation filling is performed on the complement of the actual reserved emission data set I_g, the number of elements of the actual reserved emission data set I_g is compared with a preset value b, and the following decision is made according to the comparison result:
(1) if [condition not reproduced], it is decided to perform data interpolation filling on the complement of the actual reserved emission data set I_g by the K-nearest-neighbor algorithm;
(2) if [condition not reproduced], it is decided to perform data interpolation filling on the complement of the actual reserved emission data set I_g by linear regression.
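The choice between the two filling methods can be illustrated with the sketch below. The direction of the comparison with b is not available in the text, so the sketch assumes that the K-nearest-neighbor fill is used when at least b points are retained and linear regression otherwise. The nearest-in-time averaging, the helper names, and the parameter k are likewise illustrative assumptions.

```python
import numpy as np

def knn_fill(t_known, y_known, t_missing, k=3):
    """Fill each missing time with the mean of the k retained values nearest in time."""
    t_known = np.asarray(t_known, dtype=float)
    y_known = np.asarray(y_known, dtype=float)
    filled = []
    for t in t_missing:
        nearest = np.argsort(np.abs(t_known - t), kind="stable")[:k]
        filled.append(y_known[nearest].mean())
    return np.array(filled)

def linear_fill(t_known, y_known, t_missing):
    """Fill missing times from a least-squares straight line through the retained data."""
    slope, intercept = np.polyfit(t_known, y_known, 1)
    return slope * np.asarray(t_missing, dtype=float) + intercept

def fill_missing(t_known, y_known, t_missing, b, k=3):
    # Assumed decision rule: enough retained points -> KNN, otherwise linear regression.
    if len(y_known) >= b:
        return knn_fill(t_known, y_known, t_missing, k=k)
    return linear_fill(t_known, y_known, t_missing)

t_known, y_known = [0, 1, 3, 4], [1.0, 2.0, 4.0, 9.0]
by_knn = fill_missing(t_known, y_known, [2], b=3, k=2)
by_line = fill_missing(t_known, y_known, [2], b=10)
```

Because the linear fit is not bounded by the observed time range, it can also extrapolate beyond the first or last retained point, whereas the nearest-neighbor average stays within the range of observed values.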
5. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 1, characterized in that, before abnormal data identification and screening is performed on the accounting result data, it is judged whether the number N of emission data points in the accounting interval satisfies N > a; if so, abnormal data identification and screening is performed; where a is a preset constant.
6. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 1, characterized in that the emission amount in the accounting result data is the pollutant emission per hour, and the estimated total annual pollutant emission annual_discharge is calculated from the filled complete emission data: [formula not reproduced].
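As an illustration of how an annual total might be estimated from hourly values, the sketch below scales the mean hourly emission of the filled series to a full year. The scaling by 8760 hours and the function name are assumptions; the formula of claim 6 is not available in the text and may differ.

```python
def estimate_annual_emission(hourly_emissions, hours_per_year=8760):
    """Scale the mean hourly emission of the filled series to one year (assumed form)."""
    if len(hourly_emissions) == 0:
        raise ValueError("the filled emission series is empty")
    mean_hourly = sum(hourly_emissions) / len(hourly_emissions)
    return mean_hourly * hours_per_year

annual_discharge = estimate_annual_emission([1.0, 2.0, 3.0])
```

Scaling the mean rather than the raw sum keeps the estimate independent of the length of the accounting interval.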
7. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 6, characterized in that the atmospheric pollutants include NO₂, PM₂.₅, VOCs, and SO₂.
8. The atmospheric pollutant multimode fusion accounting model post-processing method according to claim 7, characterized in that when the atmospheric pollutant NO₂, PM₂.₅, or VOCs is the accounting factor, the total allowable value y_allow is calculated by combining the graph-based test method with the 3σ principle iteration method;
when the atmospheric pollutant SO₂ is the accounting factor, the total allowable value y_allow is calculated by the 3σ principle iteration method alone.
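The per-pollutant choice of threshold method in claim 8 amounts to a small lookup, sketched here. The dictionary, the method labels, and the function name are hypothetical, and how the two methods are combined for NO₂, PM₂.₅, and VOCs is not specified beyond the fallback described in claim 3.

```python
# Hypothetical labels for the two threshold methods named in the claims.
THRESHOLD_METHODS = {
    "NO2": ("graph_based", "three_sigma"),
    "PM2.5": ("graph_based", "three_sigma"),
    "VOCs": ("graph_based", "three_sigma"),
    "SO2": ("three_sigma",),
}

def methods_for(pollutant):
    """Return the threshold methods to apply for one accounting factor."""
    try:
        return THRESHOLD_METHODS[pollutant]
    except KeyError:
        raise ValueError(f"unsupported accounting factor: {pollutant!r}")

so2_methods = methods_for("SO2")
no2_methods = methods_for("NO2")
```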
9. A computer terminal comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the steps of the atmospheric pollutant multimode fusion accounting model post-processing method according to any one of claims 1 to 8.
10. A computer-readable storage medium having a computer program stored thereon, characterized in that the program, when executed by a processor, implements the steps of the atmospheric pollutant multimode fusion accounting model post-processing method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410032744.5A CN117555892B (en) | 2024-01-10 | 2024-01-10 | Atmospheric pollutant multimode fusion accounting model post-treatment method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117555892A (en) | 2024-02-13 |
CN117555892B (en) | 2024-04-02 |
Family
ID=89818869
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410032744.5A Active CN117555892B (en) | 2024-01-10 | 2024-01-10 | Atmospheric pollutant multimode fusion accounting model post-treatment method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117555892B (en) |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111275547A (en) * | 2020-03-19 | 2020-06-12 | 重庆富民银行股份有限公司 | Wind control system and method based on isolated forest |
CN112785080A (en) * | 2021-02-03 | 2021-05-11 | 燕山大学 | Energy consumption optimization method of real-time dynamic cement grinding system based on cement industry |
CN113159448A (en) * | 2021-05-12 | 2021-07-23 | 烟台应辉智能科技有限公司 | Automatic analysis and discrimination method based on environmental protection big data |
CN113210824A (en) * | 2021-05-26 | 2021-08-06 | 上海大制科技有限公司 | Servo welding gun driving abnormity detection method and equipment |
CN114384015A (en) * | 2022-01-12 | 2022-04-22 | 中国环境科学研究院 | Water environment monitoring method based on multi-source remote sensing and machine learning |
CN114494017A (en) * | 2022-01-25 | 2022-05-13 | 北京至简墨奇科技有限公司 | Method, device, equipment and medium for adjusting DPI (deep packet inspection) image according to scale |
CN115080619A (en) * | 2022-06-24 | 2022-09-20 | 中国工商银行股份有限公司 | Data anomaly threshold determination method and device |
CN116522124A (en) * | 2023-05-31 | 2023-08-01 | 广东海洋大学 | Dissolved oxygen content prediction method and system based on influence of environmental factors |
CN116557230A (en) * | 2023-05-19 | 2023-08-08 | 华能陈巴尔虎旗风力发电有限公司 | Wind power plant unit power abnormality online assessment method and system |
CN117115637A (en) * | 2023-10-18 | 2023-11-24 | 深圳市天地互通科技有限公司 | Water quality monitoring and early warning method and system based on big data technology |
Non-Patent Citations (3)
Title |
---|
YVAN LE MARC ET AL.: "A stochastic approach for modelling the effects of temperature on the growth rate of Bacillus cereus sensu lato", 《INTERNATIONAL JOURNAL OF FOOD MICROBIOLOGY》, 2 July 2021 (2021-07-02), pages 1 - 11 * |
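Du Wei: "Research on Key Technologies of Data Quality Control for Intelligent Transportation", China Master's Theses Full-text Database, Engineering Science and Technology II, 15 April 2018 (2018-04-15), pages 034 - 876 * |
Wang Jingshun et al.: "Real-time inversion accounting method for total pollutant emissions of industrial parks based on atmospheric environmental monitoring data", Chinese Journal of Environmental Engineering, 30 November 2023 (2023-11-30), pages 3698 - 3705 * |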
Also Published As
Publication number | Publication date |
---|---|
CN117555892B (en) | 2024-04-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||