CN116821104B - Industrial Internet data processing method and system based on big data - Google Patents

Info

Publication number: CN116821104B
Authority: CN (China)
Prior art keywords: data, industrial internet, sequence, internet data, abnormal
Legal status: Active (an assumption, not a legal conclusion)
Application number: CN202210990142.1A
Other languages: Chinese (zh)
Other versions: CN116821104A (en)
Inventor: 张丹丹
Current Assignee: Guangdong Hongdaxin Electronic Technology Co ltd
Original Assignee: Zhong Guobiao
Application filed by: Zhong Guobiao
Priority to: CN202210990142.1A
Publications: CN116821104A (application), CN116821104B (grant)
Landscapes: Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides an industrial Internet data processing method and system based on big data. The method comprises: S1, acquiring industrial Internet data; S2, preprocessing the acquired industrial Internet data and storing the preprocessed data in a data warehouse; S3, performing data management on the industrial Internet data in the data warehouse; and S4, extracting the corresponding industrial Internet data from the data warehouse based on a data analysis model and analyzing it to obtain a data analysis result. The method performs unified acquisition, preprocessing, management, and analysis of the industrial Internet data generated in an enterprise's daily production, so that the data are collected and used effectively. It lays a foundation for further big data processing based on the acquired data and for building an industrial Internet information processing framework, and helps improve adaptability to modern industrial Internet construction.

Description

Industrial Internet data processing method and system based on big data
Technical Field
The invention relates to the technical field of industrial Internet data processing, in particular to an industrial Internet data processing method and system based on big data.
Background
Currently, with the development of industrial intelligence, massive amounts of industrial Internet data, such as monitoring data of industrial equipment and of the production environment, are generated in daily production. In the prior art, most enterprises only process and manage this massive industrial Internet monitoring data locally and in simple ways, so the value of the data is not used effectively or rationally, and the requirements of modern industrial Internet construction cannot be met.
Disclosure of Invention
In view of the above problems, the present invention aims to provide an industrial Internet data processing method and system based on big data.
The object of the invention is achieved by the following technical solution:
In a first aspect, the present invention provides an industrial Internet data processing method based on big data, including:
S1, acquiring industrial Internet data;
S2, preprocessing the acquired industrial Internet data, and storing the preprocessed industrial Internet data in a data warehouse;
S3, performing data management on the industrial Internet data in the data warehouse;
S4, based on a data analysis model, extracting the corresponding industrial Internet data from the data warehouse and performing data analysis to obtain a data analysis result.
In one embodiment, the industrial Internet data includes operation data of the industrial equipment and state data of the industrial equipment;
Step S1 includes:
S11, acquiring state data of the industrial equipment through a sensor arranged on the industrial equipment, wherein the state data includes temperature data, humidity data, and vibration signals;
S12, acquiring operation data of the industrial equipment through an intelligent terminal arranged on the industrial equipment, wherein the operation data includes operation data and equipment state data of the industrial equipment.
In one embodiment, the industrial Internet data includes warehouse material inventory data and equipment management data;
wherein step S1 includes:
S13, acquiring material inventory data from a warehouse material management system;
S14, acquiring the entered basic data of the industrial equipment from the equipment management system.
In one embodiment, step S2 includes:
S21, decrypting the acquired industrial Internet data to obtain decrypted industrial Internet data;
S22, standardizing the decrypted industrial Internet data to obtain standardized industrial Internet data;
S23, performing data cleaning on the standardized industrial Internet data to obtain cleaned industrial Internet data;
S24, performing data integration on the cleaned industrial Internet data, and storing it in the data warehouse.
In one embodiment, in step S23, performing data cleaning on the standardized industrial Internet data specifically includes:
S231, acquiring continuous industrial Internet data collected by different data acquisition sources for the same target, and forming one industrial Internet data sequence per source, wherein the target is an industrial device or a production site containing a plurality of industrial devices; the data acquisition sources include sensors of the industrial equipment and intelligent terminals on the industrial equipment; each data acquisition source corresponds to one industrial Internet data sequence;
S232, performing joint abnormal data detection on the industrial Internet data sequences to obtain an anomaly detection result for the sequences, and marking as abnormal each sequence in which an anomaly is detected;
S233, performing outlier processing on the industrial Internet data sequences marked as abnormal, cleaning them into data that meets the quality requirement.
In one embodiment, step S3 includes:
S31, performing quality inspection on the industrial Internet data in the data warehouse to obtain a quality inspection result, and isolating the industrial Internet data that fails the inspection according to the result;
S32, managing the metadata associated with the industrial Internet data in the data warehouse according to the set data specification;
S33, performing data lineage analysis on the industrial Internet data in the data warehouse to obtain a lineage analysis result and generate a lineage relationship graph.
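The lineage analysis of S33 can be pictured as a directed graph from source data sets to the data sets derived from them. The following is a minimal sketch under stated assumptions, not the patented implementation: the `(output, inputs)` record format, the data set names, and the functions `build_lineage` and `upstream` are all hypothetical.

```python
from collections import defaultdict


def build_lineage(transforms):
    """Map each derived data set to the data sets it was built from.

    `transforms` is a hypothetical list of (output, inputs) records.
    """
    parents = defaultdict(set)
    for output, inputs in transforms:
        parents[output].update(inputs)
    return dict(parents)


def upstream(parents, name):
    """Return every data set that `name` depends on, directly or indirectly."""
    seen = set()
    stack = [name]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


# Hypothetical example: a report built from cleaned sensor data and master data.
lineage = build_lineage([
    ("cleaned_temperature", ["raw_temperature"]),
    ("equipment_report", ["cleaned_temperature", "equipment_master"]),
])
ancestors = upstream(lineage, "equipment_report")
```

A graph of this kind makes it possible to trace a questionable analysis result back to the raw data it was derived from.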
In one embodiment, step S4 includes:
retrieving the corresponding data set from the data warehouse according to the set analysis task, and analyzing it based on the set data analysis model to obtain a data analysis result.
In one embodiment, step S4 further includes:
displaying the obtained data analysis result visually.
In a second aspect, the present invention provides an industrial Internet data processing system based on big data, comprising:
an acquisition module, used for acquiring industrial Internet data;
a preprocessing module, used for preprocessing the acquired industrial Internet data and storing the preprocessed industrial Internet data in the data warehouse;
a management module, used for performing data management on the industrial Internet data in the data warehouse;
an analysis module, used for extracting the corresponding industrial Internet data from the data warehouse based on the data analysis model and performing data analysis to obtain a data analysis result.
The beneficial effects of the invention are as follows: the method performs unified acquisition, preprocessing, management, and analysis of the industrial Internet data generated in an enterprise's daily production, so that the data are collected and used effectively; it lays a foundation for further big data processing based on the acquired data and for building an industrial Internet information processing framework, and helps improve adaptability to modern industrial Internet construction.
Drawings
The invention is further described with reference to the accompanying drawings. The embodiments in the drawings do not limit the invention in any way, and a person of ordinary skill in the art can derive other drawings from the following drawings without inventive effort.
FIG. 1 is a flowchart of an industrial Internet data processing method based on big data according to an embodiment of the invention;
FIG. 2 is a flowchart of step S2 of the industrial Internet data processing method based on big data according to an embodiment of the invention;
FIG. 3 is a block diagram of an industrial Internet data processing system based on big data according to an embodiment of the invention.
Detailed Description
The invention is further described in connection with the following application scenarios.
Referring to FIG. 1, an industrial Internet data processing method based on big data is shown, which includes:
S1, acquiring industrial Internet data;
S2, preprocessing the acquired industrial Internet data, and storing the preprocessed industrial Internet data in a data warehouse;
S3, performing data management on the industrial Internet data in the data warehouse;
S4, based on a data analysis model, extracting the corresponding industrial Internet data from the data warehouse and performing data analysis to obtain a data analysis result.
The above embodiment provides an industrial Internet data processing method based on big data that performs unified acquisition, preprocessing, management, and analysis of the industrial Internet data generated in an enterprise's daily production, so that the data are collected and used effectively. It lays a foundation for further big data processing based on the acquired data and for building an industrial Internet information processing framework, and helps improve adaptability to modern industrial Internet construction.
In one embodiment, the industrial Internet data includes operation data of the industrial equipment and state data of the industrial equipment;
Step S1 includes:
S11, acquiring state data of the industrial equipment through a sensor arranged on the industrial equipment, wherein the state data includes temperature data, humidity data, vibration signals, and the like;
S12, acquiring operation data of the industrial equipment through an intelligent terminal arranged on the industrial equipment, wherein the operation data includes operation data of the industrial equipment, equipment state data (such as current, voltage, and component monitoring data), and the like.
In one embodiment, the industrial Internet data includes warehouse material inventory data and equipment management data;
wherein step S1 includes:
S13, acquiring material inventory data from a warehouse material management system;
S14, acquiring the entered basic data of the industrial equipment from the equipment management system.
In one scenario, the operation data and state data of the industrial equipment in the production scene are collected during the enterprise's daily production, which helps build a production database of the industrial equipment for the enterprise; comprehensive collection of the production data lays a foundation for targeted analysis (such as production analysis, safety analysis, and production management) based on the collected industrial equipment data.
The data related to the industrial equipment may be collected in real time through a sensor or sensor group arranged on the equipment, or obtained in real time through an intelligent terminal directly connected to the equipment. Useful or required industrial Internet data may also be entered into the system by database transfer or manual entry.
In one embodiment, referring to FIG. 2, step S2, preprocessing the acquired industrial Internet data and storing the preprocessed industrial Internet data in a data warehouse, includes:
S21, decrypting the acquired industrial Internet data to obtain decrypted industrial Internet data;
S22, standardizing the decrypted industrial Internet data to obtain standardized industrial Internet data;
S23, performing data cleaning on the standardized industrial Internet data to obtain cleaned industrial Internet data;
S24, performing data integration on the cleaned industrial Internet data, and storing it in the data warehouse.
In this embodiment, once the industrial Internet data has been acquired, it undergoes preliminary preprocessing such as decryption and standardization, followed by data cleaning. The data resources are thereby organized at a first level even when massive amounts of industrial Internet data are acquired, and the quality of the data is improved. The data is then stored uniformly in a data warehouse built with cloud and distributed technologies, which collects the data in one place and provides reliable data support for subsequent big data analysis and processing.
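The chain S21 to S24 can be sketched as four small functions applied in order. This is only an illustration under stated assumptions: base64 decoding stands in for real decryption (the source names no cipher), the JSON field names `src`, `ts`, and `val` are hypothetical, the cleaning step is reduced to dropping non-finite values, and the "warehouse" is a plain dictionary.

```python
import base64
import json
import math


def decrypt(payload: bytes) -> bytes:
    # S21 placeholder: base64 decoding stands in for a real cipher (assumption).
    return base64.b64decode(payload)


def standardize(raw: bytes) -> dict:
    # S22: map source-specific field names onto one common schema.
    rec = json.loads(raw)
    return {"source": str(rec["src"]), "ts": int(rec["ts"]), "value": float(rec["val"])}


def clean(records):
    # S23 (greatly simplified): drop records whose value is not a finite number.
    return [r for r in records if math.isfinite(r["value"])]


def integrate(records, warehouse):
    # S24: merge records into the warehouse, one time-ordered list per source.
    for r in records:
        warehouse.setdefault(r["source"], []).append((r["ts"], r["value"]))
    for rows in warehouse.values():
        rows.sort()
    return warehouse


def preprocess(payloads, warehouse):
    decoded = [standardize(decrypt(p)) for p in payloads]
    return integrate(clean(decoded), warehouse)


# Hypothetical payloads from one sensor; the third carries an invalid reading.
payloads = [
    base64.b64encode(json.dumps({"src": "sensor_a", "ts": 2, "val": 21.5}).encode()),
    base64.b64encode(json.dumps({"src": "sensor_a", "ts": 1, "val": 21.0}).encode()),
    base64.b64encode(json.dumps({"src": "sensor_a", "ts": 3, "val": float("nan")}).encode()),
]
warehouse = preprocess(payloads, {})
```

Keeping each stage as a separate function mirrors the step structure and lets the cleaning stage be replaced by the joint detection method without touching the others.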
In one scenario, a targeted data cleaning method is also provided for time-series industrial Internet data collected in an enterprise's daily production, such as state data and operation data of industrial equipment. In step S23, performing data cleaning on the standardized industrial Internet data specifically includes:
S231, acquiring continuous industrial Internet data collected by different data acquisition sources for the same target, and forming one industrial Internet data sequence per source, wherein the target is an industrial device or a production site containing a plurality of industrial devices; the data acquisition sources include sensors of the industrial equipment and intelligent terminals on the industrial equipment; each data acquisition source corresponds to one industrial Internet data sequence;
In one scenario, the acquired industrial Internet data may be data collected by the different data acquisition sources and obtained in real time by the acquisition module.
S232, performing joint abnormal data detection on the industrial Internet data sequences to obtain an anomaly detection result for the sequences, and marking as abnormal each sequence in which an anomaly is detected;
S233, performing outlier processing on the industrial Internet data sequences marked as abnormal, cleaning them into data that meets the quality requirement.
In the prior art, data cleaning mostly works on a single source, that is, the data obtained from each data source is cleaned independently. However, single-source cleaning, whether based on standard rules or on deep learning, cannot detect and identify dirty data well when applied to industrial Internet data, so its performance and results are not ideal.
Industrial Internet data is tied to industrial production scenes, so different items of industrial Internet data may be highly associated (for example, operation data and state data collected for the same production equipment have a certain logical or causal relationship between them). For real-time data collected in the production scene, this embodiment therefore provides a targeted data cleaning method based on the time-series nature and the association of real-time production data. The method performs joint abnormal data detection on the continuous data collected by different data acquisition sources for the same target, which effectively improves the accuracy of detecting and identifying dirty data during cleaning. After abnormal data is detected, outlier processing is applied to it, and it is cleaned by deletion, correction, replacement, or similar means, thereby improving the quality of the industrial Internet data.
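The deletion and replacement options mentioned above can be illustrated with a small helper. It is a sketch under assumptions the source does not state: anomalies are flagged per data point rather than per sequence, and replacement uses the median of the unflagged points. The function name `handle_outliers` and its `mode` parameter are hypothetical.

```python
from statistics import median


def handle_outliers(values, flags, mode="replace"):
    """Clean flagged points by deleting them or by replacing them with the
    median of the unflagged points (one possible replacement rule)."""
    good = [v for v, bad in zip(values, flags) if not bad]
    if mode == "delete":
        return good
    if mode == "replace":
        if not good:
            raise ValueError("no unflagged values to derive a replacement from")
        fill = median(good)
        return [fill if bad else v for v, bad in zip(values, flags)]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical temperature readings with one flagged spike.
values = [20.0, 21.0, 500.0, 22.0]
flags = [False, False, True, False]
deleted = handle_outliers(values, flags, mode="delete")
replaced = handle_outliers(values, flags, mode="replace")
```

Replacement keeps the sequence length unchanged, which matters when the cleaned sequence must stay aligned in time with sequences from other sources.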
In one embodiment, in step S232, the joint abnormal data detection on the industrial Internet data sequences includes:
A) Performing cleaning preprocessing on the acquired industrial Internet data sequences, including:
aligning the time stamps of the acquired industrial Internet data sequences and filling in the missing data, to obtain preliminarily organized industrial Internet data sequences;
applying a stepwise sequence reduction method to each preliminarily organized sequence, to obtain the industrial Internet data sequences after cleaning preprocessing;
In one scenario, the data collected by the different data acquisition sources arranged on a large intelligent production device are aligned according to their time stamps, and the data within a set time length are taken to form the industrial Internet data sequences. For example, the sequence for data acquisition source A consists of 2000 current readings collected in time order in the period from t1 to t2, and the sequence for data acquisition source B consists of 2000 engine temperature readings of the industrial equipment collected in time order in the same period from t1 to t2.
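Time-stamp alignment and gap filling can be sketched as resampling every source onto one shared time grid. The source does not say how missing data is filled, so the forward fill (with a back-fill for leading gaps) used here is an assumption, as are the function name `align_and_fill` and the integer time stamps.

```python
def align_and_fill(samples_by_source, t_start, t_end, step):
    """Put every source on the grid t_start, t_start + step, ... (< t_end).

    `samples_by_source` maps a source name to {timestamp: value}. Gaps are
    filled with the last observed value; leading gaps take the first
    observed value.
    """
    grid = list(range(t_start, t_end, step))
    aligned = {}
    for source, samples in samples_by_source.items():
        values, last = [], None
        for t in grid:
            if t in samples:
                last = samples[t]
            values.append(last)
        first = next((v for v in values if v is not None), None)
        aligned[source] = [first if v is None else v for v in values]
    return grid, aligned


# Hypothetical readings: "current" misses t=1, "engine_temp" misses t=0.
grid, aligned = align_and_fill(
    {"current": {0: 1.0, 2: 3.0}, "engine_temp": {1: 5.0, 2: 6.0}},
    t_start=0, t_end=3, step=1,
)
```

After this step all sequences have the same length and index, which the pairwise association calculations in the later steps rely on.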
B) Dividing the industrial Internet data sequences after cleaning preprocessing into different data groups based on the trained data group distribution model, including:
each data group contains at least one industrial Internet data sequence; the data group distribution model contains the grouping information corresponding to different data attributes, and each industrial Internet data sequence is assigned to the corresponding data group according to its data attributes; the data attributes include the data acquisition source information of the industrial Internet data;
In one scenario, the data group distribution model contains the data group information for the different data acquisition sources, and each obtained industrial Internet data sequence is assigned to the corresponding data group according to its data acquisition source.
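Once the data group distribution model is available, assigning sequences to groups amounts to a lookup keyed by acquisition source. In the sketch below a plain dictionary stands in for the trained model's output; the group names, source names, and the function `assign_groups` are hypothetical.

```python
from collections import defaultdict


def assign_groups(sequences, group_of_source):
    """Split {source: sequence} into {group: {source: sequence}}."""
    groups = defaultdict(dict)
    for source, seq in sequences.items():
        groups[group_of_source[source]][source] = seq
    return dict(groups)


# Stand-in for the grouping information held by the trained model.
group_of_source = {"current": "electrical", "voltage": "electrical", "engine_temp": "thermal"}
sequences = {"current": [1.0, 1.1], "voltage": [220.0, 221.0], "engine_temp": [80.0, 81.0]}
groups = assign_groups(sequences, group_of_source)
```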
C) For each data group, detecting abnormal data within the group according to the industrial Internet data sequences in the group, to obtain the intra-group abnormal data detection result, including:
Normalizing each industrial Internet data sequence into a normalized sequence with a mean of 0 and a standard deviation of 1;
For the normalized sequences corresponding to the industrial Internet data sequences in the same data group, calculating the association parameter between every pair of sequences, and constructing the first association feature matrix of the group from these association parameters:
wherein $Z_c$ denotes the first association feature matrix of the c-th data group, whose (i, j)-th entry is the association parameter between the i-th and the j-th industrial Internet data sequences in the c-th data group, with i = 1, 2, …, D and j = 1, 2, …, D, where D is the total number of industrial Internet data sequences in the data group;
The association parameter is computed from the following quantities: the m-th data point of the normalized sequence corresponding to the i-th industrial Internet data sequence in the c-th data group; n, the total number of data points in the industrial Internet data sequence; the mean of the normalized sequence corresponding to the i-th sequence in the c-th data group; the m-th data point of the normalized sequence corresponding to the j-th sequence in the c-th data group; the mean of the normalized sequence corresponding to the j-th sequence in the c-th data group; the m-th data point of the low-frequency IMF component obtained from the i-th sequence in the c-th data group, and the mean of that component; the m-th data point of the low-frequency IMF component obtained from the j-th sequence in the c-th data group, and the mean of that component; and $\omega_1$ and $\omega_2$, the set association adjustment factors, with $\omega_1 + \omega_2 \in [1, 1.1]$ and $\omega_1 \ge \omega_2$;
Comparing each association parameter in the first association feature matrix with a set standard threshold range; when an association parameter falls outside the range, marking that parameter as abnormal and marking the data group corresponding to the matrix as containing abnormal data;
For a data group containing abnormal data, counting, for each industrial Internet data sequence, the number of its association parameters marked as abnormal; marking the sequence with the largest count as abnormal data, and adding it to the abnormal data group;
Removing the sequence marked as abnormal data from the data group and repeating the intra-group detection on the remaining sequences until the intra-group detection result is normal; then proceeding to the next data group, until intra-group detection has been completed for all data groups;
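The iterative intra-group procedure (normalize, compute pairwise association parameters, flag out-of-range pairs, remove the sequence with the most flags, repeat) can be sketched as follows. Important assumption: the patent's association parameter combines the normalized sequences with low-frequency IMF components and the factors ω1 and ω2, but its formula is not available here, so the plain Pearson correlation of the normalized sequences is used as a stand-in. The function names and the threshold values are hypothetical.

```python
import math
from statistics import fmean


def zscore(seq):
    """Normalize a sequence to mean 0 and (population) standard deviation 1."""
    mu = fmean(seq)
    sd = math.sqrt(fmean([(x - mu) ** 2 for x in seq]))
    return [0.0] * len(seq) if sd == 0 else [(x - mu) / sd for x in seq]


def pearson(a, b):
    # For z-scored inputs the mean of the products is the Pearson correlation.
    # Stand-in for the patent's association parameter (assumption).
    return fmean([x * y for x, y in zip(a, b)])


def detect_in_group(group, low, high):
    """Return (abnormal sequence names, remaining sequence names)."""
    remaining = {name: zscore(seq) for name, seq in group.items()}
    abnormal = []
    while len(remaining) > 1:
        names = list(remaining)
        counts = {name: 0 for name in names}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                r = pearson(remaining[a], remaining[b])
                if not (low <= r <= high):
                    counts[a] += 1
                    counts[b] += 1
        worst = max(counts, key=counts.get)  # ties: first name in insertion order
        if counts[worst] == 0:
            break  # every pair is within the threshold range
        abnormal.append(worst)
        del remaining[worst]
    return abnormal, list(remaining)


# Hypothetical group: two sequences that move together and one that does not.
group = {
    "current": [1.0, 2.0, 3.0, 4.0, 5.0],
    "voltage": [2.0, 4.0, 6.0, 8.0, 10.0],
    "stuck_sensor": [5.0, 1.0, 4.0, 2.0, 3.0],
}
abnormal, remaining = detect_in_group(group, low=0.8, high=1.1)
```

Removing only one sequence per round matters: a single faulty sequence makes all of its pairs look abnormal, so the counts of the healthy sequences drop back to zero once it is gone.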
In one scenario, the low-frequency IMF (intrinsic mode function) component and the high-frequency IMF component of an industrial Internet data sequence are obtained as follows:
Performing empirical mode decomposition on the obtained industrial Internet data sequence x(m), to obtain K IMF components $\{imf_1, imf_2, imf_3, \ldots, imf_K\}$ and a residue y of the sequence x(m);
Taking the obtained $imf_1$ as the high-frequency IMF component $imf_g$;
Reconstructing the low-frequency IMF component $imf_d$ from the obtained IMF components $\{imf_2, imf_3, \ldots, imf_K\}$ and the residue y.
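The split into high-frequency and low-frequency parts is a simple regrouping of the decomposition output. The sketch below assumes the IMF components and the residue have already been produced by some empirical mode decomposition routine (not shown, and not named in the source), and that "reconstruction" means point-wise summation; `split_imfs` is a hypothetical name.

```python
def split_imfs(imfs, residue):
    """Return (imf_g, imf_d): the first IMF, and the sum of the rest plus the residue."""
    high = list(imfs[0])
    low = list(residue)
    for component in imfs[1:]:
        low = [a + b for a, b in zip(low, component)]
    return high, low


# Hypothetical decomposition output with K = 3 components of length 3.
imfs = [[1.0, -1.0, 1.0], [0.5, 0.5, 0.5], [0.1, 0.2, 0.3]]
residue = [10.0, 10.0, 10.0]
imf_g, imf_d = split_imfs(imfs, residue)
```

Because empirical mode decomposition is additive, `imf_g` and `imf_d` sum point by point to the same values as all components plus the residue.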
D) Detecting abnormal data between the data groups, to obtain the inter-group abnormal data detection result, including:
For each data group, computing for each industrial Internet data sequence the sum of its association parameters with the other sequences in the group, that is, the sum of the association parameters of the i-th sequence in the c-th data group; taking the sequence with the largest sum as the feature sequence of the data group;
Calculating the association parameters between the feature sequences of the data groups, and constructing a second association feature matrix from them:
wherein the second association feature matrix has entries $v_{ab}$, the association parameter between the feature sequence of the a-th data group and the feature sequence of the b-th data group, with a = 1, 2, …, F and b = 1, 2, …, F, where F is the total number of data groups;
$v_{ab}$ is computed from the following quantities: $u(m)_a$, the m-th data point of the normalized sequence corresponding to the feature sequence of the a-th data group; n, the total number of data points in the normalized sequence corresponding to a feature sequence; the mean of the normalized sequence corresponding to the feature sequence of the a-th data group; $u(m)_b$, the m-th data point of the normalized sequence corresponding to the feature sequence of the b-th data group; the mean of the normalized sequence corresponding to the feature sequence of the b-th data group; $\mathrm{zero}(imf_{a\text{-}g})$, the zero-crossing rate of the high-frequency IMF component obtained from the feature sequence of the a-th data group; $\mathrm{zero}(imf_{b\text{-}g})$, the zero-crossing rate of the high-frequency IMF component obtained from the feature sequence of the b-th data group; and $\omega_3$ and $\omega_4$, the association adjustment factors, with $\omega_3 + \omega_4 = 1$ and $\omega_3 > 2\omega_4$;
Comparing the association parameters between the feature sequences of the data groups with the corresponding association threshold ranges, and marking an association parameter as abnormal when it falls outside its range;
Counting, for each feature sequence, the number of its association parameters marked as abnormal; marking the data group whose feature sequence has the largest count as a problem data group; marking all industrial Internet data sequences in the problem data group as abnormal data and adding them to the abnormal data group;
Removing the problem data group and repeating the inter-group detection on the remaining data groups until the inter-group detection result is normal;
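Selecting the feature sequence of a group only needs the group's association matrix: each sequence's association parameters with the other sequences are summed, and the sequence with the largest sum is chosen. The sketch assumes the matrix is already available as a list of rows; the numeric values and the name `feature_sequence` are hypothetical.

```python
def feature_sequence(assoc):
    """Return (index of the feature sequence, per-sequence sums).

    `assoc[i][j]` is the association parameter between sequences i and j of
    one data group; the diagonal (a sequence with itself) is excluded.
    """
    sums = [
        sum(value for j, value in enumerate(row) if j != i)
        for i, row in enumerate(assoc)
    ]
    best = max(range(len(sums)), key=sums.__getitem__)
    return best, sums


# Hypothetical 3 x 3 association matrix for one data group.
assoc = [
    [1.0, 0.9, 0.2],
    [0.9, 1.0, 0.8],
    [0.2, 0.8, 1.0],
]
best, sums = feature_sequence(assoc)
```

In this example the second sequence is the one most strongly associated with the rest of its group, so it represents the group in the inter-group comparison.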
E) Obtaining the anomaly detection result for the industrial Internet data sequences from the intra-group and inter-group abnormal data detection results, including:
marking the anomaly detection result of every industrial Internet data sequence contained in the abnormal data group as abnormal, and marking the anomaly detection results of the remaining industrial Internet data sequences as normal.
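The final labeling is a membership test against the abnormal data group collected by the intra-group and inter-group stages. A minimal sketch, with hypothetical names and label strings:

```python
def label_sequences(all_names, abnormal_group):
    """Label each sequence "abnormal" if it is in the abnormal data group, else "normal"."""
    abnormal = set(abnormal_group)
    return {name: ("abnormal" if name in abnormal else "normal") for name in all_names}


# Hypothetical result: one sequence was flagged by the earlier stages.
labels = label_sequences(["current", "voltage", "engine_temp"], ["engine_temp"])
```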
In the joint abnormal data detection method above, cleaning preprocessing is first performed on the industrial Internet data obtained from the different data acquisition sources, which gives a first improvement in the quality of the industrial Internet data sequences. The sequences are then divided into data groups, and intra-group anomaly detection is performed on the sequences within each group according to the association features between them. Because strongly associated items of industrial Internet data are examined together, detection rests on multidimensional data, which effectively improves the objectivity and accuracy of abnormal data detection. Sequences in the same data group have similar variation trends, but the traditional way of calculating association parameters from data variation cannot overcome the influence of noisy data points on the trend. The low-frequency IMF component is therefore added as a parameter reflecting the variation trend of the data sequence when calculating the intra-group association parameters. This effectively avoids the influence of noisy data points on the trend, improves the accuracy with which the association features between sequences in a data group are expressed, and thus helps improve the accuracy of intra-group abnormal data detection.
After the intra-group abnormal data has been detected, comprehensive anomaly detection is further performed on independent or weakly associated data according to the association features between the data groups, to screen out abnormal data. Because the trend-based association between the feature sequences of different data groups is not strong, the high-frequency IMF component of each feature sequence is added as a parameter when calculating the inter-group association parameters, to reflect how the fluctuations of the sequences influence one another. This further reflects the relationship between the feature sequences and improves the accuracy of abnormal data detection.
Meanwhile, compared with traditional association-based anomaly detection schemes, the method effectively improves the efficiency of abnormal data detection and the performance of real-time anomaly detection for the industrial Internet.
Before abnormal data detection starts, the model required for detection must first be built. Building the model standardizes the association features and grouping of the data from the multiple data acquisition sources of different targets, and provides a reasonable reference for the grouping criteria and the standard threshold ranges required during anomaly detection.
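The fluctuation measure used for the high-frequency IMF component is a zero-crossing rate. The source does not define it, so the sketch below uses one common definition (an assumption): the fraction of adjacent sample pairs whose signs differ, with zero treated as non-negative. The name `zero_crossing_rate` is hypothetical.

```python
def zero_crossing_rate(signal):
    """Fraction of adjacent sample pairs that change sign (0 counts as non-negative)."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0))
    return crossings / (len(signal) - 1)


fast = zero_crossing_rate([1.0, -1.0, 1.0, -1.0])  # sign flips at every step
slow = zero_crossing_rate([1.0, 2.0, 3.0])         # never crosses zero
```

Two feature sequences whose high-frequency components have similar zero-crossing rates fluctuate at a similar pace, which is the kind of relationship the inter-group association parameter is meant to capture.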
In one embodiment, step S232 further includes:
training a data set distribution model, comprising:
constructing a training set, wherein the training set comprises standard industrial Internet data corresponding to different data acquisition sources aiming at the same target, the training set comprises standard industrial Internet data in different time periods, and the industrial Internet data in each time period comprises standard industrial Internet data sequences corresponding to different data acquisition sources in the time period; the standard industrial Internet data sequence is a sequence with an average value of 0 and a standard deviation of 1 after normalization treatment;
In one scenario, historical data collected by the different data acquisition sources of a large intelligent production device are retrieved from the data warehouse, abnormal conditions in the data are identified through expert review and similar means, and those abnormal conditions are associated with the corresponding historical data to form the training set.
In one scenario, a training set contains device operational data collected by 10 sensors for a large intelligent production device, wherein the device operational data collected by each sensor corresponds to 3 time periods, each time period containing 1000 data collected in time order.
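As an illustration of the training set layout in this scenario, the following Python sketch builds 10 sources by 3 time periods by 1000 points and normalizes each sequence to mean 0 and standard deviation 1. The synthetic values, the dictionary keyed by (source, period), and the function names are assumptions; real training data would come from the data warehouse.

```python
import math
import random


def zscore(seq):
    """Normalize a sequence to mean 0 and (population) standard deviation 1."""
    n = len(seq)
    mean = sum(seq) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in seq) / n)
    if std == 0:
        return [0.0] * n  # a constant sequence carries no variation
    return [(x - mean) / std for x in seq]


def build_training_set(num_sources=10, num_periods=3, points=1000, seed=0):
    """Return {(source, period): normalized sequence} using synthetic data."""
    rng = random.Random(seed)
    training = {}
    for s in range(num_sources):
        for t in range(num_periods):
            raw = [rng.gauss(50.0 + s, 5.0) for _ in range(points)]
            training[(s, t)] = zscore(raw)
    return training


training_set = build_training_set()
```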
According to the obtained training set, standard association parameters among different data acquisition sources are calculated, and a first standard association feature matrix is constructed:
Wherein Z' represents the first standard association feature matrix, and H_ij represents the standard association parameter between the i-th data acquisition source and the j-th data acquisition source, where i = 1, 2, …, N, j = 1, 2, …, N, and N represents the total number of data acquisition sources; wherein,
Wherein x(t, m)_i represents the m-th data of the standard industrial Internet data sequence corresponding to the i-th data acquisition source in the t-th time period, and n represents the total number of data in the standard industrial Internet data sequence; x(t, m)_j represents the m-th data of the standard industrial Internet data sequence corresponding to the j-th data acquisition source in the t-th time period; the average values of the data of these two sequences in the t-th time period also enter the formula; IMF_{i-d}(t, m) represents the m-th data in the low-frequency IMF component obtained from the standard industrial Internet data sequence corresponding to the i-th data acquisition source in the t-th time period; IMF_{j-d}(t, m) represents the m-th data in the low-frequency IMF component obtained from the standard industrial Internet data sequence corresponding to the j-th data acquisition source in the t-th time period; the average values of all data in these two low-frequency IMF components likewise enter the formula; ω1 and ω2 respectively represent set association adjustment factors, where ω1 + ω2 ∈ [1, 1.1] and ω1 ≥ ω2;
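The quantities defined above (the two sequences, their means, their low-frequency components, and the weights ω1 and ω2) suggest an association parameter that combines a correlation of the raw sequences with a correlation of their low-frequency trends. The Python sketch below assumes a weighted sum of two Pearson correlation coefficients; this form is an assumption made for illustration and is not taken from the patent. A centered moving average stands in for the low-frequency IMF component, which in practice would come from an empirical mode decomposition, and the default weights 0.6 and 0.4 are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    if va == 0 or vb == 0:
        return 0.0
    return cov / math.sqrt(va * vb)


def moving_average(seq, window=25):
    """Centered moving average; a stand-in for a low-frequency IMF component."""
    half = window // 2
    out = []
    for k in range(len(seq)):
        lo, hi = max(0, k - half), min(len(seq), k + half + 1)
        out.append(sum(seq[lo:hi]) / (hi - lo))
    return out


def association(a, b, w1=0.6, w2=0.4, window=25):
    """Assumed form: w1 * corr(raw) + w2 * corr(low-frequency trend)."""
    trend_a = moving_average(a, window)
    trend_b = moving_average(b, window)
    return w1 * pearson(a, b) + w2 * pearson(trend_a, trend_b)


def association_matrix(seqs, w1=0.6, w2=0.4, window=25):
    """Square matrix of pairwise association parameters."""
    n = len(seqs)
    return [[association(seqs[i], seqs[j], w1, w2, window) for j in range(n)]
            for i in range(n)]
```

Because the trend term is computed on smoothed sequences, isolated noise points affect it less than they affect the raw correlation, which is the motivation the description gives for adding the low-frequency component.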
Traversing the obtained first standard association feature matrix based on the set condition, and solving a standard grouping result of a data acquisition source, wherein the adopted grouping condition function is as follows:
The standard industrial Internet data sequences corresponding to the data acquisition sources that satisfy the condition function are divided into the same data group, where A represents the target data group; a_i and a_j represent the i-th and j-th data acquisition sources, respectively; H_ij represents the standard association parameter between the i-th and j-th data acquisition sources; γ represents a set first standard threshold, where γ ∈ [0.7, 0.8]; num(H_ij ≥ β) represents the number of data acquisition sources in data group A for which the standard association parameter between any two acquisition sources is at least β, where β represents a second standard threshold and β ∈ [0.4, 0.6]; and num(A) represents the number of data acquisition sources contained in data group A;
When the same data acquisition source satisfies the condition for several data groups at once, it is preferentially assigned to the data group containing more data acquisition sources.
In one scenario, the method further includes acquiring the low-frequency IMF component and the high-frequency IMF component of the standard industrial Internet data sequence. These components are acquired in the same way as the low-frequency and high-frequency IMF components of the industrial Internet data sequence in the embodiment described above.
In one embodiment, the range of standard thresholds within each standard data set is calculated based on the partitioned standard data sets.
In one scenario, the standard threshold range uses a fixed standard threshold T1, where T1 ∈ [0.68, 0.8], and the standard threshold range is [T1, 1];
In one scenario, the standard threshold range uses an adaptive standard threshold T2 = max(T1, H_ij − α), where T1 ∈ [0.68, 0.8] and α ∈ [0.08, 0.2], and the standard threshold range is [T2, 1];
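The two threshold variants can be written out directly. The sketch below uses hypothetical default values T1 = 0.7 and α = 0.1 chosen from inside the stated intervals; the function names are illustrative only.

```python
def fixed_threshold_range(t1=0.7):
    """Standard threshold range [T1, 1] with a fixed threshold T1."""
    return (t1, 1.0)


def adaptive_threshold_range(h_ij, t1=0.7, alpha=0.1):
    """Standard threshold range [T2, 1] with T2 = max(T1, H_ij - alpha)."""
    t2 = max(t1, h_ij - alpha)
    return (t2, 1.0)


def out_of_range(value, bounds):
    """True when an association parameter falls outside the given range."""
    low, high = bounds
    return not (low <= value <= high)
```

With the adaptive variant, a pair whose standard association parameter is high gets a tighter lower bound, while weakly associated pairs fall back to the fixed threshold T1.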
In one embodiment, according to the divided data set, respectively counting the sum of standard association parameters of each data acquisition source and other data acquisition sources in the set, and taking the data acquisition source with the largest sum of standard association parameters as a characteristic data acquisition source;
Standard association parameters among characteristic data acquisition sources of each standard data set are calculated respectively, and a second standard association characteristic matrix is constructed according to the standard association parameters:
Wherein Φ represents the second standard association feature matrix, and V_ab represents the standard association parameter between the a-th standard data group and the b-th standard data group, where a = 1, 2, …, F, b = 1, 2, …, F, and F represents the total number of data groups; wherein,
Wherein u(t, m)_a represents the m-th data of the standard industrial Internet data sequence corresponding to the characteristic data acquisition source of the a-th standard data group at time t, and n represents the total number of data in the standard industrial Internet data sequence; u(t, m)_b represents the m-th data of the standard industrial Internet data sequence corresponding to the characteristic data acquisition source of the b-th standard data group at time t; the average values of the data of these two sequences at time t also enter the formula; zero(IMF_{a-g}(t)) represents the zero-crossing rate of the high-frequency IMF component obtained from the standard industrial Internet data sequence corresponding to the characteristic data acquisition source of the a-th standard data group at time t, and zero(IMF_{b-g}(t)) represents the zero-crossing rate of the high-frequency IMF component obtained from the standard industrial Internet data sequence corresponding to the characteristic data acquisition source of the b-th standard data group at time t; ω3 and ω4 represent association adjustment factors, where ω3 + ω4 = 1 and ω3 > 2ω4;
In one embodiment, a standard association threshold range of each data group is calculated according to the second standard association feature matrix, wherein the standard association threshold range uses an adaptive standard association threshold, and the standard association threshold range is [max(−1, V_ab − δ), min(V_ab + δ, 1)], where δ ∈ [0.2, 0.4].
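The zero-crossing rate used in these definitions can be sketched as follows. The sketch uses the common definition (the fraction of adjacent sample pairs whose signs differ), which is an assumption about the exact convention; how the rate enters the association parameter is not shown here.

```python
def zero_crossing_rate(seq):
    """Fraction of adjacent pairs in which the sign changes.

    Zero is treated as non-negative, which is one common convention.
    """
    if len(seq) < 2:
        return 0.0
    crossings = sum(
        1 for prev, curr in zip(seq, seq[1:]) if (prev >= 0) != (curr >= 0)
    )
    return crossings / (len(seq) - 1)
```

A rapidly fluctuating component such as a high-frequency IMF yields a rate near 1, while a slowly varying trend yields a rate near 0, so the rate gives a compact measure of how strongly a sequence fluctuates.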
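The adaptive standard association threshold range is a window of half-width δ around V_ab, clipped to [−1, 1]. A minimal sketch, with the function name and the default δ = 0.3 chosen for illustration:

```python
def association_threshold_range(v_ab, delta=0.3):
    """Range [max(-1, V_ab - delta), min(V_ab + delta, 1)]."""
    return (max(-1.0, v_ab - delta), min(v_ab + delta, 1.0))


def exceeds_range(v_observed, v_standard, delta=0.3):
    """True when an observed inter-group parameter leaves the standard range."""
    low, high = association_threshold_range(v_standard, delta)
    return not (low <= v_observed <= high)
```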
In this implementation, a data group distribution model is built and trained: a training set is constructed from the standard industrial Internet data of a target, and the standard data acquired by the different data sources are grouped in advance to obtain the relevant standard grouping information. The data can thus be grouped accurately according to the association features among the industrial Internet data acquired by the different data acquisition sources, with strongly associated data placed in the same data group and weakly associated data separated into different data groups, which supports the subsequent joint abnormal data detection on data acquired in real time.
When the in-group association parameters are calculated, sequences in the same data group are expected to share a similar variation trend, but the traditional way of calculating association parameters from data variation cannot overcome the influence of noise data points on that trend. The low-frequency IMF component is therefore added as a parameter that reflects the variation trend of the data sequence, which avoids the influence of noise data points on the trend and improves the accuracy with which the association features among the industrial Internet data sequences in a data group are expressed. This helps improve the accuracy of abnormal data detection within a data group.
When the association parameters among data groups are calculated, the trend-based correlation between the feature sequences of different data groups is not strong, so the high-frequency IMF component of each feature sequence is added as a parameter that reflects how the fluctuations of the sequences influence one another. This further captures the relationship among the feature sequences and improves the accuracy of abnormal data detection.
In one embodiment, step S3 includes:
S31, quality inspection is carried out on industrial Internet data in the data warehouse to obtain quality inspection results, and the industrial Internet data with unqualified quality is isolated according to the quality inspection results;
S32, managing metadata associated with industrial Internet data in a data warehouse according to the set data specification;
S33, performing data lineage analysis on the industrial Internet data in the data warehouse to obtain lineage analysis results and generate a data lineage graph.
In one scenario, in order to improve the quality of the industrial Internet data stored in the data warehouse, the invention also performs data management on the industrial Internet data in the data warehouse, including quality inspection, metadata management, and data lineage analysis. This helps raise the data management level of the data warehouse and increase the value of the industrial Internet data.
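A data lineage graph of the kind produced in step S33 can be represented as a directed graph from source datasets to the datasets derived from them. The Python sketch below is only one possible representation; the class, its methods, and the dataset names are hypothetical.

```python
from collections import defaultdict, deque


class LineageGraph:
    """Directed graph recording which datasets each dataset was derived from."""

    def __init__(self):
        self.parents = defaultdict(set)

    def add_edge(self, source, derived):
        """Record that `derived` was produced from `source`."""
        self.parents[derived].add(source)

    def upstream(self, dataset):
        """Return every dataset that `dataset` depends on, directly or not."""
        seen = set()
        queue = deque([dataset])
        while queue:
            node = queue.popleft()
            for parent in self.parents.get(node, ()):
                if parent not in seen:
                    seen.add(parent)
                    queue.append(parent)
        return seen


graph = LineageGraph()
graph.add_edge("raw_sensor_data", "cleaned_sensor_data")
graph.add_edge("cleaned_sensor_data", "equipment_health_report")
graph.add_edge("equipment_master_data", "equipment_health_report")
```

Tracing upstream from a report in this way shows which stored data would be affected if a source fails quality inspection and is isolated.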
In one embodiment, step S4 includes:
And calling a corresponding data set from the data warehouse according to the set analysis task, and analyzing based on the set data analysis model to obtain a data analysis result.
In one embodiment, step S4 further comprises:
and visually displaying according to the obtained data analysis result.
After the data volume in the data warehouse reaches a certain scale, further big data analysis can be performed on the industrial Internet data in the data warehouse, and data analysis can be tailored to the needs of enterprises in different production scenarios, which increases the utilization value of the industrial Internet data.
Meanwhile, statistics and visual display can be performed according to data in a data warehouse, so that the management level of industrial Internet data is improved.
Referring to the embodiment shown in FIG. 3, an industrial Internet data processing system based on big data comprises:
The acquisition module is used for acquiring industrial Internet data;
The preprocessing module is used for preprocessing the acquired industrial Internet data and storing the preprocessed industrial Internet data into the data warehouse;
The management module is used for carrying out data management on the industrial Internet data in the data warehouse;
And the analysis module is used for extracting corresponding industrial Internet data from the data warehouse based on the data analysis model and carrying out data analysis to obtain a data analysis result.
It should be noted that the acquisition module, the preprocessing module, the management module and the analysis module of the industrial Internet data processing system based on big data provided by the present invention are further configured to implement the specific embodiments corresponding to the respective steps of the industrial Internet data processing method based on big data shown in FIG. 1, which are not repeated here.
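The division of work among the four modules can be sketched as a simple pipeline. Every function body below is a placeholder: the records, the quality rule (a missing value fails inspection), and the mean-value analysis are assumptions made for illustration and do not represent the embodiments themselves.

```python
def acquire():
    """Acquisition module: return raw records (hypothetical sample data)."""
    return [
        {"source": "temperature_sensor", "value": 21.5},
        {"source": "vibration_sensor", "value": None},
        {"source": "humidity_sensor", "value": 40.0},
    ]


def preprocess(records):
    """Preprocessing module: placeholder that copies records into a warehouse list."""
    return [dict(record) for record in records]


def manage(warehouse):
    """Management module: isolate records that fail a simple quality check."""
    qualified = [r for r in warehouse if r["value"] is not None]
    isolated = [r for r in warehouse if r["value"] is None]
    return qualified, isolated


def analyze(records):
    """Analysis module: placeholder analysis returning count and mean value."""
    values = [r["value"] for r in records]
    return {"count": len(values), "mean": sum(values) / len(values)}


def run_pipeline():
    warehouse = preprocess(acquire())
    qualified, isolated = manage(warehouse)
    return analyze(qualified), isolated


result, isolated_records = run_pipeline()
```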
The system provided by the invention can be built based on a cloud server, an edge server, an intelligent terminal and the like.
The industrial Internet data processing system based on big data can perform unified acquisition, preprocessing, management and analysis of the industrial Internet data generated in the daily production process of enterprises, and thereby collect and make effective use of that data. This lays a foundation for further big data processing based on the acquired industrial Internet data and for building an industrial Internet information processing framework, and helps improve the adaptability of modern industrial Internet construction.
It should be noted that, in each embodiment of the present invention, each functional unit/module may be integrated in one processing unit/module, or each unit/module may exist alone physically, or two or more units/modules may be integrated in one unit/module. The integrated units/modules described above may be implemented either in hardware or in software functional units/modules.
From the description of the embodiments above, it will be apparent to those skilled in the art that the embodiments described herein may be implemented in hardware, software, firmware, middleware, code, or any suitable combination thereof. For a hardware implementation, the processor may be implemented in one or more of the following units: an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a processor, a controller, a microcontroller, a microprocessor, other electronic units designed to perform the functions described herein, or a combination thereof. For a software implementation, some or all of the flow of an embodiment may be accomplished by a computer program to instruct the associated hardware. When implemented, the above-described programs may be stored in or transmitted as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a computer. Computer-readable media can include, but are not limited to, RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage media or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer.
Finally, it should be noted that the above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the scope of the present invention, and although the present invention has been described in detail with reference to the preferred embodiments, those skilled in the art should understand that modifications or equivalent substitutions can be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (6)

1. An industrial internet data processing method based on big data is characterized by comprising the following steps:
S1, acquiring industrial Internet data; wherein the industrial internet data comprises operation data of industrial equipment and status data of the industrial equipment; the step S1 comprises the following steps:
s11, acquiring state data of industrial equipment through a sensor arranged on the industrial equipment, wherein the state data of the industrial equipment comprise temperature data, humidity data and vibration signals;
s12, acquiring operation data of the industrial equipment through an intelligent terminal arranged on the industrial equipment, wherein the operation data of the industrial equipment comprise operation data and equipment state data of the industrial equipment;
S2, preprocessing the acquired industrial Internet data, and storing the preprocessed industrial Internet data into a data warehouse, wherein the method comprises the following steps of:
s21, decrypting the acquired industrial Internet data to obtain decrypted industrial Internet data;
S22, carrying out standardization processing on the decrypted industrial Internet data to obtain standardized industrial Internet data;
S23, carrying out data cleaning treatment on the standardized industrial Internet data to obtain cleaned industrial Internet data;
S24, carrying out data integration processing on the industrial Internet data after the data cleaning processing, and storing the industrial Internet data into a data warehouse;
In step S23, the data cleaning process is performed on the standardized industrial internet data, which specifically includes:
S231, acquiring continuous industrial Internet data which are acquired by different data acquisition sources and aimed at the same target, and respectively forming an industrial Internet data sequence, wherein the target comprises industrial equipment or a production place comprising a plurality of industrial equipment; the data acquisition source comprises a sensor of the industrial equipment and an intelligent terminal on the industrial equipment; each data acquisition source corresponds to an industrial Internet data sequence;
S232, carrying out combined abnormal data detection on the industrial Internet data sequence to obtain an industrial Internet data sequence abnormal detection result; and carrying out anomaly marking on the industrial Internet data sequence with the detected anomaly;
S233, carrying out outlier processing on the abnormally marked industrial Internet data sequence, and cleaning the abnormally marked industrial Internet data sequence into data meeting the quality requirement;
In step S232, the detecting of the joint abnormal data for the industrial internet data sequence includes:
a) Cleaning pretreatment is carried out according to the acquired industrial Internet data sequence, and the method comprises the following steps:
performing time mark alignment according to the acquired industrial Internet data sequence, and filling the missing data to obtain a primarily finished industrial Internet data sequence;
adopting a gradual sequence reduction method to respectively reduce each initially-finished industrial Internet data sequence to obtain an industrial Internet data sequence after cleaning pretreatment;
b) Dividing the industrial Internet data sequence after cleaning pretreatment into different data groups based on the trained data group distribution model, wherein the method comprises the following steps:
each data group comprises at least one industrial Internet data sequence, the data group distribution model comprises belonging grouping information corresponding to different data attributes, and the industrial Internet data sequences are divided into corresponding data groups according to the data attributes corresponding to the industrial Internet data sequences; the data attribute comprises data acquisition source information of industrial Internet data;
c) For each data group, carrying out detection processing on abnormal data in the data group according to an industrial Internet data sequence in the data group to obtain a detection result of the abnormal data in the data group, wherein the detection result comprises the following steps:
Normalizing each industrial Internet data sequence, normalizing the industrial Internet data sequence into a normalized sequence with the average value of 0 and the standard deviation of 1;
For normalized sequences corresponding to a plurality of industrial Internet data sequences in the same data set, respectively calculating association parameters among the plurality of industrial Internet data sequences, and constructing a first association feature matrix in the set according to the association parameters among the sequences:
Wherein Z_c represents the first association feature matrix of the c-th data group, and each of its entries represents the association parameter between the i-th industrial Internet data sequence and the j-th industrial Internet data sequence within the c-th data group, where i = 1, 2, …, D, j = 1, 2, …, D, and D represents the total number of industrial Internet data sequences within the data group; wherein,
Wherein the quantities entering the association parameter are: the m-th data in the normalized sequence corresponding to the i-th industrial Internet data sequence in the c-th data group; n, the total number of data in the industrial Internet data sequence; the average value of the data in the normalized sequence corresponding to the i-th industrial Internet data sequence in the c-th data group; the m-th data in the normalized sequence corresponding to the j-th industrial Internet data sequence in the c-th data group; the average value of the data in the normalized sequence corresponding to the j-th industrial Internet data sequence in the c-th data group; the m-th data in the low-frequency IMF component obtained from the i-th industrial Internet data sequence within the c-th data group; the average value of the data in that low-frequency IMF component; the m-th data in the low-frequency IMF component obtained from the j-th industrial Internet data sequence within the c-th data group; the average value of the data in that low-frequency IMF component; and ω1 and ω2, which respectively represent set association adjustment factors, where ω1 + ω2 ∈ [1, 1.1] and ω1 ≥ ω2;
Comparing each association parameter of the first association feature matrix with a set standard threshold range, and when the association parameter exceeds the set standard threshold range, marking that the association parameter is abnormal and marking that abnormal data exists in a data set corresponding to the first association feature matrix;
For a data group in which abnormal data exist, counting the number of association parameters marked as abnormal for each industrial Internet data sequence, marking the industrial Internet data sequence with the largest number of association parameters marked as abnormal as abnormal data, and adding the industrial Internet data sequence marked as abnormal data to the abnormal data group;
Removing the industrial Internet data sequence marked as abnormal data from the data group, and carrying out abnormal data detection processing in the data group on the residual industrial Internet data sequences in the data group again until the abnormal data detection processing results in the data group are normal, and continuing the abnormal data detection processing in the data group of the next data group until the abnormal data detection processing in the data group of all the data groups is completed;
d) Performing detection processing on abnormal data among the data groups according to each data group to obtain detection results of the abnormal data among the data groups, wherein the detection results comprise:
For each data group, respectively computing, for each industrial Internet data sequence, the sum of its association parameters with the other industrial Internet data sequences in the group, the result representing the sum of association parameters of the i-th industrial Internet data sequence in the c-th data group; and taking the industrial Internet data sequence with the largest sum of association parameters as the feature sequence of the data group;
respectively calculating association parameters among the feature sequences of each data set, and constructing a second association feature matrix according to the association parameters among the sequences:
Wherein the second association feature matrix has entries v_ab, where v_ab represents the association parameter between the feature sequence of the a-th data group and the feature sequence of the b-th data group, a = 1, 2, …, F, b = 1, 2, …, F, and F represents the total number of data groups; wherein,
Wherein u(m)_a represents the m-th data in the normalized sequence corresponding to the feature sequence of the a-th data group, and n represents the total number of data in the normalized sequence corresponding to the feature sequence; u(m)_b represents the m-th data in the normalized sequence corresponding to the feature sequence of the b-th data group; the average values of the data in these two normalized sequences also enter the formula; zero(IMF_{a-g}) represents the zero-crossing rate of the high-frequency IMF component obtained from the feature sequence of the a-th data group; zero(IMF_{b-g}) represents the zero-crossing rate of the high-frequency IMF component obtained from the feature sequence of the b-th data group; ω3 and ω4 represent association adjustment factors, where ω3 + ω4 = 1 and ω3 > 2ω4;
comparing the association parameters among the feature sequences of each data set with the corresponding association threshold ranges, and marking that the association parameters are abnormal when the association parameters exceed the corresponding association threshold ranges;
Counting the number of associated parameters marked as abnormal corresponding to the feature sequence, marking a data group corresponding to the feature sequence with the largest number of associated parameters marked as abnormal as a problem data group, marking all industrial Internet data sequences in the problem data group as abnormal data, and adding the industrial Internet data sequences marked as abnormal data into the abnormal data group;
Removing the data groups marked as abnormal data, and carrying out detection processing on abnormal data among the data groups according to the rest data groups again until the detection results of the abnormal data among the data groups are normal;
e) Obtaining an industrial Internet data sequence abnormality detection result according to an abnormality data detection result in the data group and an abnormality data detection result among the data groups, wherein the industrial Internet data sequence abnormality detection result comprises:
marking the anomaly detection result of each industrial Internet data sequence contained in the abnormal data group as abnormal, and marking the anomaly detection results of the remaining industrial Internet data sequences as normal;
s3, carrying out data management on industrial Internet data in a data warehouse;
And S4, based on the data analysis model, extracting corresponding industrial Internet data from the data warehouse and carrying out data analysis to obtain a data analysis result.
2. The big data based industrial internet data processing method of claim 1, wherein the industrial internet data includes warehouse material inventory data and equipment management data;
wherein, step S1 includes:
s13, acquiring material inventory data from a warehouse material management system;
S14, acquiring the entered industrial equipment basic data from the equipment management system.
3. The industrial internet data processing method based on big data according to claim 1, wherein step S3 comprises:
quality inspection is carried out on the industrial Internet data in the data warehouse to obtain quality inspection results, and the industrial Internet data with unqualified quality is isolated according to the quality inspection results;
Managing metadata associated with industrial Internet data in a data warehouse according to the set data specification;
and performing data lineage analysis on the industrial Internet data in the data warehouse to obtain lineage analysis results and generate a data lineage graph.
4. The industrial internet data processing method based on big data according to claim 1, wherein step S4 comprises:
And calling a corresponding data set from the data warehouse according to the set analysis task, and analyzing based on the set data analysis model to obtain a data analysis result.
5. The industrial internet data processing method according to claim 4, wherein the step S4 further comprises:
and visually displaying according to the obtained data analysis result.
6. An industrial internet data processing system based on big data, comprising:
An acquisition module, used for acquiring industrial Internet data, wherein the industrial Internet data comprises operation data of industrial equipment and state data of the industrial equipment; the acquisition module is configured for:
collecting state data of the industrial equipment by a sensor arranged on the industrial equipment, wherein the state data of the industrial equipment comprise temperature data, humidity data and vibration signals;
Collecting operation data of industrial equipment through an intelligent terminal arranged on the industrial equipment, wherein the operation data of the industrial equipment comprise operation data and equipment state data of the industrial equipment;
The preprocessing module is used for preprocessing the acquired industrial internet data and storing the preprocessed industrial internet data into the data warehouse, and comprises the following steps:
Decrypting the acquired industrial Internet data to obtain decrypted industrial Internet data;
Carrying out standardization processing on the decrypted industrial Internet data to obtain standardized industrial Internet data;
performing data cleaning treatment on the standardized industrial Internet data to obtain cleaned industrial Internet data;
Carrying out data integration processing on the industrial Internet data after the data cleaning processing, and storing the industrial Internet data into a data warehouse;
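The four preprocessing stages listed above (decrypt, standardize, clean, integrate) can be chained as in the sketch below. Everything in it is an illustrative assumption: base64 decoding stands in for real decryption, the field names and the Fahrenheit-to-Celsius rule are invented, the cleaning stage is a trivial placeholder for the detection procedure of the claim, and the "data warehouse" is a plain list.

```python
import base64
import json

def decrypt(payload: bytes) -> dict:
    # Stand-in only: base64 is NOT encryption; a real system calls its cipher here.
    return json.loads(base64.b64decode(payload))

def standardize(record: dict) -> dict:
    # Assumed convention: lower-case field names, temperature stored in Celsius.
    out = {key.lower(): value for key, value in record.items()}
    if "temperature_f" in out:
        f = out.pop("temperature_f")
        out["temperature_c"] = None if f is None else round((f - 32) * 5 / 9, 2)
    return out

def clean(records: list) -> list:
    # Placeholder cleaning rule: drop records that have any missing value.
    return [r for r in records if all(v is not None for v in r.values())]

def integrate(records: list, warehouse: list) -> None:
    # The warehouse is modelled as a list purely for illustration.
    warehouse.extend(records)

def preprocess(payloads, warehouse):
    decrypted = [decrypt(p) for p in payloads]
    standardized = [standardize(r) for r in decrypted]
    integrate(clean(standardized), warehouse)

def encode(record: dict) -> bytes:
    # Helper that produces the hypothetical "encrypted" payloads for the demo.
    return base64.b64encode(json.dumps(record).encode())

payloads = [
    encode({"Device": "pump-1", "Temperature_F": 212}),
    encode({"Device": "pump-2", "Temperature_F": None}),
]
warehouse = []
preprocess(payloads, warehouse)
print(warehouse)
```

Keeping each stage a separate function mirrors the order of the steps and lets any stage be swapped out without touching the others.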
wherein performing data cleaning on the standardized industrial Internet data specifically comprises:
acquiring continuous industrial Internet data collected by different data acquisition sources for the same target, each forming an industrial Internet data sequence, wherein the target comprises a piece of industrial equipment or a production site containing a plurality of pieces of industrial equipment; the data acquisition sources comprise the sensors of the industrial equipment and the intelligent terminals on the industrial equipment; and each data acquisition source corresponds to one industrial Internet data sequence;
performing joint abnormal data detection on the industrial Internet data sequences to obtain an anomaly detection result for the industrial Internet data sequences, and marking as abnormal each industrial Internet data sequence in which an anomaly is detected;
performing outlier processing on the industrial Internet data sequences marked as abnormal, so that they are cleaned into data meeting the quality requirement;
wherein performing joint abnormal data detection on the industrial Internet data sequences comprises:
a) performing cleaning preprocessing on the acquired industrial Internet data sequences, comprising:
aligning the timestamps of the acquired industrial Internet data sequences and filling in missing data to obtain preliminarily organized industrial Internet data sequences;
reducing each preliminarily organized industrial Internet data sequence with a stepwise sequence reduction method to obtain the industrial Internet data sequences after cleaning preprocessing;
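The timestamp alignment and missing-data filling of step a) can be illustrated as follows. This is a sketch under assumptions the claim does not state: sequences are aligned on the union of their timestamps, interior gaps are filled by linear interpolation, and leading or trailing gaps copy the nearest known value. It covers only alignment and filling; all names are hypothetical.

```python
def align_and_fill(sequences):
    """Align sequences on the union of their timestamps and fill the gaps.

    `sequences` maps a source name to a non-empty {timestamp: value} dict.
    Returns (grid, aligned) where aligned[name] has one value per grid point.
    """
    grid = sorted({t for seq in sequences.values() for t in seq})
    aligned = {}
    for name, seq in sequences.items():
        known = sorted(seq.items())
        values = []
        for t in grid:
            if t in seq:
                values.append(seq[t])
                continue
            before = [(kt, kv) for kt, kv in known if kt < t]
            after = [(kt, kv) for kt, kv in known if kt > t]
            if before and after:
                # Linear interpolation between the nearest known neighbours.
                (t0, v0), (t1, v1) = before[-1], after[0]
                values.append(v0 + (v1 - v0) * (t - t0) / (t1 - t0))
            elif before:
                values.append(before[-1][1])  # trailing gap: copy last value
            else:
                values.append(after[0][1])    # leading gap: copy first value
        aligned[name] = values
    return grid, aligned

# Hypothetical readings from a sensor and an intelligent terminal.
readings = {
    "sensor": {0: 20.0, 1: 21.0, 3: 23.0},
    "terminal": {0: 5.0, 2: 7.0, 3: 8.0},
}
grid, aligned = align_and_fill(readings)
print(grid, aligned)
```

After this step every sequence has the same length and the same time base, which the pairwise association parameters of the later steps require.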
b) dividing the industrial Internet data sequences after cleaning preprocessing into different data groups based on a trained data group distribution model, wherein:
each data group comprises at least one industrial Internet data sequence; the data group distribution model comprises the group membership information corresponding to different data attributes, and each industrial Internet data sequence is assigned to the corresponding data group according to its data attribute; the data attribute comprises the data acquisition source information of the industrial Internet data;
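Step b) amounts to looking up each sequence's data attribute in the group membership information. The sketch below models the trained data group distribution model as a plain dictionary, which is an assumption made for illustration; the source names, group names, and sequence identifiers are hypothetical.

```python
from collections import defaultdict

# Hypothetical stand-in for the trained data group distribution model:
# data acquisition source -> data group.
GROUP_MODEL = {
    "temperature_sensor": "thermal",
    "humidity_sensor": "thermal",
    "vibration_sensor": "mechanical",
    "smart_terminal": "operation",
}

def assign_groups(sequences, model=GROUP_MODEL):
    """Group sequence ids by the data group their acquisition source maps to."""
    groups = defaultdict(list)
    for seq in sequences:
        groups[model[seq["source"]]].append(seq["id"])
    return dict(groups)

sequences = [
    {"id": "s1", "source": "temperature_sensor"},
    {"id": "s2", "source": "humidity_sensor"},
    {"id": "s3", "source": "vibration_sensor"},
    {"id": "s4", "source": "smart_terminal"},
]
print(assign_groups(sequences))
```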
c) for each data group, detecting abnormal data within the data group according to the industrial Internet data sequences in the data group, to obtain an intra-group abnormal data detection result, comprising:
normalizing each industrial Internet data sequence into a normalized sequence with a mean of 0 and a standard deviation of 1;
for the normalized sequences corresponding to the plurality of industrial Internet data sequences in the same data group, calculating the association parameter between each pair of industrial Internet data sequences, and constructing the first association feature matrix of the group from these association parameters:
wherein $Z_c$ denotes the first association feature matrix of the c-th data group, whose entry in row i and column j is the association parameter between the i-th and the j-th industrial Internet data sequences in the c-th data group, with i = 1, 2, …, D, j = 1, 2, …, D, and D the total number of industrial Internet data sequences in the data group; wherein each association parameter is determined from the following quantities:
the m-th data point of the normalized sequence corresponding to the i-th industrial Internet data sequence in the c-th data group, and the mean of that normalized sequence;
the m-th data point of the normalized sequence corresponding to the j-th industrial Internet data sequence in the c-th data group, and the mean of that normalized sequence;
n, the total number of data points in an industrial Internet data sequence;
the m-th data point of the low-frequency IMF component obtained from the i-th industrial Internet data sequence in the c-th data group, and the mean of that component;
the m-th data point of the low-frequency IMF component obtained from the j-th industrial Internet data sequence in the c-th data group, and the mean of that component;
$\omega_1$ and $\omega_2$, the preset association adjustment factors, where $\omega_1 + \omega_2 \in [1, 1.1]$ and $\omega_1 \ge \omega_2$;
comparing each association parameter of the first association feature matrix with a set standard threshold range, and, when an association parameter falls outside the set standard threshold range, marking that association parameter as abnormal and marking the data group corresponding to the first association feature matrix as containing abnormal data;
for a data group containing abnormal data, counting, for each industrial Internet data sequence, the number of its association parameters marked as abnormal, marking the industrial Internet data sequence with the largest count as abnormal data, and adding it to the abnormal data set;
removing the industrial Internet data sequence marked as abnormal data from the data group, and repeating the intra-group abnormal data detection on the remaining industrial Internet data sequences in the data group until the intra-group detection result is normal; then proceeding to the intra-group abnormal data detection of the next data group, until it has been completed for all data groups;
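The intra-group procedure of step c) can be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed formula: the association parameter is replaced by the plain Pearson correlation of the normalized sequences (the claim additionally weights in a low-frequency IMF term through the adjustment factors), sequences are assumed non-constant, and the threshold range `low`/`high` and all names are hypothetical.

```python
from statistics import mean, pstdev

def normalize(seq):
    """Rescale a non-constant sequence to mean 0 and standard deviation 1."""
    mu, sigma = mean(seq), pstdev(seq)
    return [(x - mu) / sigma for x in seq]

def association(a, b):
    """Stand-in association parameter: Pearson correlation of a and b."""
    za, zb = normalize(a), normalize(b)
    r = sum(x * y for x, y in zip(za, zb)) / len(za)
    return max(-1.0, min(1.0, r))  # clamp floating-point overshoot

def detect_in_group(group, low=0.5, high=1.0):
    """Repeatedly remove the sequence with the most out-of-range associations.

    `group` maps a sequence name to its values. Returns (kept, removed).
    """
    remaining = dict(group)
    removed = []
    while len(remaining) > 1:
        names = list(remaining)
        bad = {name: 0 for name in names}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                if not low <= association(remaining[a], remaining[b]) <= high:
                    bad[a] += 1
                    bad[b] += 1
        worst = max(names, key=bad.get)
        if bad[worst] == 0:
            break  # every association parameter is within range
        removed.append(worst)
        del remaining[worst]
    return list(remaining), removed

# Hypothetical group: "a" and "b" move together, "c" does not.
group = {
    "a": [1, 2, 3, 4, 5, 6],
    "b": [2, 4, 6, 8, 10, 13],
    "c": [5, 1, 6, 2, 5, 1],
}
kept, removed = detect_in_group(group)
print(kept, removed)
```

Removing one sequence per pass and then recomputing matters because a single faulty sequence makes every pair it takes part in look abnormal; once it is gone, the remaining pairs may all be within range.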
d) detecting abnormal data between the data groups according to the data groups, to obtain an inter-group abnormal data detection result, comprising:
for each data group, computing, for each industrial Internet data sequence in the group, the sum of its association parameters with the other industrial Internet data sequences in the group, and taking the industrial Internet data sequence with the largest sum as the feature sequence of the data group;
calculating the association parameter between the feature sequences of each pair of data groups, and constructing a second association feature matrix from these association parameters:
wherein the entry $v_{ab}$ of the second association feature matrix is the association parameter between the feature sequence of the a-th data group and the feature sequence of the b-th data group, with a = 1, 2, …, F, b = 1, 2, …, F, and F the total number of data groups; wherein $v_{ab}$ is determined from the following quantities:
$u(m)_a$, the m-th data point of the normalized sequence corresponding to the feature sequence of the a-th data group, and the mean of that normalized sequence;
$u(m)_b$, the m-th data point of the normalized sequence corresponding to the feature sequence of the b-th data group, and the mean of that normalized sequence;
n, the total number of data points in the normalized sequence corresponding to a feature sequence;
$\mathrm{Zero}(\mathrm{IMF}_{a\text{-}g})$ and $\mathrm{Zero}(\mathrm{IMF}_{b\text{-}g})$, the zero-crossing rates of the high-frequency IMF components obtained from the feature sequences of the a-th and the b-th data groups, respectively;
$\omega_3$ and $\omega_4$, the association adjustment factors, where $\omega_3 + \omega_4 = 1$ and $\omega_3 > 2\omega_4$;
comparing the association parameters between the feature sequences of the data groups with the corresponding association threshold ranges, and marking an association parameter as abnormal when it falls outside the corresponding association threshold range;
counting, for each feature sequence, the number of its association parameters marked as abnormal, marking the data group whose feature sequence has the largest count as a problem data group, marking all industrial Internet data sequences in the problem data group as abnormal data, and adding them to the abnormal data set;
removing the data group marked as a problem data group, and repeating the inter-group abnormal data detection on the remaining data groups until the inter-group detection result is normal;
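The inter-group procedure of step d) follows the same remove-and-repeat pattern, applied to one feature sequence per data group. The sketch below is an illustration under assumptions: Pearson correlation again stands in for the association parameter (the zero-crossing-rate term of the claim is not modelled), sequences are assumed non-constant, and the threshold range and all names are hypothetical.

```python
from statistics import mean, pstdev

def association(a, b):
    """Stand-in association parameter: Pearson correlation of a and b."""
    mu_a, mu_b = mean(a), mean(b)
    cov = sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / len(a)
    return max(-1.0, min(1.0, cov / (pstdev(a) * pstdev(b))))

def feature_sequence(group):
    """Name of the sequence whose summed association with the others is largest."""
    def total(name):
        return sum(association(group[name], group[other])
                   for other in group if other != name)
    return max(group, key=total)

def detect_between_groups(groups, low=0.5, high=1.0):
    """Repeatedly drop the group whose feature sequence has the most
    out-of-range associations. Returns (kept_groups, problem_groups)."""
    features = {g: seqs[feature_sequence(seqs)] for g, seqs in groups.items()}
    problems = []
    while len(features) > 1:
        names = list(features)
        bad = {name: 0 for name in names}
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                if not low <= association(features[a], features[b]) <= high:
                    bad[a] += 1
                    bad[b] += 1
        worst = max(names, key=bad.get)
        if bad[worst] == 0:
            break  # all feature sequences agree with each other
        problems.append(worst)
        del features[worst]
    return list(features), problems

# Hypothetical data groups; g3 behaves unlike the other two.
groups = {
    "g1": {"s1": [1, 2, 3, 4, 5, 6], "s2": [2, 4, 6, 8, 10, 13]},
    "g2": {"s3": [10, 20, 30, 41, 50, 60]},
    "g3": {"s4": [5, 1, 6, 2, 5, 1]},
}
kept_groups, problem_groups = detect_between_groups(groups)
abnormal = {name for g in problem_groups for name in groups[g]}
print(kept_groups, problem_groups, abnormal)
```

Comparing one representative per group keeps the number of pairwise comparisons proportional to the square of the number of groups rather than the number of sequences.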
e) obtaining the anomaly detection result for the industrial Internet data sequences according to the intra-group abnormal data detection result and the inter-group abnormal data detection result, comprising:
marking the anomaly detection result of every industrial Internet data sequence contained in the abnormal data set as abnormal, and marking the anomaly detection result of the remaining industrial Internet data sequences as normal;
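Step e) reduces to a set-membership test once the abnormal data set has been collected. A minimal sketch, with hypothetical sequence names and hypothetical results for the two detection stages:

```python
def final_labels(all_sequences, abnormal_set):
    """Label each sequence 'abnormal' if it is in the abnormal set, else 'normal'."""
    return {name: "abnormal" if name in abnormal_set else "normal"
            for name in all_sequences}

intra_group_abnormal = {"s3"}        # hypothetical result of step c)
inter_group_abnormal = {"s7", "s8"}  # hypothetical result of step d)
abnormal_set = intra_group_abnormal | inter_group_abnormal

labels = final_labels(["s1", "s2", "s3", "s7", "s8"], abnormal_set)
print(labels)
```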
The management module is used for carrying out data management on the industrial Internet data in the data warehouse;
The analysis module is used for extracting the corresponding industrial Internet data from the data warehouse based on the data analysis model and performing data analysis on it to obtain a data analysis result.
CN202210990142.1A 2022-08-18 2022-08-18 Industrial Internet data processing method and system based on big data Active CN116821104B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210990142.1A CN116821104B (en) 2022-08-18 2022-08-18 Industrial Internet data processing method and system based on big data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210990142.1A CN116821104B (en) 2022-08-18 2022-08-18 Industrial Internet data processing method and system based on big data

Publications (2)

Publication Number Publication Date
CN116821104A CN116821104A (en) 2023-09-29
CN116821104B true CN116821104B (en) 2024-07-16

Family

ID=88122677

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210990142.1A Active CN116821104B (en) 2022-08-18 2022-08-18 Industrial Internet data processing method and system based on big data

Country Status (1)

Country Link
CN (1) CN116821104B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112667704A (en) * 2020-12-22 2021-04-16 煤炭科学研究总院 Coal mine industry internet data middle platform system structure
CN114419507A (en) * 2022-01-18 2022-04-29 中国石油大学(华东) Internet factory operation diagnosis method and system based on federal learning

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102171592B1 (en) * 2014-01-02 2020-10-29 한국전자통신연구원 Device of preventing fault by detecting possibility of fault
CN107871187A (en) * 2016-09-22 2018-04-03 北京航天长峰科技工业集团有限公司 A kind of enterprise safety operation data managing method
CN108412710B (en) * 2018-01-30 2019-08-06 同济大学 A kind of Wind turbines wind power data cleaning method
CN109857732A (en) * 2019-01-31 2019-06-07 山东省电子信息产品检验院 A kind of industry internet platform monitoring data transmission switching method and system
CN110297847A (en) * 2019-07-03 2019-10-01 牡丹江师范学院 A kind of intelligent information retrieval method based on big data principle
CN110771940A (en) * 2019-11-29 2020-02-11 浙江工业大学 Intelligent tobacco leaf baking control system and method based on Internet of things and deep learning
US11838367B2 (en) * 2019-12-09 2023-12-05 Siemens Aktiengesellschaft Information acquiring method, apparatus, and system
FR3105845B1 (en) * 2019-12-31 2021-12-31 Bull Sas DATA PROCESSING METHOD AND SYSTEM FOR PREPARING A DATA SET
KR102165528B1 (en) * 2020-06-03 2020-10-14 (주)금호전력 Sensing and fusion data monitoring system for railway electronic equipments
CN114442563A (en) * 2020-11-04 2022-05-06 淮安苏信科技信息有限公司 Monitoring method and device for more accurately detecting industrial internet data
CN112699175B (en) * 2021-01-15 2024-02-13 广州汇智通信技术有限公司 Data management system and method thereof
CN113676525A (en) * 2021-08-09 2021-11-19 武汉卓尔信息科技有限公司 Network collaborative manufacturing-oriented industrial internet public service platform
CN113837539A (en) * 2021-08-19 2021-12-24 华能(浙江)能源开发有限公司玉环分公司 Coal fired boiler heating surface depth fault early warning system based on industrial internet
CN114221805A (en) * 2021-12-13 2022-03-22 恒安嘉新(北京)科技股份公司 Method, device, equipment and medium for monitoring industrial internet data
CN114490886A (en) * 2021-12-29 2022-05-13 北京航天智造科技发展有限公司 Industrial operation system data lake construction method based on data warehouse

Also Published As

Publication number Publication date
CN116821104A (en) 2023-09-29

Similar Documents

Publication Publication Date Title
CN110851338B (en) Abnormality detection method, electronic device, and storage medium
CN111475804B (en) Alarm prediction method and system
WO2021179572A1 (en) Operation and maintenance system anomaly index detection model optimization method and apparatus, and storage medium
US11836162B2 (en) Unsupervised method for classifying seasonal patterns
CN108415789B (en) Node fault prediction system and method for large-scale hybrid heterogeneous storage system
CN109034244B (en) Line loss abnormity diagnosis method and device based on electric quantity curve characteristic model
CN112188531B (en) Abnormality detection method, abnormality detection device, electronic apparatus, and computer storage medium
CN105071983B (en) Abnormal load detection method for cloud calculation on-line business
CN112445690B (en) Information acquisition method and device and electronic equipment
CN107180056A (en) The matching process and device of fragment in video
Cai et al. An efficient outlier detection approach on weighted data stream based on minimal rare pattern mining
CN117132000A (en) Crop growth condition prediction method, device and medium based on AI
CN111078512A (en) Alarm record generation method and device, alarm equipment and storage medium
CN116821104B (en) Industrial Internet data processing method and system based on big data
CN115664038A (en) Intelligent power distribution operation and maintenance monitoring system for electrical safety management
CN113127464A (en) Agricultural big data environment feature processing method and device and electronic equipment
CN118283082A (en) Big data acquisition method and system based on cloud computing
CN116738261A (en) Numerical characteristic discretization attribution analysis method and device based on clustering and binning
CN116662904A (en) Method, device, computer equipment and medium for detecting variation of data type
CN117290405A (en) Internet of things system for quickly inquiring large-scale equipment data
CN117912645A (en) Blood preservation whole-flow supervision method and system based on Internet of things
CN112685473B (en) Network abnormal flow detection method and system based on time sequence analysis technology
CN116662466A (en) Land full life cycle maintenance system through big data
CN110708296B (en) VPN account number collapse intelligent detection model based on long-time behavior analysis
CN118520365B (en) Data visualization analysis system for sheep breeding growth monitoring

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240624

Address after: Room 804, No. 56 Feicui Oasis Forest Peninsula Street, Xintang Town, Zengcheng District, Guangzhou City, Guangdong Province, 511300

Applicant after: Zhong Guobiao

Country or region after: China

Address before: 226000 Room 101, building 11, Xingfu new town, No. 66, Xinhua Road, Xingfu street, Chongchuan District, Nantong City, Jiangsu Province

Applicant before: Nantong zeshuo Information Technology Co.,Ltd.

Country or region before: China

GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240906

Address after: 510000 Room 1905, No. 437 Dongfeng Middle Road, Yuexiu District, Guangzhou City, Guangdong Province (cannot be used as a factory building)

Patentee after: Guangdong Hongdaxin Electronic Technology Co.,Ltd.

Country or region after: China

Address before: Room 804, No. 56 Feicui Oasis Forest Peninsula Street, Xintang Town, Zengcheng District, Guangzhou City, Guangdong Province, 511300

Patentee before: Zhong Guobiao

Country or region before: China