Research on automatic cleaning algorithm of multi-dimensional network redundant data based on big data

  • Special Issue
Evolutionary Intelligence

Abstract

To clean redundant network data in a big data setting, this paper designs a redundant data cleaning framework that follows the data processing flow preceding data analysis. A data cleaning method is built on the spatial correlation of redundant data, with separate cleaning algorithms for abnormal data and for missing data. For abnormal data, a probability-based design deletes values that deviate markedly from the normal data values. For missing data, a spatial model and algorithm based on spatial correlation fill in the missing values once the other steps of the method have cleaned the redundant data. The accuracy of the model is compared with that of common data prediction algorithms, and the accuracy of the algorithm is verified on the redundant data set.
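The abstract does not state which probability rule or spatial model the paper uses, so the following is only an illustrative sketch of the two cleaning steps under assumed techniques: a k-sigma rule for abnormal values and inverse-distance weighting for filling missing values. The function names `flag_abnormal` and `fill_missing` and the parameters `k` and `power` are hypothetical and do not come from the paper.

```python
import math
from statistics import mean, stdev


def flag_abnormal(readings, k=3.0):
    """Return a copy of readings in which values more than k sample standard
    deviations from the mean are replaced by None (treated as deleted).
    Assumption: a k-sigma rule stands in for the paper's probability design."""
    present = [r for r in readings if r is not None]
    if len(present) < 2:
        return list(readings)
    mu, sigma = mean(present), stdev(present)
    if sigma == 0:
        return list(readings)
    return [
        r if r is None or abs(r - mu) <= k * sigma else None
        for r in readings
    ]


def fill_missing(positions, readings, power=2.0):
    """Fill None entries with an inverse-distance-weighted average of the
    readings that are present, so nearer nodes contribute more.
    Assumption: this weighting stands in for the paper's spatial model."""
    known = [(p, r) for p, r in zip(positions, readings) if r is not None]
    if not known:
        return list(readings)
    filled = []
    for pos, value in zip(positions, readings):
        if value is not None:
            filled.append(value)
            continue
        # A node at the same position supplies its value directly.
        exact = [r for p, r in known if math.dist(pos, p) == 0]
        if exact:
            filled.append(mean(exact))
            continue
        weights = [1.0 / math.dist(pos, p) ** power for p, _ in known]
        total = sum(w * r for w, (_, r) in zip(weights, known))
        filled.append(total / sum(weights))
    return filled


if __name__ == "__main__":
    # Hypothetical node coordinates and readings, for illustration only.
    positions = [(float(i), 0.0) for i in range(8)]
    readings = [10.0, 10.2, 9.9, None, 10.0, 9.8, 10.0, 50.0]
    cleaned = fill_missing(positions, flag_abnormal(readings, k=2.0))
    print(cleaned)
```

Running the outlier step before the fill step keeps a deleted abnormal value from contaminating the neighbors' weighted average. With small samples a single outlier cannot reach a large z-score, so a threshold lower than 3 (or a median-based rule) may be needed in practice.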

Author information

Corresponding author

Correspondence to Jie Fang.

About this article

Cite this article

Fang, J. Research on automatic cleaning algorithm of multi-dimensional network redundant data based on big data. Evol. Intel. 15, 2609–2617 (2022). https://doi.org/10.1007/s12065-021-00620-y

  • DOI: https://doi.org/10.1007/s12065-021-00620-y
