Abstract
Given the complexity of ocean data, this paper adopts a parallel association rule mining algorithm to explore the correlations and regularities among oxygen, temperature, phosphate, nitrate and silicate in the ocean. After interpolating the marine data, the paper applies the parallel FP-growth algorithm to mine it and then briefly analyzes the resulting frequent itemsets and association rules. The relationship between parallel efficiency and the number of CPU cores is analyzed using datasets of different scales. The experimental results indicate that the acceleration effect is best when each thread handles 200,000–300,000 data records, which yields a performance improvement of more than 1.2 times.
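The quantities the abstract refers to (frequent itemsets, association rules, and splitting the data across workers) can be illustrated with a minimal sketch. This is not the authors' implementation and it does not build an FP-tree: it uses brute-force counting of itemsets up to size two, which is only practical for tiny inputs, to show how support and confidence are computed and how per-chunk counts can be merged. The discretized item labels (such as `temp=low`), the thresholds, and all function names are assumptions made for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def count_chunk(chunk, max_size=2):
    """Count every itemset of up to max_size items in one chunk of transactions."""
    counts = Counter()
    for transaction in chunk:
        items = sorted(set(transaction))  # canonical order so keys match across chunks
        for size in range(1, max_size + 1):
            counts.update(combinations(items, size))
    return counts


def parallel_support(transactions, n_workers=2, max_size=2):
    """Split transactions into n_workers chunks, count each chunk, merge the counts."""
    chunks = [transactions[i::n_workers] for i in range(n_workers)]
    total = Counter()
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(lambda c: count_chunk(c, max_size), chunks):
            total.update(partial)  # Counter.update adds counts together
    return total


def rules(counts, n_transactions, min_support, min_confidence):
    """Derive single-antecedent rules a -> b from the frequent pairs."""
    found = []
    for itemset, count in counts.items():
        support = count / n_transactions
        if len(itemset) != 2 or support < min_support:
            continue
        for a, b in (itemset, itemset[::-1]):
            confidence = count / counts[(a,)]  # P(b | a)
            if confidence >= min_confidence:
                found.append((a, b, support, confidence))
    return sorted(found)


# Hypothetical discretized records; real World Ocean Atlas values would
# first need interpolation and binning, which is not shown here.
transactions = [
    ["temp=low", "oxygen=high", "nitrate=high"],
    ["temp=low", "oxygen=high", "nitrate=high"],
    ["temp=high", "oxygen=low", "nitrate=low"],
    ["temp=low", "oxygen=high", "nitrate=low"],
]

counts = parallel_support(transactions, n_workers=2)
mined = rules(counts, len(transactions), min_support=0.5, min_confidence=0.9)
for a, b, support, confidence in mined:
    print(f"{a} -> {b}  support={support:.2f}  confidence={confidence:.2f}")
```

Threads are used here only to keep the sketch short and portable. In CPython they do not speed up CPU-bound counting, so a process pool or a compiled implementation would be needed to observe the kind of multi-core speedup the abstract reports.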
Acknowledgements
This work was supported in part by the National Natural Science Foundation of China (51679105, 61672261, 51409117) and by the Jilin Province Department of Education 13th Five-Year Plan science and technology research projects ([2016] No. 432, [2017] No. JJKH20170804KJ).
Cite this article
Jiang, Y., Zhao, M., Hu, C. et al. A parallel FP-growth algorithm on World Ocean Atlas data with multi-core CPU. J Supercomput 75, 732–745 (2019). https://doi.org/10.1007/s11227-018-2297-6
DOI: https://doi.org/10.1007/s11227-018-2297-6