Abstract
Big data processing is an urgent and unresolved challenge that stems from the intensive development of information technology. Existing techniques rapidly lose their effectiveness as data volumes increase. In this article, we present our vision of the basic approaches and models for solving problems that involve processing large volumes of data. We introduce a two-stage decomposition of the problem of assessing management options. The first stage of our original approach is a semantic analysis of textual information; the second stage consists of finding association rules in a database, processing them with methods of mathematical statistics, and converting data and objectives to vector form. We propose processing the collected news events with a semantic model that describes their key features and the interconnections between them in a specified subject area. Classification-based association rules make it possible to assess the likelihood of a particular event given a specified chain of events. This approach can be applied to the analysis of online news in a specified market segment.
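To make the idea of assessing an event's likelihood from association rules concrete, the following is a minimal sketch, not the authors' implementation. It assumes that news events have already been extracted and grouped into time windows, that each window is represented as a set of event labels, and that the standard support and confidence measures are used. The event labels, the sample data, and the function names are hypothetical, and the sketch ignores the order of events within a chain.

```python
from typing import FrozenSet, Iterable, List


def support(windows: List[FrozenSet[str]], events: Iterable[str]) -> float:
    """Fraction of windows that contain every event in `events`."""
    wanted = frozenset(events)
    if not windows:
        return 0.0
    hits = sum(1 for window in windows if wanted <= window)
    return hits / len(windows)


def confidence(
    windows: List[FrozenSet[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent | antecedent) as support(A and C) / support(A)."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    support_a = support(windows, a)
    if support_a == 0:
        return 0.0  # the antecedent never occurs, so the rule says nothing
    return support(windows, both) / support_a


# Hypothetical event sets, one per time window of collected news.
news_windows: List[FrozenSet[str]] = [
    frozenset({"rate_hike", "currency_drop", "export_rise"}),
    frozenset({"rate_hike", "currency_drop"}),
    frozenset({"rate_hike", "export_rise"}),
    frozenset({"currency_drop", "export_rise"}),
    frozenset({"rate_hike", "currency_drop", "export_rise"}),
]

# Rule: {rate_hike, currency_drop} -> {export_rise}
rule_support = support(news_windows, {"rate_hike", "currency_drop", "export_rise"})
rule_confidence = confidence(
    news_windows, {"rate_hike", "currency_drop"}, {"export_rise"}
)
```

Here `rule_confidence` estimates how often the consequent event accompanies the antecedent events in the sample. A chain-aware variant would additionally require the antecedent events to occur in a given order.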
Availability of data and material
Data will be available on request.
Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
This article is part of the Topical Collection: Special Issue on Security of Mobile, Peer-to-peer and Pervasive Services in the Cloud
Guest Editors: B. B. Gupta, Dharma P. Agrawal, Nadia Nedjah, Gregorio Martinez Perez, and Deepak Gupta
About this article
Cite this article
Imanbayev, K., Sinchev, B., Sibanbayeva, S. et al. Analysis and mathematical modeling of big data processing. Peer-to-Peer Netw. Appl. 14, 2626–2634 (2021). https://doi.org/10.1007/s12083-020-00978-3
DOI: https://doi.org/10.1007/s12083-020-00978-3