Abstract
Big data applications can strengthen the competitive advantage of enterprises and organizations and improve people’s quality of life. However, owing to many factors, the failure rate of big data projects is higher than that of IT projects. To reduce the risk of failure, big data projects must overcome a series of challenges. Ambiguous requirements, poor data quality, and a lack of changeability and extensibility directly affect the results of big data analytics and can even lead to wrong decisions, inaccurate predictions, and improper planning, exposing big data projects to the risk of failure. To address this, this paper applies iterative and incremental development (IID) to data preprocessing and draws up an iterative and incremental data quality improvement (IIDQI) procedure. The IIDQI procedure iteratively detects and identifies data quality defects, incrementally strengthens big data quality, and controls the factors that contribute to failure risk. Iterative inspection activities can effectively enhance data quality, communication efficiency, and requirements precision, thereby reducing the risk of big data project failure.
Acknowledgments
This research was supported by Ministry of Science and Technology research project funds (Project No.: MOST 105-2221-E-158-002).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Lai, ST., Leu, FY. (2018). An Iterative and Incremental Data Preprocessing Procedure for Improving the Risk of Big Data Project. In: Barolli, L., Enokido, T. (eds) Innovative Mobile and Internet Services in Ubiquitous Computing. IMIS 2017. Advances in Intelligent Systems and Computing, vol 612. Springer, Cham. https://doi.org/10.1007/978-3-319-61542-4_46
DOI: https://doi.org/10.1007/978-3-319-61542-4_46
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-61541-7
Online ISBN: 978-3-319-61542-4
eBook Packages: Engineering, Engineering (R0)