A fast parallel re-computation with redundancy mechanism for parallel digital terrain analysis

Wanfeng Dou¹,² & Shoushuai Miao¹

Abstract

Parallel computing is widely regarded in the literature as an efficient solution for digital terrain analysis (DTA) in geographic information systems. Stable and reliable services are essential in high-performance computing, especially when errors occur during large-scale scientific computation. This paper proposes a fault-tolerant approach for parallel DTA: fast parallel re-computation (FPR). FPR relies on redundancy mechanisms to provide faster self-recovery than other fault-tolerant methods. Once an error is detected at the application layer, the data block containing the computation error is further partitioned into several sub-blocks, which the surviving processes re-compute concurrently to speed up failure recovery. Error detection and re-computation are overlapped by decomposing each data block into several logical sub-blocks: as soon as a comparing thread detects an error in one logical sub-block, the re-computing process starts correcting it. Compared with the traditional re-computation method, this overlap reduces the combined time of error detection and re-computation. Experiments show that the proposed FPR method achieves better performance with less overhead.
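The recovery idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no code, so every name here (`analyze`, `split`, `fpr_block`, `n_sub`, `workers`) is hypothetical, the per-row elevation range stands in for a real DTA kernel, and a thread pool stands in for the surviving processes. The sketch assumes redundancy means two independently computed results for the same block. It compares them one logical sub-block at a time and, on a mismatch, submits that sub-block for concurrent re-computation while the comparison loop continues.

```python
from concurrent.futures import ThreadPoolExecutor


def analyze(rows):
    """Toy terrain kernel (an assumption): elevation range of each row."""
    return [max(r) - min(r) for r in rows]


def split(rows, parts):
    """Split a list of rows into at most `parts` contiguous chunks."""
    size = max(1, -(-len(rows) // parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def fpr_block(block, primary, replica, n_sub=4, workers=2):
    """Compare primary and redundant results per logical sub-block.

    A sub-block whose two results disagree is partitioned again and
    submitted to the worker pool immediately, so re-computation runs
    while the remaining sub-blocks are still being compared.
    Returns the repaired results and the indices of repaired sub-blocks.
    """
    size = max(1, -(-len(block) // n_sub))
    output = list(primary)
    pending = {}  # start row of a faulty sub-block -> its futures
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for start in range(0, len(block), size):
            end = start + size
            if primary[start:end] != replica[start:end]:
                pieces = split(block[start:end], workers)
                pending[start] = [pool.submit(analyze, p) for p in pieces]
        # Collect the re-computed values and patch them into the output.
        for start, futures in pending.items():
            fixed = [v for f in futures for v in f.result()]
            output[start:start + len(fixed)] = fixed
    return output, sorted(start // size for start in pending)


if __name__ == "__main__":
    # Synthetic 8 x 5 elevation grid; one result is corrupted on purpose.
    block = [[(3 * i + 7 * j) % 11 for j in range(5)] for i in range(8)]
    primary, replica = analyze(block), analyze(block)
    primary[5] += 1  # injected fault in row 5
    result, repaired = fpr_block(block, primary, replica)
    print(repaired, result == analyze(block))
```

Only the sub-block containing the corrupted row is re-computed; sub-blocks whose two results agree are accepted as they are. A disagreement alone does not reveal which copy is wrong, which is why the sketch re-computes the sub-block rather than trusting either result.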

Acknowledgments

This work has been substantially supported by the National Natural Science Foundation of China (No. 41171298). We also thank the reviewers for their pertinent comments, which helped improve the paper.

Author information

Authors and Affiliations

School of Computer Science and Technology, Nanjing Normal University, Nanjing, 210023, Jiangsu, China
Wanfeng Dou & Shoushuai Miao
Jiangsu Research Center of Information Security & Privacy Technology, Nanjing, 210023, Jiangsu, China
Wanfeng Dou

Corresponding author

Correspondence to Wanfeng Dou.

About this article

Cite this article

Dou, W., Miao, S. A fast parallel re-computation with redundancy mechanism for parallel digital terrain analysis. Cluster Comput 19, 1769–1785 (2016). https://doi.org/10.1007/s10586-016-0644-z

Received: 05 July 2015
Revised: 09 September 2016
Accepted: 10 September 2016
Published: 21 September 2016
Issue Date: December 2016
DOI: https://doi.org/10.1007/s10586-016-0644-z
