Abstract
This paper presents a new sorting algorithm that sorts input data elements without any comparison operations between the data—comparison-free sorting. Our algorithm’s time complexity is on the order of O(N) for both single- and multi-threaded CPU and many-core GPU implementations. Our results show speedups on average of 4.6 × , 4 × , and 3.5 × for single-threaded CPU, 8-threaded CPU, and many-threaded GPU implementations, respectively, for input sizes ranging from 27 to 230 elements as compared to common sorting algorithms for a wide variation of element distributions, ranging from all unique elements to a single repeated element. In addition, our proposed algorithm more efficiently utilizes the GPU architecture as compared to a multi-core CPU architecture, showing a speedup of approximately 4 × for input sizes ranging from 27 to 230 elements.
Similar content being viewed by others
References
Busse LM, Chehreghani MH, Buhmann JM (2012) The information content in sorting algorithms. In: IEEE International Symposium on Information Theory Proceedings (ISIT), Cambridge, MA, USA, pp 2746–2750
Canaan C, Garai MS, Daya M (2011) Popular sorting algorithms. World Appl Program 1(1):62–71
Knuth DE (2011) The art of computer programming. Addison-Wesley Professional, Boston
Henglein F (2009) What is a sorting function? J Logic Algebr Program 78(7):552–572
Ionescu MF, Schauser KE (1997) Optimize parallel bitonic sort. In: Proceedings 11th International Parallel Processing Symposium, Genva, pp 303–309
Fuguo D (2010) Several incomplete sort algorithms for getting the median value. Int J Digit Content Technol Appl 4(8):193–198
Cederman D, Tsigas P (2009) GPU-quicksort: a practical quicksort algorithm for graphics processors. ACM J Exp Algorithm (JEA) 14(4):1–22
Zhang R, Wei X, Watanabe T (2013) A sorting-based IO connection assignment for flip-chip designs. In: 2013 IEEE 10th International Conference on ASIC (ASICON), Shenzhen, pp 1–4
Han Y (2004) Deterministic sorting in O(n(log(log n))) time and linear space. J Algorithms Sci Direct Publ 50:602–608
Bentley JL, Sedgewick R (1997) Fast algorithms for sorting and searching strings. In: Proceedings of the Eighth annual ACM-SIAM Symposium on Discrete Algorithms (SODA ‘97), pp 360–369
Lin T-C, Kuo C-C, Hsieh Y-H, Wang B-F (2009) Efficient algorithms for the inverse sorting problem with bound constraints under the norm and the Hamming distance. J Comput Syst Sci 75:451–464
Abdel-hafeez S, Gordon-Ross A, Abubaker S (2016) A comparison-free sorting algorithm on CPUs. In: 13th International Conference on Applied Computing (AC), Mannheim, Germany, October 2016, pp 3–10
Capannini G, Silvestri F, Baraglia R (2012) Sorting on GPUs for large scale datasets: a through comparison. Int Process Manag 48:903–917
Abdel-hafeez S, Gordon-Ross A (2017) An efficient O(N) comparison-free sorting algorithm. J IEEE Trans Very Large Scale Integr (VLSI) Syst 2(5):1930–1942
Herruzo E, Ruiz G, Benavides J, Plata O, (2007) A new parallel sorting algorithm based on odd-even mergesort. In: 15th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, Naples, (PDP 07), pp 18–22
Inoue H, Moriyama T, Komatsu H, Nakatani T (2007) AA-SORT: a new parallel sorting algorithm for multi-core SIMD processors. In: 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), Brasov, pp 189–198
Satish N, Kim C, Chhugani J, Nguyen A, Lee V, Kim D, Dubey P (2010) Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD Sort. In: SIGMOD’10, Indiana, vol 27, number 3, pp 351–362
Kundeti V, Rajasekaran S (2011) Efficient out-of-core sorting algorithms for the parallel disks model. J Parallel Distrib Comput 71:1427–1433
Satish N, Harris M, Garland M (2009) Designing efficient sorting algorithms for manycore GPUs. In: 23rd IEEE International Symposium on Parallel and Distributed Processing, Rome, pp 1–10
Garg A, Goswami S, Garg V (2016) CutShort: a hybrid sorting technique. In: 2016 International Conference on Computing, Communication and Automation (ICCCA), Noida, pp 139–142
Yan J-T (1999) An improved optimal algorithm for bubble-sorting based non-Manhattan channel routing. IEEE Trans Comput Aided Des Integr Circuits Syst 18(2):163–171
Sareen P (2013) Comparison of sorting algorithms (on the basis of average case). IJARCSSE 3(3):522–532
Xiao L, Zhang X, Kubricht SA (2000) Improving memory performance of sorting algorithms. ACM J Exp Algorithm 5(3):1–20
Bunse C, Hopfner H, Roychoudhury S, Mansour E (2009) Choosing the BEST Sorting Algorithm from Optimal Energy Consumption. In: ICSOFT 2. INSTICC Press, pp 199–206
Mishra AD, Garg D (2008) Selection of best sorting algorithm. Int J Intell Inf Process 2:363–368
Kirk DB, Hwu W-MW (2013) Programming massively parallel processors: a hands-on approach, Ch 6. Morgan Kaufman, San Francisco, pp 140–144
Grama A, Gupta A, Karypis G, Kumar V (2003) Introduction to parallel computing. Pearson Education Limited, Edinburgh
Bryant RE, O’Hallaron DR (2003) Computer systems: a programmer’s perspective. Pearson Education Inc, Edinburgh
Sorting Algorithms Animations. https://www.toptal.com/developers/sorting-algorithms/. Accessed 16 Mar 2016
Sareen Pankaj (2013) Comparison of sorting algorithms (on the basis of average case). IJARCSSE 3(3):522–532
Thorup Mikkel (2002) Randomized sorting in O(n log log n) time and linear space using addition, shift, and bit-wise boolean operations. Elsevier J Algorithms 42(2):205–230
Herruzo E, Ruiz G, Benavides JI, Plata O (2007) A new parallel sorting algorithm based on odd-even mergesort. In: 15th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP 07), pp 18–22
Inoue H, Moriyama T, Komatsu H, Nakatani T (2007) AA-SORT: a new parallel sorting algorithm for multi-core SIMD processors. In: 16th International Conference on Parallel Architecture and Compilation Techniques (PACT 2007), Brasov, pp 189–198
Cederman D, Tsigas P (2009) GPU-quicksort: a practical quicksort algorithm for graphics processors. ACM J Exp Algorithm (JEA) 14(4):1–22
Satish N, Harris M, Garland M (2009) Designing efficient sorting algorithms for manycore GPUs. In: Proceedings of 23rd IEEE International Parallel and Distributed Processing Symposium, pp 1–10
Cederman D, Tsigas P (2008) On sorting and load balancing on GPUs. Distrib Comput Syst ACM SIGARCH Comput Archit News 36(5):11–18
Khan FG, Khan OU, Montrucchio B, Giaccone P (2011) Analysis of fast parallel sorting algorithms for GPU architectures. Front Inf Technol 2011:173–178
Farmahini-Farahani A, Duwe HJ, Schulte MJ, Compton K (2013) Modular design of high-throughput, low-latency sorting units. IEEE Trans Comput 62(7):1389–1402
Acknowledgements
The authors would like to acknowledge the support of National Science Foundation (CNS-0953447 and CNS-1718033) and Jordan University of Science and Technology for both providing the financial support to complete this work. The authors would like also to thank the computing center of Synchrotron Light for Experimental Science and Applications in the Middle East (SESAME) for providing the GPU platform resources for our simulations. Sincere thanks are given to Computing Engineers Mustafa A. Alzu’bi and Salman Matalgah for their continuous help in maintaining the GPU resources at the center and providing us with unrestricted access.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Abdel-hafeez, S., Gordon-Ross, A. & Abubaker, S. A comparison-free sorting algorithm on CPUs and GPUs. J Supercomput 74, 6369–6400 (2018). https://doi.org/10.1007/s11227-018-2567-3
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11227-018-2567-3