Abstract
The alternate direction method of multipliers (ADMM) algorithm has recently been proposed for LDPC decoding based on linear programming (LP) techniques. Even though it improves the error rate performance compared with usual message passing (MP) techniques, it shows a higher computation complexity. However, a significant step towards LP LDPC decoding scalability and optimization is made possible since the ADMM algorithm acts as an MP decoding one. In this paper, an overview of the ADMM approach and its error correction performances is provided. Then, its computation and memory complexities are evaluated. Finally, optimized software implementations of the decoder to take advantage of multi/many-core device features are described. Optimization choices are discussed and justified according to execution profiling figures and the algorithm’s parallelism levels. Experimentation results show that this LP based decoding technique can reach WiMAX and WRAN standards real time throughput requirements on mid-range devices.
Similar content being viewed by others
Notes
Similar description of the flooding based LDPC decoding for Min-Sum algorithm can be found for instance in [3].
In a program with parallel processing, a relatively few instructions that have to be performed in sequence will have a limiting factor on program speedup such that adding more processors may not make the program run faster.
Source codes are available on http://github.com/
References
Gallager, R.G. (1962). Low-density parity-check codes. IRE Transactions on Information Theory, 8(1), 21–28.
MacKay, D.J., & Neal, R.M. (1996). Near shannon limit performance of low density parity check codes. Electronics Letters, 32(18), 1645–1646.
Le Gal, B., & Jégo, C. (2015). High-throughput multi-core LDPC decoders based on x86 processor. In IEEE transactions on parallel and distributed systems.
Andrade, J., Falcao, G., & Silva, V. (2014). Optimized fast Walsh-Hadamard transform on GPUs for non-binary LDPC decoding. Parallel Computing, Elsevier.
Wu, M., Sun, Y., Wang, G., & Cavallaro, J.R. (2011). Implementation of a high throughput 3GPP turbo decoder on GPU. Journal of Signal Processing Systems, Springer.
Falcão, G., Silva, V., Sousa, L., & Marinho, J. (2008). High coded data rate and multicodeword wiMAX LDPC decoding on the cell/BE. Electronics Letters, 44(24), 1415–1417.
Zhao, J., Zhao, M., Yang, H., Chen, J., Chen, X., & Wang, J. (2011). High performance LDPC decoder on CELL BE for wiMAX system. In Proceedings of the CMC Conference (pp. 278–281).
Gronroos, S., Nybom, K., & Bjorkqvist, J. (2012). Efficient GPU and CPU-based LDPC decoders for long codewords. Journal of Analog Integrated Circuits and Signal Processing, 73, 583–595.
Pan, X., Fan Lu, X., Qi Li, M., & Fang Song, R. (2013). A high throughput LDPC decoder in CMMB based on virtual radio. In Proceedings of the WCNC workshop (pp. 95–99).
Han, X., Niu, K., & He, Z. (2013). Implementation of IEEE 802.11n LDPC codes based on general purpose processors. In Proceedings of the ICCT conference (pp. 218–222).
Le Gal, B., & Jégo, C. (2015). High-Throughput LDPC decoder on Low-Power embedded processors. IEEE Communications Letters, 19, 1861–1864.
Chang, Y.-L. (2010). High-throughput GPU-based LDPC. In Proceedings of the SPIE satellite data compression, communications and processing conference (Vol. 7810).
Falcao, G., Andrade, J., Silva, V., & Sousa, L. (2011). GPU-based DVB-s2 LDPC decoder with high throughput and fast error floor detection. Electronics Letters, 47(9), 542–543.
Falcao, G, Sousa, L., & Silva, V. (2011). Massively LDPC decoding on multicore architectures. IEEE Transactions on Parallel and Distributed Systems, 22(2), 309–322.
Wang, G., Wu, M., Sun, Y., & Cavallaro, J.R. (2011). GPU Accelerated scalable parallel decoding of LDPC codes. In Proceedings of the IEEE asilomar conference on signals, systems, and computers (pp. 2053–2057).
Ji, H., Cho, J., & Sung, W. (2011). Memory access optimized implementation of cyclic and quasi-cyclic LDPC codes on a GPGPU. Journal of Signal Processing Systems, 64, 149–159.
Kang, S., & Moon, J. (2012). Parallel LDPC decoder implementation on GPU based on unbalanced memory coalescing. In Proceedings of the IEEE conference on communications (pp. 3692–3697).
Wang, G., Wu, M., Yin, B., & Cavallaro, J. R. (2013). High throughput low latency LDPC decoding on GPU for SDR systems. In Proceedings of the IEEE GlobalSIP conference (pp. 1258–1261).
Lin, Y., & Niu, W. (2014). High throughput LDPC decoder on GPU. IEEE Communications Letters, 18(2), 344–347.
Le Gal, B., Jégo, C., & Crenne, J. (2014). A high throughput efficient approach for decoding LDPC codes onto GPU devices. IEEE Embedded Systems Letters, 6(2), 29–32.
Feldman, J. (2003). Decoding error-correcting codes via linear programming. Phd thesis, Massachussets Institute of Technology.
Zhang, X., & Siegel, P.H. (2013). Efficient iterative LP decoding of ldpc codes with alternating direction method of multipliers. In IEEE international symposium on information theory (ISIT).
Feldman, J., Wainwright, M., & Karger, D. (2005). Using linear programming to decode binary linear codes. IEEE Transactions on Information Theory, 51, 954–972.
Liu, X., Draper, S., & Recht, B. (2012). Suppressing pseudocodewords by penalizing the objective of LP decoding. In Proceedings of the IEEE information theory workshop (ITW) (pp. 367–371).
Zhang, X., & Siegel, P. (2012). Adaptive cut generation algorithm for improved linear programming decoding of binary linear codes. IEEE Transactions on Information Theory, 58, 6581–6594.
Boyd, S.P., Parikh, N., Chu, E., Peleato, B., & Eckstein, J. (2011). Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1), 1–122.
Barman, S., Liu, X., Draper, S.C., & Recht, B. (2013). Decomposition methods for large scale LP decoding. IEEE Transactions on Information Theory, 59(12), 7870–7886.
Zhang, X. (2012). LDPC codes: structural analysis and decoding techniques. PhD thesis, San Diego: University of California.
C++ ADMM Decoder by X. Liu. sites. http://google.com/site/xishuoliu/codes.
Debbabi, I., Le Gal, B., Khouja, N., Tlili, F., & Jégo, C. (2016). Multicore implementation of LDPC decoders based on ADMM algorithm. In Proceedings of the 41st IEEE international conference on acoustics speech and signal processing.
Wei, H., Jiao, X., & Mu, J. (2015). Reduced-complexity linear programming decoding based on ADMM for LDPC codes. IEEE Communications Letters, 19, 909–912.
Falcao, G., Yamagiwa, S., Silva, V., & Sousa, L. (2009). Parallel LDPC decoding on GPUs using a stream-based computing approach. Journal of Computer Science And Technology, 24, 913– 924.
Wang, G., Wu, M., Sun, Y., & Cavallaro, J.R. (2011). A massively parallel implementation of QC-LDPC decoder on GPU. In Proceedings of the symposium on application specific processors (pp. 82–85).
Jiao, X., Wei, H., Mu, J., & Chen, C. (2015). Improved ADMM penalized decoder for irregular low-density parity-check codes. IEEE Communications Letters, 19, 913–916.
Liu, X., & Draper, S.C. (2014). The ADMM penalized decoder for LDPC codes, CoRR, arXiv:1409.5140.
Barman, S., Liu, X., Draper, S.C., & Recht, B. (2012). Decomposition methods for large scale LP decoding, CoRR, arXiv:1204.0556.
Amdahl, G.M. (2007). Computer architecture and amdahl’s law. IEEE Solid-State Circuits Newsletter, 3(12), 4–9.
Zekri, A.S. (2014). Enhancing the matrix transpose operation using intel AVX instruction set extension. International Journal of Computer Science & Information Technology (IJCSIT), 6, 68–78.
Le Gal, B., & Jégo, C. (2016). High-throughput multi-core LDPC decoders based on ×86 processor. IEEE Transactions on Parallel and Distributed Systems, 27, 1373–1386.
Giard, P., Sarkis, G., Leroux, C., Thibeault, C., & Gross, W.J. (2016). Low-latency software polar decoders. Journal of Signal Processing Systems, Springer.
Gal, B.L., Leroux, C., & Jégo, C. (2014). Multi-gb/s software decoding of polar codes. In IEEE transactions on signal processing (pp. 349–359).
Cassagne, A., Tonnellier, T., Leroux, C., Gal, B.L., Aumage, O., & Barthou, D. (2016). Beyond Gbps turbo decoder on multi-core CPUs. In Proceedings of the 9th international symposium on turbo codes & iterative information processing (ISTC’16). France.
Chapman, B., Jost, G., & Van Der Pas, R. (2008). Using OpenMP: portable shared memory parallel programming. The MIT Press.
Li, R. (2013). A multi-standard efficient column-layered LDPC decoder for software defined radio on GPUs. In Proceedings of the SPAWC workshop (pp. 724–728).
Wasson, M., & Draper, S.C. (2015). Hardware based projection onto the parity polytope and probability simplex. In 49th asilomar conference on signals systems and computers.
Author information
Authors and Affiliations
Corresponding author
Rights and permissions
About this article
Cite this article
Debbabi, I., Le Gal, B., Khouja, N. et al. Multicore and Manycore Implementations of ADMM-based Decoders for LDPC Decoding. J Sign Process Syst 90, 1551–1567 (2018). https://doi.org/10.1007/s11265-017-1284-0
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11265-017-1284-0