Abstract
This paper describes an application-specific instruction set for a configurable processor that accelerates motion-compensated frame rate conversion (MC-FRC) algorithms based on block motion estimation (BME). The new instructions are used to implement two BME algorithms: the full search (FS) block matching algorithm (BMA) and the One-Dimensional Full Search (ODFS) BMA. The paper shows that the key to achieving very high performance when creating new instructions is to exploit parallel computation, data reuse, and efficient cache use at the same time, and it supports this claim with concrete examples for both algorithms. The resulting acceleration factors exceed one hundred for the MC-FRC algorithm embedding the FS algorithm and twenty for the ODFS algorithm. The results show that this overall acceleration comes from combining parallel computation, data reuse, and efficient cache use, not from any one of them alone.
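For readers unfamiliar with block matching, the two search strategies named in the abstract can be sketched in plain software as follows. This is an illustrative sketch only, not the paper's instruction set or implementation. It assumes the sum of absolute differences (SAD) as the matching cost, which the abstract does not state, and it uses a simplified two-pass reading of ODFS (a horizontal pass followed by a vertical pass). The function names, the block size `n`, and the search range `r` are hypothetical.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """SAD between the n x n block of `cur` at (bx, by) and the block of
    `ref` displaced by (dx, dy). SAD is an assumed cost, not from the paper."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def in_bounds(ref, bx, by, dx, dy, n):
    """True if the displaced n x n block lies entirely inside `ref`."""
    h, w = len(ref), len(ref[0])
    return 0 <= bx + dx and bx + dx + n <= w and 0 <= by + dy and by + dy + n <= h


def full_search(cur, ref, bx, by, n, r):
    """Full search: evaluate every displacement in the (2r+1) x (2r+1) window."""
    best_cost, best_dx, best_dy = sad(cur, ref, bx, by, 0, 0, n), 0, 0
    for dy in range(-r, r + 1):
        for dx in range(-r, r + 1):
            if in_bounds(ref, bx, by, dx, dy, n):
                cost = sad(cur, ref, bx, by, dx, dy, n)
                if cost < best_cost:
                    best_cost, best_dx, best_dy = cost, dx, dy
    return best_dx, best_dy


def odfs(cur, ref, bx, by, n, r):
    """Simplified one-dimensional full search (assumed two-pass form):
    search all horizontal displacements first, then all vertical ones."""
    best_dx = min(
        (dx for dx in range(-r, r + 1) if in_bounds(ref, bx, by, dx, 0, n)),
        key=lambda dx: sad(cur, ref, bx, by, dx, 0, n),
    )
    best_dy = min(
        (dy for dy in range(-r, r + 1) if in_bounds(ref, bx, by, best_dx, dy, n)),
        key=lambda dy: sad(cur, ref, bx, by, best_dx, dy, n),
    )
    return best_dx, best_dy
```

In this sketch the full search evaluates up to (2r+1)^2 candidate blocks and the two-pass search at most 2(2r+1), so the latter can miss the best match. Neighbouring candidates overlap heavily, and the per-pixel differences are independent of one another, which is the kind of structure that data reuse and parallel computation can exploit.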
Acknowledgements
This work was financially supported by Gennum Corp., the Natural Sciences and Engineering Research Council of Canada and the CFI/SOCRN. The research was done using design tools from Tensilica and Synopsys as distributed by CMC Microsystems.
Cite this article
Beucher, N., Bélanger, N., Savaria, Y. et al. High Acceleration for Video Processing Applications Using Specialized Instruction Set Based on Parallelism and Data Reuse. J Sign Process Syst Sign Image Video Technol 56, 155–165 (2009). https://doi.org/10.1007/s11265-008-0230-6