research-article

Open access

Versatility of extended subwords and the matrix register file

Authors:

Asadollah Shahbahrami,

Ben Juurlink,

Stamatis Vassiliadis

ACM Transactions on Architecture and Code Optimization (TACO), Volume 5, Issue 1

Article No.: 5, Pages 1 - 30

https://doi.org/10.1145/1369396.1369401

Published: 29 May 2008

Abstract

Extended subwords and the matrix register file (MRF) are two microarchitectural techniques that address some of the limitations of existing SIMD architectures. Extended subwords are wider than the data stored in memory: for every byte of data stored in memory, there are four extra bits in the media register file. This avoids the need for data-type conversion instructions. The MRF is a register file organization that provides both conventional row-wise and column-wise access to the register file. In other words, it allows the register file to be viewed as a matrix in which corresponding subwords in different registers form a column of the matrix. It was introduced to accelerate matrix transposition, which is a very common operation in multimedia applications. In this paper, we show that the MRF is very versatile, since it can also be used for permutations other than matrix transposition. Specifically, it is shown how it can be used to provide efficient access to strided data, as is needed in, e.g., color space conversion. Furthermore, it is shown that special-purpose instructions (SPIs), such as the sum-of-absolute-differences (SAD) instruction, have limited usefulness when extended subwords and a few general SIMD instructions that we propose are supported, for the following reasons. First, when extended subwords are supported, the SAD instruction provides only a relatively small performance improvement. Second, the SAD instruction processes 8-bit subwords only, which is sufficient neither for quarter-pixel resolution nor for cost functions used in image and video retrieval. Results obtained by extending the SimpleScalar toolset show that the proposed techniques provide a speedup of up to 3.00 over the MMX architecture. The results also show that using, at most, 13 extra media registers yields an additional performance improvement ranging from 1.38 to 1.57.
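
The row-wise and column-wise views of the MRF can be illustrated with a small software model. The following is a minimal sketch and not the hardware design or instruction set from the paper: the class `MatrixRegisterFile`, the helper functions, and the sizes (8 registers of 8 subwords each) are all assumptions chosen for illustration. It shows how writing data column-wise and reading it back row-wise yields a transposed block, and how the same mechanism could gather stride-3 data such as interleaved RGB pixels.

```python
# Toy software model of a matrix register file (MRF).
# Assumption: 8 registers with 8 subwords each; all names are hypothetical.

NUM_REGS = 8   # rows of the matrix (one per register)
SUBWORDS = 8   # columns of the matrix (subwords per register)


class MatrixRegisterFile:
    def __init__(self):
        self.cells = [[0] * SUBWORDS for _ in range(NUM_REGS)]

    def read_row(self, reg):
        """Conventional row-wise access: all subwords of one register."""
        return list(self.cells[reg])

    def write_column(self, col, values):
        """Column-wise access: subword `col` of registers 0, 1, 2, ..."""
        if len(values) > NUM_REGS:
            raise ValueError("a column holds at most NUM_REGS subwords")
        for reg, value in enumerate(values):
            self.cells[reg][col] = value


def transpose_via_mrf(block):
    """Transpose an 8x8 block: write each input row as a column, read rows."""
    mrf = MatrixRegisterFile()
    for col, row in enumerate(block):
        mrf.write_column(col, row)
    return [mrf.read_row(reg) for reg in range(NUM_REGS)]


def deinterleave_rgb(pixels):
    """Split 8 interleaved pixels (r0, g0, b0, r1, ...) into R, G, B rows."""
    mrf = MatrixRegisterFile()
    for p in range(SUBWORDS):
        # Each pixel's three components go into one column (registers 0..2).
        mrf.write_column(p, pixels[3 * p:3 * p + 3])
    return mrf.read_row(0), mrf.read_row(1), mrf.read_row(2)
```

In this model, `transpose_via_mrf` returns a block `t` with `t[c][r] == block[r][c]`, and `deinterleave_rgb` leaves the red, green, and blue components of eight pixels in three separate registers, which is the layout a SIMD color space conversion kernel would typically want.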

References

[1]

Austin, T., Larson, E., and Ernst, D. 2002. SimpleScalar: An infrastructure for computer system modeling. IEEE Comput. 35, 2, 59--67.
