2160 encoding can be achieved with negligible quality loss. 1616 prediction engine and 88 prediction engine work parallel for prediction and coefficients generating. A reordering interlaced reconstruction is also designed for fully pipelined architecture. It takes only 160 cycles to process one macroblock (MB). Hardware utilization of prediction and reconstruction modules is almost 100%. Furthermore, PE-reusable 88 intra predictor and hybrid SAD & SATD mode decision are proposed to save hardware cost. The design is implemented by 90 nm CMOS technology with 113.2 k gates and can encode 40962160 video sequences at 60 fps with operation frequency of 332 MHz." />
Nothing Special   »   [go: up one dir, main page]



A 530 Mpixels/s Intra Prediction Architecture for Ultra High Definition H.264/AVC Encoder

Gang HE
Dajiang ZHOU
Jinjia ZHOU
Tianruo ZHANG
Satoshi GOTO

Publication
IEICE TRANSACTIONS on Electronics   Vol.E94-C    No.4    pp.419-427
Publication Date: 2011/04/01
Online ISSN: 1745-1353
DOI: 10.1587/transele.E94.C.419
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Circuits and Design Techniques for Advanced Large Scale Integration)
Category: 
Keyword: 
H.264/AVC,  intra prediction,  data dependency,  hardware architecture,  

Full Text: PDF(2.1MB)>>
Buy this Article



Summary: 
Intra coding in H.264/AVC significantly enhances video compression efficiency. However, due to the high data dependency of intra prediction in H.264, both pipelining and parallel processing techniques are limited to be applied. Moreover, it is difficult to get high hardware utilization and throughput because of the long block/MB-level reconstruction loops. This paper proposes a high-performance intra prediction architecture that can support H.264/AVC high profile. The proposed MB/block co-reordering can avoid data dependency and improve pipeline utilization. Therefore, the timing constraint of real-time 40962160 encoding can be achieved with negligible quality loss. 1616 prediction engine and 88 prediction engine work parallel for prediction and coefficients generating. A reordering interlaced reconstruction is also designed for fully pipelined architecture. It takes only 160 cycles to process one macroblock (MB). Hardware utilization of prediction and reconstruction modules is almost 100%. Furthermore, PE-reusable 88 intra predictor and hybrid SAD & SATD mode decision are proposed to save hardware cost. The design is implemented by 90 nm CMOS technology with 113.2 k gates and can encode 40962160 video sequences at 60 fps with operation frequency of 332 MHz.