CN106254870B - Video encoding method, system and computer-readable recording medium using adaptive color conversion - Google Patents

Info

Publication number
CN106254870B
Authority
CN
China
Prior art keywords
coding
size
mode
unit
color space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610357374.8A
Other languages
Chinese (zh)
Other versions
CN106254870A (en)
Inventor
张耀仁
林俊隆
涂日升
林敬杰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial Technology Research Institute ITRI
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US14/757,556 (US20160360205A1)
Priority claimed from TW105114323A (TWI597977B)
Application filed by Industrial Technology Research Institute ITRI
Publication of CN106254870A
Application granted
Publication of CN106254870B
Legal status: Active (current)
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/12 Selection from among a plurality of transforms or standards, e.g. selection between discrete cosine transform [DCT] and sub-band transform or selection between H.263 and H.264
    • H04N19/122 Selection of transform size, e.g. 8x8 or 2x4x8 DCT; Selection of sub-band transforms of varying structure or type
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N9/00 Details of colour television systems
    • H04N9/64 Circuits for processing colour signals

Landscapes

  • Engineering & Computer Science
  • Multimedia
  • Signal Processing
  • Physics & Mathematics
  • Discrete Mathematics
  • General Physics & Mathematics
  • Compression or Coding Systems of TV Signals

Abstract

A video encoding method and system are provided. The method includes the following steps. A source video frame is received. The source video frame is divided into a coding tree unit. A coding unit is determined from the coding tree unit. A coding mode of the coding unit is enabled or disabled. If the coding mode is enabled, whether to evaluate the size of a transform unit in the enabled coding mode is determined. The transform unit of the coding unit is determined in the enabled coding mode. The size of the coding unit is NxN.

Figure 201610357374

Description

Video encoding method, system and computer-readable recording medium using adaptive color conversion

Technical Field

The present disclosure relates to video encoding and decoding methods and systems.

Background

Demand for high-quality images keeps growing. With the arrival of video formats such as 4K and 8K, improving the efficiency of video encoding and decoding has become essential. Consumers also expect to transmit and receive high-quality images over a variety of transmission media. For example, they want to view high-quality images over a network on portable devices (such as smartphones, tablet computers, and notebook computers) as well as on home televisions and computers. They also expect high-quality images to be displayed during video conferencing and screen sharing.

The High Efficiency Video Coding (HEVC) standard, H.265, sets a new standard for the encoding and decoding performance of video compression. HEVC was established by ISO/IEC JTC 1/SC 29/WG 11 MPEG (Moving Picture Experts Group) and ITU-T SG16 VCEG (Video Coding Experts Group). Compared with the earlier Advanced Video Coding (AVC) standard, also known as H.264, HEVC lowers the data rate needed to compress high-quality video.

HEVC compresses video with a variety of coding tools, including inter prediction and intra prediction. Inter prediction exploits temporal redundancies between different video frames of a video stream to compress video data. For example, previously encoded and decoded video frames containing similar content can be used to encode the current video frame; these frames can be used to predict a region of the current frame being coded. In contrast, intra prediction uses only data within the video frame currently being coded and does not exploit temporal redundancy between frames. For example, the current video frame is encoded using another part of the same frame. Intra prediction provides 35 intra modes: a Planar mode, a DC mode, and 33 directional modes.
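The 35 intra modes mentioned above can be enumerated in a few lines. This is a minimal sketch, assuming the usual HEVC numbering (0 for Planar, 1 for DC, 2 to 34 for the directional modes); the function name `intra_mode_name` is hypothetical and only serves the illustration.

```python
def intra_mode_name(mode: int) -> str:
    """Return a readable name for an intra mode index in the range 0..34.

    Assumed numbering: 0 = Planar, 1 = DC, 2..34 = directional (angular).
    """
    if not 0 <= mode <= 34:
        raise ValueError("intra mode index must be in 0..34")
    if mode == 0:
        return "Planar"
    if mode == 1:
        return "DC"
    return f"Angular {mode}"


names = [intra_mode_name(m) for m in range(35)]
directional = [n for n in names if n.startswith("Angular")]
print(len(names), len(directional))
```

The two counts printed at the end match the 35 modes and 33 directional modes stated in the text.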

Compared with the AVC standard, the HEVC standard applies expansive partitioning and dividing to each input video frame. The AVC standard relies only on macroblocks of the input video frame for encoding and decoding. In contrast, the HEVC standard can divide the input video frame into data units and blocks of different sizes, as described below. This gives HEVC more flexibility than AVC when encoding and decoding video frames with motion, fine detail, and many edges.

The HEVC standard also includes coding tools that improve the video coding process. These tools are called coding extensions. The Screen Content Coding (SCC) extension focuses on improving the processing of video screen content under the HEVC standard. Screen content is video rendered from graphics, text, or animation rather than a scene captured by a camera. The rendered graphics, text, or animation may be moving or static, and may also be provided within video of a camera-captured scene. Applications of SCC include screen mirroring, cloud gaming, wireless display of content, displays generated during remote computer desktop access, and screen sharing (for example, real-time screen sharing during video conferencing).

One coding tool within SCC is the adaptive color transform (ACT). ACT is a color space transform applied to the residue pixel samples of a coding unit (CU). In certain color spaces, the color components of a pixel in a CU are correlated. When that correlation is high, applying ACT to the pixel decorrelates the components and concentrates their energy. This energy compaction improves coding efficiency and lowers coding cost, so ACT can improve coding performance during HEVC coding.
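The decorrelation effect can be illustrated with a small sketch. The text above does not name the transform that ACT uses, so the example below assumes a reversible YCgCo-R lifting transform as a stand-in; the function names are hypothetical.

```python
def forward_ycgco_r(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Reversible RGB -> (Y, Cg, Co) lifting transform (assumed stand-in for ACT)."""
    co = r - b
    t = b + (co >> 1)   # >> on Python ints is a floor shift, also for negatives
    cg = g - t
    y = t + (cg >> 1)
    return y, cg, co


def inverse_ycgco_r(y: int, cg: int, co: int) -> tuple[int, int, int]:
    """Exact inverse of forward_ycgco_r: undo the lifting steps in reverse order."""
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return r, g, b


# A residual whose three components are fully correlated (all equal).
residual = (10, 10, 10)
transformed = forward_ycgco_r(*residual)
print(transformed)
print(inverse_ycgco_r(*transformed))
```

For the fully correlated residual in the example, all of the energy ends up in the first component and the two chroma-like components become zero, which is the concentration of energy the paragraph describes.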

However, deciding whether to enable ACT requires additional rate distortion optimization (RDO) during encoding. RDO evaluates the rate distortion (RD) cost. These evaluations can increase coding complexity and coding time. Moreover, ACT may be unnecessary when the color components of the pixels are already decorrelated. In that case, further decorrelation brings no benefit, because the cost of performing ACT outweighs the coding gain.
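As a rough illustration of what such an RD evaluation involves, the sketch below assumes the common Lagrangian form J = D + λ·R for the RD cost. The text does not give the cost function, and the distortion, rate, and λ values are made-up numbers chosen only for the example.

```python
def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Lagrangian rate-distortion cost J = D + lambda * R (assumed form)."""
    return distortion + lam * rate_bits


def act_is_worthwhile(cost_without_act: float, cost_with_act: float) -> bool:
    """Enable ACT only when it strictly lowers the RD cost."""
    return cost_with_act < cost_without_act


lam = 0.5                                  # illustrative Lagrange multiplier
j_off = rd_cost(120.0, 400.0, lam)         # hypothetical CU coded without ACT
j_on = rd_cost(125.0, 300.0, lam)          # same CU coded with ACT
print(j_off, j_on, act_is_worthwhile(j_off, j_on))
```

Each candidate requires its own distortion and rate measurement, so evaluating ACT on and off roughly doubles the number of cost evaluations for a CU, which is the complexity concern raised above.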

Summary of the Invention

According to one aspect of the present disclosure, a video encoding method is provided. The video encoding method includes the following steps. A source video frame is received. The source video frame is divided into a coding tree unit. A coding unit is determined from the coding tree unit. A coding mode of the coding unit is enabled or disabled. If the coding mode is enabled, whether to evaluate the size of a transform unit in the enabled coding mode is determined. The transform unit of the coding unit is determined in the enabled coding mode. The size of the coding unit is NxN.

According to another aspect of the present disclosure, a video encoding system is provided. The video encoding system includes a memory and a processor. The memory stores a set of instructions, and the processor executes the set of instructions. The set of instructions includes the following steps. A source video frame is received. The source video frame is divided into a coding tree unit. A coding unit is determined from the coding tree unit. A coding mode of the coding unit is enabled or disabled. If the coding mode is enabled, whether to evaluate the size of a transform unit in the enabled coding mode is determined. The transform unit of the coding unit is determined in the enabled coding mode. The size of the coding unit is NxN.

According to another aspect of the present disclosure, a non-transitory computer-readable recording medium is provided. The non-transitory computer-readable recording medium stores a set of instructions. The set of instructions is executed by one or more processors to perform a video encoding method. The video encoding method includes the following steps. A source video frame is received. The source video frame is divided into a coding tree unit. A coding unit is determined from the coding tree unit. A coding mode of the coding unit is enabled or disabled. If the coding mode is enabled, whether to evaluate the size of a transform unit in the enabled coding mode is determined. The transform unit of the coding unit is determined in the enabled coding mode. The size of the coding unit is NxN.

For a better understanding of the above and other aspects of the present disclosure, several embodiments are described in detail below with reference to the accompanying drawings:

Brief Description of the Drawings

FIGS. 1A to 1J illustrate a video frame and its associated partitions according to several embodiments of the present disclosure.

FIG. 2 illustrates a video encoder of the present disclosure.

FIG. 3 illustrates an encoding method according to an embodiment of the present disclosure.

FIG. 4 illustrates an encoding method according to another embodiment of the present disclosure.

FIG. 5 illustrates an encoding method according to another embodiment of the present disclosure.

FIG. 6 illustrates an encoding method according to another embodiment of the present disclosure.

FIG. 7 illustrates the algorithm flow of IPM for non-444 chroma formats.

FIG. 8 illustrates a system that performs the encoding and decoding methods of the present disclosure.

[Description of Symbols]

101: Video frame
102: Coding tree unit (CTU)
103: Luma coding tree block (luma CTB)
104: Cb CTB
105: Cr CTB
106, 111: Associated syntax
107-1, 107-2, 107-3, 107-4: Luma coding block (luma CB)
108: Coding unit (CU)
109: Cb CB
110: Cr CB
112: Luma prediction block (PB)
113-1, 113-2, 113-3, 113-4: Transform block (TB)
114: Transform unit (TU)
200: Video encoder
202: Frame dividing module
204: Inter prediction enabling adaptive color transformation (ACT) module
206: Inter prediction disabling ACT module
208: Frame buffer
210: Mode decision module
212: Intra prediction enabling ACT module
214: Intra prediction disabling ACT module
216, 218: Summing module
220: Switch
222: Adaptive color transform (ACT) module
224: CCP, transform, and quantization module
226: Entropy coding module
228: Inverse CCP, transform, and quantization module
230: Switch
232: Inverse ACT module
300, 400, 500, 600, 700, 800: Encoding method
304: Component correlation analysis
306: Rough mode decision
308: End
310: Rate distortion optimization (RDO) mode decision
311: Determination of whether the chroma format is non-444
312: Determination of whether the CU size is smaller than threshold T1
314: TU size decision
316: Chroma mode decision
402: Determination of whether the CU size is smaller than threshold T2
702: Non-transitory computer-readable medium
704: Processor

Detailed Description

Exemplary embodiments are described in detail below with reference to the accompanying drawings. In the drawings, unless otherwise indicated, the same reference numerals in different drawings denote the same or similar elements. The embodiments set out below do not represent all implementations of the present disclosure; they are merely examples of systems and methods corresponding to the claims.

FIGS. 1A to 1J illustrate a video frame and its associated partitions according to embodiments of the present disclosure.

FIG. 1A shows a video frame 101. The video frame 101 includes a number of pixels and is divided into a number of coding tree units (CTUs) 102. The size of each CTU 102 is defined by L vertical samples and L horizontal samples (LxL). Each sample corresponds to a pixel value at a different pixel location of the CTU. For example, L may be 16, 32, or 64. A pixel location may be the location of a pixel in the CTU or a location between pixels. When the pixel location lies between pixels, the pixel value may be interpolated from one or more pixels near that location. Each CTU 102 includes a luma coding tree block (luma CTB), chroma coding tree blocks (chroma CTBs), and associated syntax.
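To make the LxL tiling concrete, the following sketch counts how many CTUs cover a frame. It assumes that CTUs at the right and bottom edges may extend past the frame boundary, so the counts are rounded up; the text does not discuss frame edges, and the function name `ctu_grid` is hypothetical.

```python
def ctu_grid(width: int, height: int, ctu_size: int) -> tuple[int, int]:
    """Return (columns, rows) of LxL CTUs needed to cover a width x height frame."""
    if ctu_size not in (16, 32, 64):
        raise ValueError("L is expected to be 16, 32, or 64")
    cols = -(-width // ctu_size)    # ceiling division without floats
    rows = -(-height // ctu_size)
    return cols, rows


cols, rows = ctu_grid(1920, 1080, 64)
print(cols, rows, cols * rows)
```

With L = 64, a 1920x1080 frame needs 30 columns and 17 rows of CTUs under this assumption, because 1080 is not a multiple of 64.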

FIG. 1B shows the CTBs that may be included in one CTU 102 of FIG. 1A. For example, the CTU 102 may include a luma CTB 103 and chroma CTBs (Cb CTB 104 and Cr CTB 105). The CTU 102 may also include associated syntax 106. The Cb CTB 104 is the blue-difference chroma component CTB, which represents the change in blue for the CTB. The Cr CTB 105 is the red-difference chroma component CTB, which represents the change in red for the CTB. The associated syntax 106 contains information on how the luma CTB 103, Cb CTB 104, and Cr CTB 105 are encoded, and on their further partitioning. The luma CTB 103, Cb CTB 104, and Cr CTB 105 may all have the same size as the CTU 102. Alternatively, the luma CTB 103 may have the same size as the CTU 102 while the Cb CTB 104 and Cr CTB 105 are smaller than the CTU 102.
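One common reason for the chroma CTBs to be smaller than the CTU is chroma subsampling. The sketch below assumes the conventional 4:4:4, 4:2:2, and 4:2:0 subsampling factors; the paragraph above does not state them, and the table and function names are hypothetical.

```python
# (horizontal divisor, vertical divisor) applied to the chroma planes.
# Assumed conventional factors; not taken from the text above.
CHROMA_SUBSAMPLING = {
    "4:4:4": (1, 1),
    "4:2:2": (2, 1),
    "4:2:0": (2, 2),
}


def ctb_sizes(ctu_size: int, chroma_format: str) -> dict[str, tuple[int, int]]:
    """Return (width, height) of the luma, Cb, and Cr CTBs of one CTU."""
    div_w, div_h = CHROMA_SUBSAMPLING[chroma_format]
    chroma = (ctu_size // div_w, ctu_size // div_h)
    return {"luma": (ctu_size, ctu_size), "cb": chroma, "cr": chroma}


print(ctb_sizes(64, "4:4:4"))
print(ctb_sizes(64, "4:2:0"))
```

In the 4:4:4 case all three CTBs match the CTU size, while in the 4:2:0 case the Cb and Cr CTBs are half the CTU size in each dimension, which corresponds to the two alternatives described above.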

Coding tools such as intra prediction and inter prediction operate on coding blocks (CBs). To decide whether to code with intra prediction or inter prediction, a CTB may be partitioned into one or more CBs. The partitioning of a CTB into CBs is based on quad-tree partitioning. A CTB can therefore be split into four CBs, and each CB can in turn be split into four CBs. This splitting can continue depending on the size of the CTB.

FIG. 1C shows the luma CTB 103 of FIG. 1B partitioned into one or more luma CBs 107-1, 107-2, 107-3, or 107-4. For a 64x64 luma CTB, the corresponding luma CB 107-1, 107-2, 107-3, or 107-4 may have a size of NxN, for example 64x64, 32x32, 16x16, or 8x8. In FIG. 1C, the size of the luma CTB 103 is 64x64, but it may also be 32x32 or 16x16.

FIG. 1D shows an example of quad-tree partitioning of the luma CTB 103 of FIG. 1B, in which the luma CTB 103 is partitioned into the luma CBs 107-1, 107-2, 107-3, or 107-4 of FIG. 1C. In FIG. 1D, the size of the luma CTB 103 is 64x64, but it may also be 32x32 or 16x16.

In FIG. 1D, the luma CTB 103 is split into four 32x32 luma CBs 107-2. Each 32x32 luma CB can be further split into four 16x16 luma CBs 107-3. Each 16x16 luma CB can be further split into four 8x8 luma CBs 107-4.
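The recursive 64x64 to 32x32 to 16x16 to 8x8 splitting can be sketched as a small recursive function. The split decision is passed in as a callback because the text does not say how an encoder decides to split; the names `quadtree_split` and `should_split` are hypothetical.

```python
from typing import Callable

Block = tuple[int, int, int]  # (x, y, size) of a square block


def quadtree_split(x: int, y: int, size: int,
                   should_split: Callable[[int, int, int], bool],
                   min_size: int = 8) -> list[Block]:
    """Recursively split a square block into four quadrants down to min_size."""
    if size > min_size and should_split(x, y, size):
        half = size // 2
        leaves: list[Block] = []
        for dy in (0, half):
            for dx in (0, half):
                leaves += quadtree_split(x + dx, y + dy, half, should_split, min_size)
        return leaves
    return [(x, y, size)]


# Split everything: a 64x64 CTB becomes 8x8 CBs only.
full = quadtree_split(0, 0, 64, lambda x, y, s: True)
# Split only the block at the top-left corner at every level.
corner = quadtree_split(0, 0, 64, lambda x, y, s: x == 0 and y == 0)
print(len(full), len(corner))
```

Whatever the split decisions are, the leaf blocks always tile the original 64x64 area exactly, which is why a mix of CB sizes such as the one in FIG. 1D is possible.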

A coding unit (CU) is used to code the CBs. A CTB may contain a single CU or be divided into several CUs. The size of a CU can therefore also be NxN, for example 64x64, 32x32, 16x16, or 8x8. Each CU includes one luma CB, two chroma CBs, and associated syntax. A residual CU generated during encoding and decoding may have the same size as its corresponding CU.

FIG. 1E shows CBs, such as the luma CB 107-1 of FIG. 1C, that may be part of a CU 108. For example, the CU 108 may include the luma CB 107-1 and the chroma CBs (Cb CB 109 and Cr CB 110). The CU 108 may also include associated syntax 111. The associated syntax 111 contains information on how the luma CB 107-1, Cb CB 109, and Cr CB 110 are encoded, such as a description of the quad-tree (the size, position, and further partitioning of the luma CB and chroma CBs). Each CU 108 may have associated prediction blocks (PBs) for the luma CB 107-1, Cb CB 109, and Cr CB 110. Prediction blocks are grouped into prediction units (PUs).

FIG. 1F shows the possible ways of partitioning the CB 107-1 of FIG. 1D into luma PBs 112. The luma CB 107-1 is partitioned into luma PBs 112 according to, for example, the predictability of its different regions. For example, the luma CB 107-1 may contain a single luma PB 112 of the same size as the luma CB 107-1. Alternatively, the luma CB 107-1 may be split vertically or horizontally into two even luma PBs 112, or split vertically or horizontally into four luma PBs 112. FIG. 1F is only an example; any way of partitioning into PBs under the HEVC standard falls within the scope of the present disclosure. The ways of partitioning the luma CB 107-1 into luma PBs 112 shown in FIG. 1F are mutually exclusive. For example, in the intra prediction mode of HEVC, 64x64, 32x32, and 16x16 CBs may be partitioned into a single PB of the same size as the CB, whereas an 8x8 CB may be partitioned into a single 8x8 PB or four 4x4 PBs.
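The intra-mode example at the end of the paragraph above can be written down as a lookup. This is a minimal sketch that encodes only what the paragraph states; the function name `intra_pb_options` is hypothetical.

```python
def intra_pb_options(cb_size: int) -> list[list[tuple[int, int]]]:
    """List the mutually exclusive PB layouts for an intra-coded CB of cb_size x cb_size.

    Each layout is a list of (width, height) prediction blocks.
    """
    if cb_size not in (8, 16, 32, 64):
        raise ValueError("CB size is expected to be 8, 16, 32, or 64")
    options = [[(cb_size, cb_size)]]      # a single PB the size of the CB
    if cb_size == 8:
        options.append([(4, 4)] * 4)      # only an 8x8 CB may use four 4x4 PBs
    return options


for size in (64, 32, 16, 8):
    print(size, intra_pb_options(size))
```

Exactly one layout from the returned list is chosen for a given CB, which reflects the mutually exclusive nature of the partitions.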

Once intra prediction or inter prediction has been applied, the residual signal, that is, the difference between the prediction block and the source video image block, is transformed into another domain for further coding using a discrete cosine transform (DCT) or a discrete sine transform (DST). To apply these transforms, each CU or each CB uses one or more transform blocks (TBs).
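The effect of moving a residual into the transform domain can be seen with a textbook floating-point 2D DCT-II. This is an illustrative sketch only: practical HEVC encoders use integer approximations of the transform, and the helper names below are hypothetical.

```python
import math


def dct_1d(values: list[float]) -> list[float]:
    """Orthonormal 1D DCT-II of a list of samples."""
    n = len(values)
    out = []
    for k in range(n):
        scale = math.sqrt(1 / n) if k == 0 else math.sqrt(2 / n)
        acc = sum(v * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
                  for i, v in enumerate(values))
        out.append(scale * acc)
    return out


def dct_2d(block: list[list[float]]) -> list[list[float]]:
    """Separable 2D DCT-II: transform the rows first, then the columns."""
    rows = [dct_1d(row) for row in block]
    cols = [dct_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


# A flat 4x4 residual block: every sample differs from the prediction by 5.
flat_residual = [[5.0] * 4 for _ in range(4)]
coeffs = dct_2d(flat_residual)
print([[round(c, 6) for c in row] for row in coeffs])
```

For the flat residual, only the top-left (DC) coefficient is non-zero, so sixteen samples collapse into a single value to be quantized and entropy coded.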

FIG. 1G shows how the luma CB 107-1 of FIG. 1E or FIG. 1F is partitioned into different TBs 113-1, 113-2, 113-3, and 113-4. If the luma CB 107-1 is a 64x64 CB, then TB 113-1 is a 32x32 TB, TB 113-2 is a 16x16 TB, TB 113-3 is an 8x8 TB, and TB 113-4 is a 4x4 TB. The luma CB 107-1 can be partitioned into 4 TBs 113-1, 16 TBs 113-2, 64 TBs 113-3, or 256 TBs 113-4. A luma CB 107-1 may be partitioned into TBs 113 of the same size or of different sizes.
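The TB counts quoted above follow directly from the ratio of the block areas, as this short check shows. The function name `tb_count` is hypothetical.

```python
def tb_count(cb_size: int, tb_size: int) -> int:
    """Number of tb_size x tb_size TBs that tile a cb_size x cb_size CB."""
    if cb_size % tb_size != 0:
        raise ValueError("TB size must divide the CB size")
    return (cb_size // tb_size) ** 2


counts = {tb: tb_count(64, tb) for tb in (32, 16, 8, 4)}
print(counts)
```

Each halving of the TB size multiplies the count by four, consistent with the quad-tree splitting used throughout.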

The partitioning of a CB into TBs is based on quad-tree splitting. A CB can therefore be split into one or more TBs, and each TB can be further split into four TBs. This splitting can continue depending on the size of the CB.

FIG. 1H shows a quad-tree partitioning of the luma CB 107-1 of FIG. 1E or FIG. 1F into the TBs 113-1, 113-2, 113-3, or 113-4 of FIG. 1G using various partitions. In FIG. 1H, the size of the luma CB 107-1 is 64x64, but it may also be 32x32 or 16x16.

In FIG. 1H, the luma CB 107-1 is split into four 32x32 TBs 113-1. Each 32x32 TB can be further split into four 16x16 TBs 113-2. Each 16x16 TB can be further split into four 8x8 TBs 113-3. Each 8x8 TB can be further split into four 4x4 TBs 113-4.

The TBs 113 are then transformed using a DCT or any other transform of the HEVC standard. Transform units (TUs) aggregate the TBs 113. Each CB uses one or more TBs, and the CBs form a CU. The structure of the transform units (TUs) therefore differs between CUs 108 and is determined by the CU 108.

FIG. 1I shows various partitions of a TU 114 into TBs 113-1, 113-2, 113-3, and 113-4. Each TU aggregates the TBs partitioned as in FIG. 1G or FIG. 1H. A 32x32 TU 114 may use a single 32x32 TB 113-1, or one or more 16x16 TBs 113-2, 8x8 TBs 113-3, or 4x4 TBs 113-4. For a CU coded with HEVC inter prediction, a TU may be larger than a PU, so that the TU may contain PU boundaries. For a CU coded with HEVC intra prediction, however, a TU may not cross PU boundaries.

FIG. 1J shows a quad-tree partitioning of the TU 114 of FIG. 1I using the various TBs 113-1, 113-2, 113-3, or 113-4 of FIG. 1I. In FIG. 1J, the size of the TU 114 is 32x32, but the size of a TU may also be 16x16, 8x8, or 4x4.

In FIG. 1J, the TU 114 is partitioned into one 32x32 TB 113-1 and four 16x16 TBs 113-2. Each 16x16 TB can be further split into four 8x8 TBs 113-3. Each 8x8 TB can be further split into four 4x4 TBs 113-4.

Any CTU, CTB, CB, CU, PU, PB, TU, or TB described in the present disclosure may have any feature, size, and property defined by the HEVC standard. The partitioning described for FIGS. 1C, 1E, and 1F can also be applied to the chroma CTBs (Cb CTB 104 and Cr CTB 105) and the chroma CBs (Cb CB 109 and Cr CB 110).

FIG. 2 shows a video encoder 200 that performs the encoding method of the present disclosure. The video encoder 200 may include one or more additional components that provide further HEVC-SCC coding functions, such as palette mode, sample adaptive offset, and de-blocking filtering. In addition, the present disclosure contemplates the intra prediction mode of ACT as well as other coding modes, such as the inter prediction mode of ACT.

The video encoder 200 receives an input source video frame. The source video frame is first input to the Frame Dividing Module 202, which divides it into at least one source CTU. Source CUs are then obtained from the source CTU. The sizes of the source CTU and the source CU are determined by the Frame Dividing Module 202. Encoding then proceeds CU by CU. Each source CU output by the Frame Dividing Module 202 is input to the Inter Prediction enabling ACT Module 204, the Inter Prediction disabling ACT Module 206, the Intra Prediction enabling ACT Module 212, and the Intra Prediction disabling ACT Module 214.

The source CU of the input frame is encoded by the inter prediction enabling ACT module 204, which determines a prediction of the source CU from the input frame using inter prediction with adaptive color transform (ACT) enabled. The source CU is also encoded by the inter prediction disabling ACT module 206, which determines a prediction of the source CU from the input frame using inter prediction without ACT (that is, with ACT disabled).

Inter prediction may use reference CUs stored in the frame buffer 208. Source PUs and PBs are also obtained from the source CU and are used in the inter prediction processes of the inter prediction enabling ACT module 204 and the inter prediction disabling ACT module 206. Inter prediction uses regions of video frames at different times for motion detection. The encoded inter-predicted CUs from modules 204 and 206 are chosen for the highest picture quality and are then input to the Mode Decision Module 210.

The source CU of the input frame is also encoded by the intra prediction enabling ACT module 212, which determines a prediction of the source CU from the input frame using intra prediction with ACT enabled.

The source CU of the input frame is also encoded by the intra prediction disabling ACT module 214, which determines a prediction of the source CU from the input frame using intra prediction without ACT (that is, with ACT disabled).

When performing intra prediction, the intra prediction enabling ACT module 212 and the intra prediction disabling ACT module 214 may use source CUs of the same frame stored in the frame buffer 208. Source PUs and PBs are also obtained from the source CU and are used in the intra prediction processes of modules 212 and 214. The encoded intra-predicted CUs are chosen for the highest picture quality. The encoded intra-predicted CUs output by modules 212 and 214 are input to the mode decision module 210.

In the mode decision module 210, the costs of encoding the source CU using inter prediction with ACT enabled, inter prediction with ACT disabled, intra prediction with ACT enabled, and intra prediction with ACT disabled are compared, together with the quality of each predicted CU. Based on the comparison, the predicted CU of one coding mode (for example, an inter-predicted CU or an intra-predicted CU) is selected. The selected predicted CU is then sent to the summing modules 216 and 218.

In the summing module 216, the selected predicted CU is subtracted from the source CU to provide a residual CU. If the selected predicted CU comes from either the inter prediction enabling ACT module 204 or the intra prediction enabling ACT module 212, the switch 220 moves to position A. In position A, the residual CU is input to the ACT Module 222 and then to the CCP, Transform, and Quantization Module 224. If, however, the selected predicted CU comes from either the inter prediction disabling ACT module 206 or the intra prediction disabling ACT module 214, the switch 220 moves to position B. In position B, the ACT module 222 is skipped and is not executed during encoding, and the residual CU is input directly from the summing module 216 to the CCP, transform, and quantization module 224.

In the ACT module 222, the adaptive color transform is applied to the residual CU. The output of the ACT module 222 is input to the CCP, transform, and quantization module 224.

The CCP, transform, and quantization module 224 performs cross component prediction (CCP), a transform such as the Discrete Cosine Transform (DCT) or the Discrete Sine Transform (DST), and quantization on the residual CU. The output of module 224 is input to the Entropy Coding Module 226 and to the Inverse CCP, Transform, and Quantization Module 228.

The entropy coding module 226 performs entropy encoding of the residual. For example, Context Adaptive Binary Arithmetic Coding (CABAC) may be performed to encode the residual CU. Any other entropy coding process provided by HEVC may also be performed in the entropy coding module 226.

After entropy encoding, the encoded bitstream of the CU of the input video frame is output from the video encoder 200. The output bitstream may be stored in a memory, broadcast over a transmission line or network, provided to a display, and so on.

In the inverse CCP, transform, and quantization module 228, the inverse of the operations of the CCP, transform, and quantization module 224 is applied to the residual CU to provide a reconstructed residual CU.

If the selected predicted CU comes from the inter prediction enabling ACT module 204 or the intra prediction enabling ACT module 212, the switch 230 moves to position C. In position C, the reconstructed residual CU is input to the Inverse ACT Module 232 and then to the Summing Module 218. If, however, the selected predicted CU comes from the inter prediction disabling ACT module 206 or the intra prediction disabling ACT module 214, the switch 230 moves to position D. In position D, the inverse ACT module 232 is skipped and is not executed, and the reconstructed residual CU is input directly to the summing module 218.

The inverse ACT module 232 applies to the reconstructed residual CU the inverse of the adaptive color transform performed by the ACT module 222. The output of the inverse ACT module 232 is input to the summing module 218.

In the summing module 218, the selected predicted CU from the mode decision module 210 is added to the reconstructed residual CU to provide a reconstructed source CU. The reconstructed source CU is then stored in the frame buffer 208 for use in inter prediction and intra prediction of other CUs.
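The residual path through switches 220 and 230 can be sketched as below. Two things here are assumptions rather than part of the description: the color transform shown is the reversible YCgCo-R lifting transform, used only as a concrete stand-in for "the adaptive color transform", and the CCP, transform, and quantization stage and its inverse are replaced by an identity (lossless) stand-in so that the round trip can be checked. All function names are hypothetical.

```python
def forward_act(px):
    """Assumed stand-in for ACT module 222: reversible YCgCo-R lifting."""
    r, g, b = px
    co = r - b
    t = b + (co >> 1)
    cg = g - t
    y = t + (cg >> 1)
    return (y, cg, co)


def inverse_act(px):
    """Assumed stand-in for inverse ACT module 232: undoes forward_act exactly."""
    y, cg, co = px
    t = y - (cg >> 1)
    g = cg + t
    b = t - (co >> 1)
    r = b + co
    return (r, g, b)


def code_cu(source, prediction, use_act):
    """Return (coded residual, reconstructed CU) for lists of 3-component pixels."""
    # Summing module 216: residual = source - prediction.
    residual = [tuple(s - p for s, p in zip(sp, pp))
                for sp, pp in zip(source, prediction)]
    # Switch 220: position A applies ACT, position B bypasses it.
    coded = [forward_act(px) for px in residual] if use_act else residual
    # Modules 224 and 228 are modelled as identity here (assumption).
    # Switch 230: position C applies inverse ACT, position D bypasses it.
    recon_residual = [inverse_act(px) for px in coded] if use_act else coded
    # Summing module 218: reconstruction = prediction + reconstructed residual.
    recon = [tuple(p + r for p, r in zip(pp, rp))
             for pp, rp in zip(prediction, recon_residual)]
    return coded, recon


source = [(120, 64, 30), (0, 255, 7)]
prediction = [(118, 70, 33), (5, 250, 0)]
coded_a, recon_a = code_cu(source, prediction, use_act=True)   # positions A and C
coded_b, recon_b = code_cu(source, prediction, use_act=False)  # positions B and D
```

Because both switches follow the same selected mode, the reconstruction matches the source in either position when the middle stage is lossless.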

The following describes how encoding methods 300, 400, and 500 are performed within the intra prediction enabling ACT module 212. Encoding methods 300, 400, and 500 can improve coding efficiency and coding time.

The inter prediction enabling ACT module 204, the inter prediction disabling ACT module 206, the intra prediction enabling ACT module 212, and the intra prediction disabling ACT module 214 are not limited to a parallel arrangement. In one embodiment, the four modules may be arranged sequentially, and their order may be changed.

FIG. 3 illustrates an encoding method 300, according to an embodiment of the present disclosure, that determines whether TU size evaluation needs to be performed in an ACT-enabled intra prediction encoding process. More specifically, encoding method 300 uses a threshold calculation on the CU size to decide whether TU size evaluation is performed.

In step 304, a component correlation analysis is performed on a source CU to determine whether the ACT coding mode should be enabled for the CU. The correlation of the color components of each pixel in the CU is analyzed. For each pixel, the correlation of its color components is compared with a pixel correlation threshold to determine whether the correlation is above, equal to, or below that threshold.

For each CU, the total number of pixels above the pixel correlation threshold is counted. Pixels equal to the threshold are treated as above it and are included in the count. The total is then compared with a CU correlation threshold.

If the total number of pixels is below the CU correlation threshold, the color components of the CU are determined to have low correlation. The CU therefore does not need ACT, so the flow proceeds to step 308 and ACT is disabled for the encoding of the CU.

If, however, the total number of pixels is above the CU correlation threshold, the color components of the CU are determined to have high correlation. In this case, ACT is needed to remove the correlation between the components of each pixel of the CU. When high correlation is confirmed, ACT is enabled and the flow proceeds to step 306. In step 306, a rough mode decision is made for intra prediction with ACT enabled.
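The two-level thresholding of step 304 can be sketched as follows. The disclosure does not say how the per-pixel correlation value is computed, so this sketch takes those values as an input. It also does not say what happens when the count equals the CU correlation threshold; treating equality as high correlation is an assumption made here, mirroring the per-pixel rule. The function name is hypothetical.

```python
def act_needed(pixel_correlations, pixel_threshold, cu_threshold):
    """Decide whether ACT should be enabled for a CU (sketch of step 304).

    pixel_correlations: one correlation value per pixel (measure unspecified).
    """
    # Pixels equal to the pixel threshold count as above it.
    high_pixels = sum(1 for c in pixel_correlations if c >= pixel_threshold)
    # Assumption: a count equal to the CU threshold is treated as high correlation.
    return high_pixels >= cu_threshold


# Three of four pixels reach the pixel threshold, so ACT is enabled (go to step 306).
enable = act_needed([0.9, 0.5, 0.2, 0.5], pixel_threshold=0.5, cu_threshold=3)
# Only two pixels reach it, so ACT is disabled (go to step 308).
disable = act_needed([0.9, 0.1, 0.2, 0.5], pixel_threshold=0.5, cu_threshold=3)
```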

The correlation analysis of step 304 may additionally or alternatively be based on the color space of the CU. For example, in step 304, the color components of the pixels in the CU may be analyzed and the color space of the CU determined. The color space may be determined to be a red, green, and blue (RGB) space or a luminance and chrominance (YUV) space.

When the color space is determined to be RGB, the flow proceeds to step 306, where a rough mode decision is made for intra prediction with ACT enabled. Because RGB pixel components usually have high correlation, ACT is needed to remove the correlation between the components of each pixel in the CU and to concentrate the pixel energy in a single component.

In contrast, when the color space is determined to be YUV, the flow proceeds to step 308 and ACT is disabled. YUV pixel components usually have low correlation, and most of the pixel energy is already stored in a single pixel component. Because further decorrelation of the CU pixel components yields no additional coding benefit, ACT does not need to be enabled for YUV pixel components.

In the intra prediction enabling ACT module 212, when encoding method 300 disables ACT, the intra prediction with ACT enabled coding mode is disabled, and module 212 outputs no prediction to the mode decision module 210.

In the inter prediction enabling ACT module 204, when ACT is disabled for inter prediction encoding, the inter prediction with ACT enabled coding mode is disabled, and module 204 outputs no prediction to the mode decision module 210.

In step 306, a rough mode decision is made for intra prediction with ACT enabled. The rough mode decision may be a cost-based mode decision. For example, a low-complexity cost may be determined for the candidate coding modes so that a quick decision can be made on the modes that typically offer the highest quality and the lowest coding cost.

In step 310, a rate distortion optimization (RDO) mode decision is made for the ACT-enabled coding mode. Here, with ACT, CCP, transform, quantization, and entropy coding performed, the deviation from the source video and the bit cost of the coding mode are calculated. The deviation may be obtained from an error calculation such as the mean squared error (MSE). The RDO analysis then selects the coding mode with the lowest coding cost and the highest coding quality.
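The two quantities named for the RDO decision, the deviation (here MSE) and the bit cost, are commonly combined into a single Lagrangian cost J = D + lambda * R. The description does not give a formula, so the combination below, the multiplier `lam`, and the function names are assumptions offered only as an illustration.

```python
def mse(source, reconstructed):
    """Mean squared error between two equal-length sample sequences."""
    if len(source) != len(reconstructed) or not source:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((s - r) ** 2 for s, r in zip(source, reconstructed)) / len(source)


def rd_cost(distortion, bits, lam):
    """Assumed Lagrangian rate-distortion cost: J = D + lam * R."""
    return distortion + lam * bits


# Hypothetical numbers: every sample is off by 2, and the mode costs 10 bits.
d = mse([0, 0], [2, 2])
j = rd_cost(d, bits=10, lam=0.5)
```

A lower cost is better under this convention: the mode minimizing the cost trades distortion against bits at the rate set by the multiplier.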

For example, 35 intra prediction modes (IPMs) are available for encoding in the intra prediction enabling ACT module 212. In the rough mode decision of step 306, module 212 uses a simple, low-complexity coding cost calculation to select, from these IPMs, those with the lowest coding cost and highest coding quality. For example, the sum of absolute transform distortion (SATD) cost may be used to determine the low-complexity coding cost of each IPM. The selection may, for example, keep 3 IPMs or 8 IPMs. In the RDO mode decision of step 310, module 212 makes an RDO mode decision for each selected IPM. With ACT, CCP, transform, quantization, and entropy coding performed, the deviation from the source video and the bit cost of encoding are calculated for each selected IPM. The deviation may be obtained from an error calculation such as the mean squared error (MSE). The RDO analysis then selects, from the selected IPMs, the IPM with the lowest coding cost and the highest coding quality.
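The two-stage selection over the 35 IPMs (steps 306 and 310) can be sketched as follows. The cost functions are passed in as callables because the description does not specify them beyond naming SATD and MSE. The function name `choose_ipm` and the toy cost functions in the usage lines are hypothetical.

```python
def choose_ipm(modes, rough_cost, rdo_cost, keep=3):
    """Pick an intra prediction mode in two stages.

    Stage 1 (rough mode decision): rank all modes by a cheap cost and keep
    the `keep` best. Stage 2 (RDO mode decision): evaluate the expensive
    cost only on the survivors and return the cheapest.
    """
    candidates = sorted(modes, key=rough_cost)[:keep]
    return min(candidates, key=rdo_cost)


# Toy costs: the cheap estimate favors mode 10, the full cost favors mode 11.
best = choose_ipm(range(35),
                  rough_cost=lambda m: abs(m - 10),
                  rdo_cost=lambda m: (m - 11) ** 2,
                  keep=3)
```

The point of the design is that the expensive cost is evaluated for only 3 (or 8) modes instead of all 35.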

The process described above for the intra prediction enabling ACT module 212 may also be performed, with variations, in the inter prediction enabling ACT module 204. For example, when module 204 performs encoding method 300, step 306 makes a rough mode decision for the best inter prediction from temporally adjacent video frames, that is, the one providing the lowest coding cost and highest coding quality. Step 310 makes the RDO mode decision for inter prediction. Here, with ACT, CCP, transform, quantization, and entropy coding performed, the deviation from the source video and the coding bit cost of the inter prediction are calculated. The deviation may be obtained from an error calculation such as the mean squared error (MSE). The RDO analysis then selects the inter prediction with the lowest coding cost and the highest coding quality.

In step 312, the CU size of the CU currently being processed is calculated. The CU size may be NxN, where N may be 4, 8, 16, 32, or 64. The value N of the CU is compared with a threshold T1, which may be 4, 8, 16, 32, or 64. The comparison determines whether the CU size is smaller than T1 and thereby whether the transform unit sizes of the enabled coding mode are to be evaluated. If the CU size is smaller than T1, the flow proceeds to step 314 for the TU size decision. If, however, the CU size is equal to or greater than T1, the flow proceeds to step 316 and the TU size decision of step 314 is skipped. In step 312, when the CU size is greater than T1, the TU size for the CU may be greater than T1. If the CU size is equal to or greater than T1, the TU quadtree structure may be set to the largest possible TU size. For example, when the CU size is equal to or greater than T1, four 32x32 TUs may be determined for a 64x64 CU. In another embodiment, when the CU size is equal to or greater than T1, the TU may be the same size as the CU for a 32x32, 16x16, 8x8, or 4x4 CU. For example, if the CU is 32x32, the corresponding PU size may be 32x32.
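The branch at step 312 can be sketched as below. The function name and return format are hypothetical. Restricting the candidate TU sizes to those no larger than the CU is an assumption added here; the 32x32 maximum TU size follows the sizes listed in the description.

```python
def tu_plan(cu_size, t1, max_tu=32):
    """Sketch of step 312: decide whether the TU size decision (step 314) runs.

    Returns ("evaluate", candidate_sizes) when CU size < T1, otherwise
    ("fixed", tu_size) with the largest possible TU size.
    """
    if cu_size < t1:
        # Step 314 will evaluate these sizes (assumption: none exceeds the CU).
        return ("evaluate", [s for s in (4, 8, 16, 32) if s <= cu_size])
    # Step 314 is skipped: a 64x64 CU gets 32x32 TUs, smaller CUs get a
    # TU of the same size as the CU.
    return ("fixed", min(cu_size, max_tu))


plan_large = tu_plan(64, t1=16)   # four 32x32 TUs cover the 64x64 CU
plan_small = tu_plan(8, t1=16)    # TU sizes 4 and 8 are evaluated in step 314
plan_equal = tu_plan(16, t1=16)   # equal to T1 also skips step 314
```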

Because the TU size decision is time-consuming and increases coding cost, step 312 can improve coding time and efficiency: whenever the TU size decision can be omitted, coding cost and time are saved. Moreover, a CU size equal to or greater than the threshold T1 indicates that the content of the CU is not complex. For example, a CU size greater than T1 may indicate that the video image has large areas without edges, motion, or complex patterns. The TU size decision may therefore be unnecessary for efficiently encoding the CU with high video quality.

In step 314, if the CU size is below the threshold T1, the TU size decision is performed. Here, the TUs of the source CU are determined. Using the RDO cost evaluation of step 310, the TU sizes are analyzed to find the one that gives the most efficient ACT transform of the CU with high video quality. For example, TU sizes of 4x4, 8x8, 16x16, and 32x32 may be analyzed. Once the TU size giving the most efficient ACT transform has been determined, it is selected for the ACT transform of the CU and the flow proceeds to step 316. The selected TU size serves as the best TU quadtree structure size.

In step 316, a chroma mode decision is made. The chroma mode is decided on the basis of the prediction mode determined in step 310, and the determined prediction mode is used for chroma prediction to generate a chroma PU and the corresponding chroma TU. The TU determined in step 312 or step 314 may also be used to generate the chroma TU. The chroma TU is also subsampled according to the chroma format. Thus, in one embodiment, when the chroma format is 4:2:0 and the luma TU is 32x32, the determined chroma TU is 16x16.
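The chroma TU subsampling can be sketched as follows. The description gives only the 4:2:0 example; the 4:2:2 and 4:4:4 rows follow the usual meaning of those chroma formats and are included as an assumption. The function name is hypothetical.

```python
# Horizontal and vertical subsampling divisors per chroma format.
CHROMA_DIVISORS = {
    "4:2:0": (2, 2),  # half width, half height (the example in the text)
    "4:2:2": (2, 1),  # half width, full height
    "4:4:4": (1, 1),  # no subsampling
}


def chroma_tu_size(luma_tu_size, chroma_format):
    """Return the (width, height) of the chroma TU for a square luma TU."""
    div_w, div_h = CHROMA_DIVISORS[chroma_format]
    return (luma_tu_size // div_w, luma_tu_size // div_h)


size_420 = chroma_tu_size(32, "4:2:0")
```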

At step 308, the process by which the intra prediction enabling ACT module 212 selects the best intra prediction mode and the best TU quadtree structure size is complete. The prediction and its RDO cost have been generated and are input to the mode decision module 210 for comparison with the RDO costs input by the other prediction modules. For example, the inter prediction enabling ACT module 204 may generate a prediction and RDO cost for an ACT-enabled CU and input the predicted CU and RDO cost to the mode decision module 210. The inter prediction disabling ACT module 206 and the intra prediction disabling ACT module 214 likewise generate predicted CUs and RDO costs and input them to the mode decision module 210. The mode decision module 210 compares the predicted CUs and RDO costs input by modules 204, 206, 212, and 214 and decides which predicted CU is input to the summing modules 216 and 218.

FIG. 4 illustrates an encoding method 400, according to another embodiment of the present disclosure, that determines whether ACT should be enabled. More specifically, encoding method 400 uses a threshold calculation on the CU size together with a determination of the correlation of the color components of the CU pixels. ACT may be enabled or disabled according to the threshold calculation. For elements with the same reference numerals, refer to the descriptions above.

In step 304, a component correlation analysis is performed on the source CU to determine whether ACT should be enabled or disabled. Step 304 is as described for encoding method 300. If the correlation of the color components of the CU is high, ACT is enabled and the flow proceeds through steps 306, 310, 314, 316, and 308 (as in encoding method 300 above). If, however, the correlation is low, the flow proceeds to step 402.

In step 402, the size of the CU currently being processed is determined. As before, the CU size is NxN, where N may be 4, 8, 16, 32, or 64. The value N of the CU is compared with a threshold T2 to determine whether the CU size is smaller than T2. The threshold T2 may be 4, 8, 16, 32, or 64. If the CU size is smaller than T2, ACT is enabled and the flow proceeds to step 310 for the RDO mode decision, as in step 310 of encoding method 300. If, however, the CU size is equal to or greater than T2, the flow proceeds to step 308 and ACT is disabled.
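Taken together, steps 304 and 402 of encoding method 400 reduce to a small predicate: ACT stays enabled when the color components are highly correlated, or when they are not but the CU is smaller than T2. The sketch below assumes the outcome of the step 304 analysis is already available as a boolean; the function name is hypothetical.

```python
def act_enabled_400(high_correlation, cu_size, t2):
    """Sketch of the ACT enable/disable decision of encoding method 400."""
    if high_correlation:
        return True        # step 304: high correlation, ACT enabled
    return cu_size < t2    # step 402: small CUs keep ACT, large CUs disable it


small_low_corr = act_enabled_400(False, cu_size=8, t2=16)    # ACT enabled
large_low_corr = act_enabled_400(False, cu_size=32, t2=16)   # ACT disabled
```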

In the inter prediction enabling ACT module 204, when ACT is disabled in encoding method 400, the output of module 204 is an inter-predicted CU to which ACT has not been applied. In this case, the CU output by module 204 is therefore the same as the output of the inter prediction disabling ACT module 206. Likewise, in the intra prediction enabling ACT module 212, when ACT is disabled in encoding method 400, the output of module 212 is an intra-predicted CU to which ACT has not been applied. In this case, the CU output by module 212 is therefore the same as the output of the intra prediction disabling ACT module 214.

Because a CU size equal to or greater than the threshold T2 indicates that the content of the CU is not complex, step 402 can improve coding time and coding efficiency. A CU size greater than T2 may indicate that the video image has large areas without edges, motion, or complex patterns. Given that the color components are already sufficiently decorrelated, ACT may not be needed to encode the CU efficiently.

FIG. 5 illustrates an encoding method 500, according to another embodiment of the present disclosure, that uses two threshold calculations to determine whether ACT should be enabled and whether TU size evaluation should be performed. More specifically, encoding method 500 uses a first threshold calculation on the CU size, together with a determination of the correlation of the CU pixel color components, to decide whether to enable ACT. Encoding method 500 also uses a second threshold calculation on the CU size to decide whether TU size evaluation is performed. For elements with the same reference numerals, refer to the descriptions above.

In step 304, a component correlation analysis is performed on the source CU to determine whether ACT should be enabled or disabled. Step 304 is as described for encoding method 300. If the correlation of the color components of the CU is high, ACT is enabled and the flow proceeds to step 306 for the rough mode decision and then to step 310 for the RDO mode decision. Steps 306 and 310 are as described for encoding method 300. If, however, the correlation is low, the flow proceeds to step 402.

In step 402, the size of the CU currently being processed is determined (as described for encoding method 400 of FIG. 4). If the CU size is smaller than the threshold T2, ACT is enabled and the flow proceeds to step 310 for the RDO mode decision. If, however, the CU size is equal to or greater than T2, the flow proceeds to step 308 and ACT is disabled.

In the inter prediction enabling ACT module 204, when ACT is disabled in encoding method 500, the output of module 204 is an inter-predicted CU to which ACT has not been applied. In this case, the CU output by module 204 is therefore the same as the output of the inter prediction disabling ACT module 206.

Likewise, in the ACT-enabled intra prediction module 212, when ACT is disabled by encoding method 500, the module outputs an intra-predicted CU to which ACT has not been applied. In this case, the CU output by the ACT-enabled intra prediction module 212 is therefore identical to the output of the ACT-disabled intra prediction module 214.

At step 310, the RDO mode decision is performed as described for encoding method 300.

At step 312, the size of the CU currently being processed is evaluated, as described for encoding method 300, to determine whether it is smaller than the threshold T1. If the CU size is smaller than T1, the flow proceeds to step 314 for the TU size decision. If the CU size is equal to or greater than T1, however, the flow proceeds to step 316 and the TU size decision of step 314 is skipped. Steps 314 and 316 are as described for encoding method 300.

The thresholds T1 and T2 may be set to the same value or to different values.
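The decision flow of encoding method 500 described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `CuDecision` and `decide_method_500` are hypothetical, the correlation result of step 304 is passed in as a boolean, and the sketch assumes that the T1 check of step 312 is applied on every path, including the path through step 308.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CuDecision:
    act_enabled: bool          # is ACT applied to this CU?
    rough_mode_decision: bool  # is the rough mode decision (step 306) run?
    tu_size_decision: bool     # is the TU size decision (step 314) run?


def decide_method_500(cu_size: int, high_correlation: bool,
                      t1: int, t2: int) -> CuDecision:
    """Hypothetical sketch of steps 304, 402 and 312 for an NxN CU."""
    if high_correlation:
        # Step 304: high correlation enables ACT and runs step 306.
        act_enabled, rough = True, True
    elif cu_size < t2:
        # Step 402: a small CU keeps ACT enabled and goes straight to step 310.
        act_enabled, rough = True, False
    else:
        # Step 308: a large, low-correlation CU is coded without ACT.
        act_enabled, rough = False, False
    # Step 312: the TU size decision runs only when the CU is smaller than T1.
    return CuDecision(act_enabled, rough, cu_size < t1)


if __name__ == "__main__":
    for n in (8, 16, 32, 64):
        print(n, decide_method_500(n, high_correlation=False, t1=32, t2=64))
```

With T2 = 64 and T1 = 32, a low-correlation 32x32 CU keeps ACT enabled but skips the TU size decision, while a 64x64 CU skips both.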

Encoding method 500 of FIG. 5 combines the two threshold calculations to improve encoding efficiency and encoding time. As noted above, a CU size equal to or greater than the threshold T2 indicates that the content of the CU is not complex: the CU can be expected to cover a large area without edges, motion, or complex patterns. When the color components are also already sufficiently decorrelated, ACT may not be needed to encode the CU efficiently. Furthermore, skipping the TU size decision of step 314 saves encoding cost.

FIG. 6 illustrates an encoding method 600 (similar to encoding method 300) according to another embodiment of the present disclosure, which decides whether TU size estimation needs to be performed in an ACT-enabled intra prediction process. More specifically, encoding method 600 uses a threshold calculation on the CU size and decides, based on that calculation, whether TU size estimation needs to be performed.

At step 304, a component correlation analysis is performed on the original CU to determine whether ACT should be enabled or disabled. Step 304 is as described for encoding method 300. If the correlation between the color components of the CU is high, ACT is enabled and the flow proceeds to step 306 for the rough mode decision, followed by the RDO mode decision of step 310. Steps 306 and 310 are as described for encoding method 300. If the correlation found in step 304 is low, however, or the color space is determined to be a YUV color space, the ACT-enabled coding mode is still used and the flow proceeds directly to step 310 without performing the rough mode decision of step 306. ACT remains enabled for low-correlation pixel components and for the YUV color space in order to check whether further decorrelation of the pixel components yields an additional coding benefit.

At step 310, the RDO mode decision is computed as described for encoding method 300.

At step 312, the size of the CU currently being processed is evaluated, as described for encoding method 300, to determine whether it is smaller than the threshold T1. If the CU size is smaller than T1, the flow proceeds to step 314 for the TU size decision. If the CU size is equal to or greater than T1, however, the flow proceeds to step 316 and the TU size decision of step 314 is skipped. Steps 314 and 316 are as described for encoding method 300.

The thresholds T1 and T2 may be set to the same value or to different values.
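As a rough illustration of encoding method 600, the sketch below lists which steps would run for a given CU. The function name `plan_method_600` is hypothetical, and the sketch assumes that step 316 always follows, whether or not step 314 is run.

```python
def plan_method_600(cu_size: int, high_correlation: bool,
                    is_yuv: bool, t1: int) -> list[int]:
    """Return the step numbers executed after step 304 (hypothetical sketch).

    ACT stays enabled on every path of method 600; only the rough mode
    decision (306) and the TU size decision (314) are conditional.
    """
    steps = []
    if high_correlation and not is_yuv:
        steps.append(306)  # rough mode decision
    steps.append(310)      # RDO mode decision
    if cu_size < t1:
        steps.append(314)  # TU size decision, only below threshold T1
    steps.append(316)      # assumed to follow in both cases
    return steps


if __name__ == "__main__":
    print(plan_method_600(16, high_correlation=True, is_yuv=False, t1=32))
    print(plan_method_600(64, high_correlation=False, is_yuv=False, t1=32))
```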

A decoding method that performs the reverse of the steps of encoding methods 300, 400, 500, and 600 can efficiently decode video encoded by those methods. The foregoing description is therefore sufficient to understand decoding methods that reverse the steps of encoding methods 300, 400, 500, and 600, as well as the other decoding procedures needed to decode video encoded by those methods.

If a large CU uses IPM for screen visual content, this may indicate that the content of the region is not complex and that the TU size does not need to be estimated. Therefore, for IPM in non-444 chroma formats, TU partitioning is disabled for some large CUs. FIG. 7 illustrates the algorithm flow for IPM in non-444 chroma formats. Steps 306 and 310 are as described for encoding method 300. At step 310, the RDO mode decision is computed as described for encoding method 300.

At step 311, it is determined whether the chroma format is non-444. If the chroma format is non-444, the flow proceeds to step 312. Otherwise (the chroma format is 444), the flow proceeds to step 314 to perform the TU size decision.

At step 312, the size of the CU currently being processed is evaluated, as described for encoding method 300, to determine whether it is smaller than the threshold T1. If the CU size is smaller than T1, the flow proceeds to step 314 for the TU size decision. If the CU size is equal to or greater than T1, however, the flow proceeds to step 316 and the TU size decision of step 314 is skipped. Steps 314 and 316 are as described for encoding method 300.

The thresholds T1 and T2 may be set to the same value or to different values.
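The checks of steps 311 and 312 in FIG. 7 reduce to a small predicate. The following is a minimal sketch under stated assumptions: the function name and the string encoding of the chroma format are hypothetical, chosen only for illustration.

```python
def tu_size_decision_needed(chroma_format: str, cu_size: int, t1: int) -> bool:
    """Hypothetical sketch of steps 311 and 312 of FIG. 7."""
    if chroma_format == "4:4:4":
        # Step 311: a 444 picture goes straight to the TU size decision (314).
        return True
    # Step 312: for non-444 formats, only CUs smaller than T1 reach step 314.
    return cu_size < t1


if __name__ == "__main__":
    for fmt in ("4:4:4", "4:2:0"):
        for n in (16, 32, 64):
            print(fmt, n, tu_size_decision_needed(fmt, n, t1=32))
```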

FIG. 8 shows a system 700 for performing the encoding and decoding methods of the present disclosure. System 700 includes a non-transitory computer-readable medium 702, which may be a memory that stores a set of instructions. The instructions can be executed by a processor 704. Note that one or more non-transitory computer-readable media 702 and/or one or more processors 704 may be employed to perform the encoding and decoding methods of the present disclosure.

The non-transitory computer-readable medium 702 may be any type of non-transitory computer-readable storage medium (non-transitory CRM). A non-transitory computer-readable storage medium may include a floppy disk, a flexible disk, a hard disk, a hard drive, a solid state drive, magnetic tape, any other magnetic data storage medium, a CD-ROM, any other optical data storage medium, any physical medium with patterns of holes, random access memory (RAM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), FLASH-EPROM, any other flash memory, non-volatile memory (NVRAM), a cache, a register, a memory chip, a cartridge, and networked versions of the same. The computer-readable storage medium can store a set of instructions to be executed by at least one processor. The instructions cause the processor to perform the steps or stages of the encoding and decoding methods of the present disclosure. Furthermore, one or more computer-readable storage media may be used to implement the encoding and decoding methods of the present disclosure. The term "computer-readable storage medium" covers tangible objects and excludes carrier waves and transient signals.

The processor 704 may be any form of digital signal processor (DSP), application specific integrated circuit (ASIC), digital signal processing device (DSPD), programmable logic device (PLD), field programmable gate array (FPGA), controller, microcontroller, microprocessor, computer, or any other electronic component capable of performing the encoding and decoding methods of the present disclosure.

Experimental results

The experimental results of the encoding methods of the present disclosure are described below.

The experiments here use SCM 4.0, the HEVC SCC reference model, under common test conditions (CTC). The coding performance of the encoding methods of the present disclosure is compared with that of the HEVC reference model. The HEVC reference model takes encoding time A to encode, and the tested encoding method of the present disclosure takes encoding time B. The encoding time percentage is encoding time B divided by encoding time A. The experiments may follow the HEVC common test procedure. The test videos may mix text, graphics, motion, mixed content, animation, and camera-captured content. The videos may be in the RGB color space or the YUV color space at 720p, 1080p, or 1440p resolution. The experiments use all-intra prediction, random access, and low-B prediction under lossy conditions. All-intra prediction compresses a video picture using only information within the picture currently being compressed, whereas random access and low-B prediction compress a video picture using information from previously encoded pictures as well as the picture currently being compressed. In the following description, low-B prediction may also be referred to as low delay B prediction. In each experiment, encoding time and decoding time are recorded as percentages, which are the ratios of the tested method's encoding and decoding times to those of the reference model. For each of the G/Y, B/U, and R/V components, a positive percentage denotes a bit rate coding loss relative to the original video source and a negative percentage denotes a bit rate coding gain. For example, a value of 0.1% for the G/Y component means that the G/Y component of the encoded video has a coding loss of 0.1% relative to the G/Y component of the original video. Conversely, a value of -0.1% for the G/Y component means that the G/Y component of the encoded video has a coding gain of 0.1% relative to the G/Y component of the original video.
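The two conventions above (encoding time as a ratio to the reference model, and the sign of a per-component percentage) can be illustrated with a short sketch. The function names and the sample timings are hypothetical and are not measurements from the experiments.

```python
def encoding_time_percent(time_b: float, time_a: float) -> float:
    """Encoding time B of the tested method as a percentage of reference time A."""
    return 100.0 * time_b / time_a


def describe_rate_change(percent: float) -> str:
    """Interpret a per-component percentage: positive is a loss, negative a gain."""
    if percent > 0:
        return f"{percent:g}% bit rate coding loss"
    if percent < 0:
        return f"{-percent:g}% bit rate coding gain"
    return "no bit rate change"


if __name__ == "__main__":
    # Hypothetical timings: the tested encoder needs 97 s where the reference needs 100 s.
    print(encoding_time_percent(97.0, 100.0))
    print(describe_rate_change(0.1))
    print(describe_rate_change(-0.1))
```

An encoding time percentage below 100 therefore means the tested method is faster than the reference model.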

Refer to encoding method 500 of FIG. 5 and Table 1 below. For encoding method 500, the experiments are run under three settings. In Setting 1, the thresholds T2 and T1 are both set to 64. In Setting 2, T2 is set to 64 and T1 is set to 32. In Setting 3, T2 is set to 64 and T1 is set to 16. Intra prediction is the predetermined coding mode.

In Setting 1, when the pixel components have low correlation, CUs of size 64x64 or larger are encoded without ACT, and CUs smaller than 64x64 are encoded with ACT enabled. In addition, the TU size decision of step 314 is skipped for CUs of size 64x64 or larger and performed for CUs smaller than 64x64.

In Setting 2, when the pixel components have low correlation, CUs of size 64x64 or larger are encoded without ACT, and CUs smaller than 64x64 are encoded with ACT enabled. In addition, the TU size decision of step 314 is skipped for CUs of size 32x32 or larger and performed for CUs smaller than 32x32.

In Setting 3, when the pixel components have low correlation, CUs of size 64x64 or larger are encoded without ACT, and CUs smaller than 64x64 are encoded with ACT enabled. In addition, the TU size decision of step 314 is skipped for CUs of size 16x16 or larger and performed for CUs smaller than 16x16.
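The three settings differ only in T1, which the following sketch makes explicit. The dictionary layout and the helper `tools_for_low_correlation_cu` are hypothetical; the threshold values are the ones listed above, and the CU is assumed to have low component correlation.

```python
# Thresholds of the three settings described above (T2 for ACT, T1 for the TU size decision).
SETTINGS = {
    "setting 1": {"t2": 64, "t1": 64},
    "setting 2": {"t2": 64, "t1": 32},
    "setting 3": {"t2": 64, "t1": 16},
}


def tools_for_low_correlation_cu(cu_size: int, t2: int, t1: int) -> dict:
    """Which tools stay active for a low-correlation NxN CU (hypothetical helper)."""
    return {"act": cu_size < t2, "tu_size_decision": cu_size < t1}


if __name__ == "__main__":
    for name, thresholds in SETTINGS.items():
        for cu_size in (8, 16, 32, 64):
            print(name, cu_size, tools_for_low_correlation_cu(cu_size, **thresholds))
```

Lowering T1 from 64 to 16 removes the TU size decision from progressively more CU sizes, which is consistent with Setting 3 being described as the largest reduction in encoding complexity.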

Figure GDA0001970712540000201

Figure GDA0001970712540000211

Table 1

As shown in Table 1, coding performance improves under Setting 1, Setting 2, and Setting 3. Setting 1 reduces encoding complexity by 3%, Setting 2 by 6%, and Setting 3 by 9%, the largest reduction. All three settings therefore improve encoding efficiency, and each improves encoding time and efficiency with minimal bit rate loss.

Refer to encoding method 500 and Tables 2 and 3 below. Here, the experiments are run under all-intra, random access, and low delay B. In Experiment 1, the thresholds T2 and T1 are both set to 32. In Experiment 2, T2 and T1 are both set to 16. In accordance with encoding method 500, in Experiment 1 TU evaluation is disabled for CUs of size 32x32 or larger, and CUs of size 32x32 or larger are encoded without ACT. In Experiment 2, TU evaluation is disabled for CUs of size 16x16 or larger, and CUs of size 16x16 or larger are encoded without ACT. CUs smaller than 16x16 are encoded with ACT enabled. The experiments are run under lossy conditions with full-frame intra block copy.

Figure GDA0001970712540000221

Figure GDA0001970712540000231

Figure GDA0001970712540000232

Figure GDA0001970712540000241

Table 2

As shown in Table 2, in Experiment 1 the all-intra mode reduces encoding complexity by 5%, while random access and low delay B each reduce it by 1%. Every configuration shows very low bit rate loss; all-intra and random access leave the bit rate almost unchanged.

In Experiment 2, the all-intra mode reduces encoding complexity by 8%, random access reduces it by 1%, and low delay B leaves it unchanged. Compared with Experiment 1, each mode shows more bit rate loss, but the loss remains minimal (a fraction of a percent). Compared with the original video, the bit rate of the encoded video is reduced only slightly, so only a small amount of video quality is lost. Because encoding method 500 improves encoding time, this video quality is acceptable for most applications.

Figure GDA0001970712540000242

Figure GDA0001970712540000251

Figure GDA0001970712540000261

Figure GDA0001970712540000262

Figure GDA0001970712540000271

Figure GDA0001970712540000281

Table 3

As shown in Table 3, in both Experiment 1 and Experiment 2 no mode changes the bit rate, either in total or on average. The all-intra mode shows the largest reduction in encoding complexity (1% in each experiment).

Refer to encoding method 500 of FIG. 5 and Table 4 below. Here, the experiments are run under lossy conditions with 4-CTU intra block copy and the 4:4:4 chroma mode. Intra block copy uses a motion vector to copy a block from a previously coded CU within the video picture currently being coded. "4-CTU" indicates the range the motion vector can search.
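The block-copy idea can be illustrated with a toy example on a small picture held as a list of rows. This is a simplified sketch, not encoder code: the function name is hypothetical, the 4-CTU search range is not modeled, and the only validity checks are that the source block lies inside the picture and does not start at or after the current block in raster order.

```python
def intra_block_copy(picture, x, y, size, mv_x, mv_y):
    """Predict the size x size block at (x, y) from an earlier block of the same picture.

    (mv_x, mv_y) is the displacement from the current block to the source block.
    """
    src_x, src_y = x + mv_x, y + mv_y
    height, width = len(picture), len(picture[0])
    if not (0 <= src_x <= width - size and 0 <= src_y <= height - size):
        raise ValueError("source block lies outside the picture")
    if (src_y, src_x) >= (y, x):
        raise ValueError("source block must precede the current block")
    # Copy the source samples row by row to form the prediction.
    return [row[src_x:src_x + size] for row in picture[src_y:src_y + size]]


if __name__ == "__main__":
    picture = [
        [1, 2, 3, 4],
        [5, 6, 7, 8],
        [9, 10, 11, 12],
        [13, 14, 15, 16],
    ]
    # Predict the bottom-right 2x2 block from the top-left 2x2 block.
    print(intra_block_copy(picture, x=2, y=2, size=2, mv_x=-2, mv_y=-2))
```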

In Experiment 1, the thresholds T2 and T1 are both set to 32. In Experiment 2, T2 and T1 are both set to 16. In accordance with encoding method 500, TU evaluation is disabled for CUs of size 32x32 or larger in Experiment 1 and for CUs of size 16x16 or larger in Experiment 2. In Experiment 1, ACT is enabled for CUs smaller than 32x32 and disabled for CUs of size 32x32 or larger. In Experiment 2, ACT is enabled for CUs smaller than 16x16 and disabled for CUs of size 16x16 or larger.

Figure GDA0001970712540000282

Figure GDA0001970712540000291

Figure GDA0001970712540000301

Figure GDA0001970712540000302

Figure GDA0001970712540000311

Table 4

As shown in Table 4, in both Experiment 1 and Experiment 2 the all-intra, random access, and low delay B modes all show minimal bit rate change. The all-intra mode shows the largest reduction in encoding complexity: 5% in Experiment 1 and 8% in Experiment 2.

Refer to encoding method 400 of FIG. 4 and Tables 5.1 and 5.2 below. Here, the threshold T2 is set to 64. When the component correlation analysis of step 304 finds that the color components of the CU have low correlation, step 402 is therefore performed to determine whether the CU size is smaller than 64x64. If the CU size is smaller than 64x64, ACT is enabled and the RDO mode decision of step 310 is performed. If the CU size is greater than or equal to 64x64, ACT is disabled and the flow proceeds to step 308. Experiment 1 uses the lossy all-intra encoding mode with full-frame intra block copy, and Experiment 2 uses the lossy all-intra encoding mode with 4-CTU intra block copy. The chroma mode is 4:4:4 in both experiments.

Figure GDA0001970712540000312

Table 5.1: Experiment 1

Figure GDA0001970712540000313

Table 5.2: Experiment 2

As shown in Table 5.1, in the YUV color space under lossy all-intra coding with full-frame intra block copy, encoding method 400 reduces encoding time by 1% to 3% with minimal bit rate loss. As shown in Table 5.2, under lossy all-intra coding with 4-CTU intra block copy, encoding method 400 reduces encoding time by a similar proportion to Experiment 1 of Table 5.1, again with minimal bit rate loss.

Refer to encoding method 400 and Table 6 below. Here, the threshold T2 is set to 64. Lossless intra encoding is performed in the 4:4:4 chroma mode.

Figure GDA0001970712540000321

Table 6

In the YUV color space, encoding method 400 saves 0% to 2% of encoding time.

Refer to encoding method 300 of FIG. 3 and Table 7 below. Here, the threshold T1 is set to 32 in Experiment 1 and to 16 in Experiment 2. In accordance with encoding method 300, in Experiment 1 the TU size decision of step 314 is skipped when the CU size is greater than or equal to 32x32 and performed when the CU size is smaller than 32x32. In Experiment 2, the TU size decision of step 314 is skipped when the CU size is greater than or equal to 16x16 and performed when the CU size is smaller than 16x16. The experiments use ACT-enabled lossy all-intra encoding.

Figure GDA0001970712540000322

Figure GDA0001970712540000331

Table 7

Experiment 1 saves 3% to 6% of encoding time, and Experiment 2 saves 6% to 10%. Allowing the TU size decision only for CUs smaller than 32x32 or 16x16 therefore helps encoding efficiency.
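One way to see why the lower threshold saves more time is to count how many CUs still reach the TU size decision. The sketch below uses a made-up list of CU sizes purely for illustration; it is not data from Table 7, and the function name is hypothetical.

```python
def count_tu_size_decisions(cu_sizes, t1):
    """Number of CUs for which step 314 still runs (CU size smaller than T1)."""
    return sum(1 for n in cu_sizes if n < t1)


if __name__ == "__main__":
    # Hypothetical mix of CU sizes within one picture region.
    cu_sizes = [8, 8, 16, 16, 32, 64]
    for t1 in (32, 16):
        print(t1, count_tu_size_decisions(cu_sizes, t1))
```

For this hypothetical mix, T1 = 32 leaves four CUs with a TU size decision and T1 = 16 leaves two, so fewer decisions are evaluated at the lower threshold.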

The foregoing is presented to illustrate the techniques of the present disclosure and is not intended to limit the invention. Modifications and adaptations of the embodiments fall within the scope of the present disclosure. For example, the disclosed embodiments include both software and hardware, but the systems and methods of the present disclosure may be implemented in hardware alone.

A software developer can write computer programs based on the methods of the present disclosure using a variety of programming techniques. For example, program segments or program modules may be written in Java, C, C++, assembly language, or any other programming language. One or more such software segments or modules may be integrated into a computer system, a non-transitory computer-readable medium, or existing communications software.

Furthermore, although various embodiments have been disclosed above, the scope of the present disclosure includes equivalents, modifications, omissions, combinations (for example, across different embodiments), adaptations, and alterations of the various elements. The elements of the claims are to be interpreted in their broadest sense and are not limited to the content of the embodiments. In addition, the steps of the methods may be modified, including by reordering, inserting, or deleting steps. Although the present disclosure has been described above by way of preferred embodiments, these are not intended to limit the disclosure. The scope of protection of the present disclosure is defined by the appended claims.

Those skilled in the art to which the invention pertains will also recognize other embodiments from the description of the present disclosure. The scope of the present disclosure includes variations, implementations, and applications that draw on general knowledge. The description and embodiments are examples only, and the scope of protection of the present disclosure is defined by the appended claims.

Claims (14)

1.一种视频编码方法,包括:1. A video coding method, comprising: 接收原始视频画面;Receive the original video picture; 分割该原始视频画面为编码树单元;dividing the original video picture into a coding tree unit; 从该编码树单元决定编码单元;determine a coding unit from the coding tree unit; 启用该编码单元的自适性色彩转换的编码模式,其中该编码单元的尺寸为NxN;a coding mode that enables adaptive color conversion of the coding unit, wherein the size of the coding unit is NxN; 判断N是否小于一第一临界值;determining whether N is less than a first critical value; 当N小于该第一临界值,决定一转换单元的尺寸且决定该编码单元的色度模式;以及When N is less than the first threshold, determining the size of a conversion unit and determining the chroma mode of the coding unit; and 当N不小于该第一临界值,不决定该转换单元的尺寸且决定该编码单元的色度模式。When N is not less than the first threshold, the size of the conversion unit is not determined and the chroma mode of the coding unit is determined. 2.如权利要求1所述的方法,还包括:2. The method of claim 1, further comprising: 判断该编码单元的色彩空间;Determine the color space of the coding unit; 其中判断该编码单元的该色彩空间包括判断该色彩空间是否为红色、绿色及蓝色色彩空间或亮度及色度色彩空间。The determining of the color space of the coding unit includes determining whether the color space is a red, green and blue color space or a luminance and chrominance color space. 3.如权利要求2所述的方法,还包括:3. The method of claim 2, further comprising: 若启用的该编码模式为启用自适性色彩转换的帧内预测模式,当该色彩空间被判断为该红色、绿色及蓝色色彩空间时,执行成本模式决定。If the enabled encoding mode is an adaptive color conversion enabled intra prediction mode, the cost mode determination is performed when the color space is determined to be the red, green and blue color spaces. 4.如权利要求2所述的方法,还包括:4. The method of claim 2, further comprising: 若启用的该编码模式为启用自适性色彩转换的帧内预测模式,当该色彩空间被判断为亮度及色度色彩空间时,执行成本模式决定。If the enabled encoding mode is an adaptive color conversion-enabled intra prediction mode, the cost mode determination is performed when the color space is determined to be a luma and chroma color space. 5.如权利要求2所述的方法,还包括:5. 
The method of claim 2, further comprising:
determining whether N is less than a second threshold; and
when the color space is determined to be the luminance and chrominance color space and N is greater than or equal to the second threshold, disabling the coding mode of the coding unit.

6. The method of claim 2, further comprising:
determining whether N is less than the second threshold; and
when the color space is determined to be the luminance and chrominance color space and N is less than the second threshold, enabling the coding mode, wherein the coding mode enables adaptive color transform.

7. The method of claim 2, further comprising:
determining whether N is greater than or equal to a second threshold; and
when the color space is determined to be the luminance and chrominance color space and N is not greater than or equal to the second threshold, enabling the coding mode, wherein the coding mode enables adaptive color transform.

8. The method of claim 1, further comprising:
if the original video frame is non-4:4:4 and N is less than the first threshold, evaluating the size of the transform unit.

9. A video encoding system, comprising:
a memory for storing a set of instructions; and
a processor for executing the set of instructions, the set of instructions comprising:
receiving an original video frame;
partitioning the original video frame into coding tree units;
determining a coding unit from the coding tree unit;
enabling a coding mode with adaptive color transform for the coding unit, wherein the size of the coding unit is NxN;
determining whether N is less than a first threshold;
when N is less than the first threshold, determining the size of a transform unit and determining the chroma mode of the coding unit; and
when N is not less than the first threshold, not determining the size of the transform unit and determining the chroma mode of the coding unit.

10. The system of claim 9, wherein the set of instructions executed by the processor further comprises:
determining the color space of the coding unit;
wherein it is determined whether the color space is a red, green and blue color space or a luminance and chrominance color space.

11. The system of claim 10, wherein the set of instructions executed by the processor further comprises:
determining whether N is less than a second threshold; and
when the color space is determined to be the luminance and chrominance color space and N is less than the second threshold, enabling the coding mode, wherein the coding mode enables adaptive color transform.

12. The system of claim 10, wherein the set of instructions executed by the processor further comprises:
determining whether N is greater than or equal to a second threshold; and
when the color space is determined to be the luminance and chrominance color space and N is not greater than or equal to the second threshold, enabling the coding mode, wherein the coding mode enables adaptive color transform.

13. The system of claim 9, wherein the set of instructions executed by the processor further comprises:
if the original video frame is non-4:4:4 and N is less than the first threshold, evaluating the size of the transform unit.

14. A non-transitory computer-readable recording medium storing a set of instructions executable by one or more processors to perform a video encoding method, wherein the video encoding method comprises:
receiving an original video frame;
partitioning the original video frame into coding tree units;
determining a coding unit from the coding tree unit;
enabling a coding mode with adaptive color transform for the coding unit, wherein the size of the coding unit is NxN;
determining whether N is less than a first threshold;
when N is less than the first threshold, determining the size of a transform unit and determining the chroma mode of the coding unit; and
when N is not less than the first threshold, not determining the size of the transform unit and determining the chroma mode of the coding unit.
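
The size-based decisions recited in the claims above can be illustrated with a minimal Python sketch. It is not the patented encoder: the names `ColorSpace`, `act_mode_enabled`, and `decisions_in_act_mode` are hypothetical, the threshold values are left to the caller because the claims do not fix them, and the treatment of red, green and blue content (no size restriction) is an assumption made only for illustration.

```python
from enum import Enum
from typing import List


class ColorSpace(Enum):
    RGB = "rgb"  # red, green and blue color space
    YUV = "yuv"  # luminance and chrominance color space


def act_mode_enabled(n: int, color_space: ColorSpace, second_threshold: int) -> bool:
    """Gate the adaptive-color-transform coding mode for an NxN coding unit."""
    if color_space is ColorSpace.YUV:
        # Luminance/chrominance content: enabled only when N is less than
        # the second threshold, disabled when N is greater than or equal to it.
        return n < second_threshold
    # Assumption: no size restriction is applied to RGB content in this sketch.
    return True


def decisions_in_act_mode(n: int, first_threshold: int) -> List[str]:
    """List the decisions made for an NxN coding unit inside the coding mode."""
    if n < first_threshold:
        # Small coding unit: determine both the transform unit size and the chroma mode.
        return ["transform_unit_size", "chroma_mode"]
    # N is not less than the first threshold: the transform unit size is not
    # determined, and only the chroma mode is decided.
    return ["chroma_mode"]
```

For example, with a hypothetical second threshold of 16, a 16x16 luminance and chrominance coding unit would not be evaluated in the adaptive color transform mode, while an 8x8 one would. Skipping the transform unit size decision for larger coding units is how the claims reduce the number of encoder decisions.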
CN201610357374.8A 2015-06-08 2016-05-26 Video encoding method, system and computer-readable recording medium using adaptive color conversion Active CN106254870B (en)

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US201562172256P 2015-06-08 2015-06-08
US62/172,256 2015-06-08
US14/757,556 US20160360205A1 (en) 2015-06-08 2015-12-24 Video encoding methods and systems using adaptive color transform
US14/757,556 2015-12-24
US201662290992P 2016-02-04 2016-02-04
US62/290,992 2016-02-04
TW105114323 2016-05-09
TW105114323A TWI597977B (en) 2015-06-08 2016-05-09 Video encoding methods and systems using adaptive color transform

Publications (2)

Publication Number Publication Date
CN106254870A CN106254870A (en) 2016-12-21
CN106254870B CN106254870B (en) 2020-08-18

Family

ID=57626642

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610357374.8A Active CN106254870B (en) 2015-06-08 2016-05-26 Video encoding method, system and computer-readable recording medium using adaptive color conversion

Country Status (1)

Country Link
CN (1) CN106254870B (en)

Families Citing this family (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106851272B (en) * 2017-01-20 2019-11-12 杭州当虹科技股份有限公司 Method for HDR and SDR adaptive rate control
US10820017B2 (en) * 2017-03-15 2020-10-27 Mediatek Inc. Method and apparatus of video coding
WO2019076138A1 (en) 2017-10-16 2019-04-25 Huawei Technologies Co., Ltd. METHOD AND APPARATUS FOR ENCODING
CN108174214A (en) * 2017-12-08 2018-06-15 重庆邮电大学 A Remote Desktop Sharing Method Based on Screen Content Video Coding
CN111758255A (en) 2018-02-23 2020-10-09 华为技术有限公司 Position dependent spatially varying transforms for video coding
IL279095B2 (en) 2018-05-31 2024-12-01 Huawei Tech Co Ltd Spatially varying transform with a type of adaptive transform
CN117579830A (en) 2019-06-21 2024-02-20 北京字节跳动网络技术有限公司 Selective use of adaptive in-loop color space conversion and other codec tools
WO2021018082A1 (en) * 2019-07-26 2021-02-04 Beijing Bytedance Network Technology Co., Ltd. Determination of picture partition mode based on block size
CN115152219A (en) 2019-11-07 2022-10-04 抖音视界有限公司 Quantization characteristics of adaptive in-loop color space transforms for video coding
TWI743919B (en) * 2020-08-03 2021-10-21 緯創資通股份有限公司 Video processing apparatus and processing method of video stream

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US20130136180A1 (en) * 2011-11-29 2013-05-30 Futurewei Technologies, Inc. Unified Partitioning Structures and Signaling Methods for High Efficiency Video Coding

Non-Patent Citations (3)

Title
AHG6: On Adaptive Color Transform (ACT) in SCM2.0; PoLin Lai, Shan Liu, Shawmin Lei; "19.JCT-VC MEETING"; 2014-10-08; Abstract, Sections 2-4 *
High Efficiency Video Coding (HEVC) Test Model 16 (HM 16) Encoder Description; K. McCann; "18.JCT-VC MEETING"; 2014-10-14; Sections 4.1.3 and 4.1.5, Figures 4-4 and 4-6 *
Screen content coding test model 2 (SCM 2); Rajan Joshi; "17.JCT-VC MEETING"; 2014-10-17; entire document *

Also Published As

Publication number Publication date
CN106254870A (en) 2016-12-21

Similar Documents

Publication Publication Date Title
CN106254870B (en) Video encoding method, system and computer-readable recording medium using adaptive color conversion
CN113812148B (en) Reference picture resampling and inter coding tools for video coding
TWI839662B (en) Video decoding method
US20160360205A1 (en) Video encoding methods and systems using adaptive color transform
EP3598758B1 (en) Encoder decisions based on results of hash-based block matching
US10390020B2 (en) Video encoding methods and systems using adaptive color transform
KR102606414B1 (en) Encoder, decoder and corresponding method to derive edge strength of deblocking filter
JP6355715B2 (en) Encoding device, decoding device, encoding method, decoding method, and program
CN112385233B (en) Intra-frame smoothing (MDIS) of combined dependent modes and interpolation filter switching with position dependent intra-frame prediction combining (PDPC)
WO2015138954A1 (en) Color-space inverse transform both for lossy and lossless encoded video
CN113545051B (en) Reconstruction of blocks of video data using block size restriction
GB2582929A (en) Residual signalling
JP7592968B2 (en) Encoder, decoder and corresponding method of deblocking filter adaptation
CA2878440A1 (en) Restricted intra deblocking filtering for video coding
CN112673636A (en) Rounding motion vectors to adaptive motion vector difference resolution and improve motion vector storage accuracy in video coding
CN113330743A (en) Encoder, decoder and corresponding method for deblocking filter adaptation
CN114598873B (en) Decoding method and device for quantization parameter
EP3104606B1 (en) Video encoding methods and systems using adaptive color transform
TWI597977B (en) Video encoding methods and systems using adaptive color transform
WO2015142618A1 (en) Systems and methods for low complexity encoding and background detection
US20150117521A1 (en) Method and apparatus for inter color component prediction
CN111327899A (en) Video decoder and corresponding method
KR20230123947A (en) Adaptive loop filter with fixed filters
EP2899975A1 (en) Video encoder with intra-prediction pre-processing and methods for use therewith
RU2786626C2 (en) Method and device for image separation

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant