WO2022227082A1 - Block division methods, encoders, decoders, and computer storage medium - Google Patents
- Publication number
- WO2022227082A1 (PCT/CN2021/091736)
- Authority
- WO
- WIPO (PCT)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/119—Adaptive subdivision aspects, e.g. subdivision of a picture into rectangular or non-rectangular coding blocks
Definitions
- the embodiments of the present application relate to the technical field of video coding and decoding, and in particular, to a block division method, an encoder, a decoder, and a computer storage medium.
- H.265/High Efficiency Video Coding (HEVC) has been unable to meet the needs of the rapid development of video applications.
- JVET Joint Video Exploration Team
- VTM VVC's reference software test platform
- QTMT Quad Tree with nested Multi-type Tree
- the prediction and rate-distortion cost calculation of large-size coding blocks may generate a lot of unnecessary overhead, waste computing resources, and increase coding time.
- Embodiments of the present application provide a block division method, an encoder, a decoder, and a computer storage medium, which can reduce coding complexity and further improve coding and decoding efficiency.
- an embodiment of the present application provides a block division method, which is applied to an encoder, and the method includes:
- the current block is encoded according to the block partition parameter.
- an embodiment of the present application provides a block division method, which is applied to a decoder, and the method includes:
- the code stream is parsed to determine the predicted value of the current block
- the code stream is parsed to determine the residual value of the current block
- the reconstructed value of the current block is determined.
- an embodiment of the present application provides an encoder, the encoder includes a first determination unit, a block division unit, and an encoding unit; wherein,
- a first determining unit configured to determine the maximum unit size information of the current block based on the texture information of the video image
- a block division unit configured to preprocess the current block according to the maximum unit size information, to determine a division mode of the current block; and to determine a block division parameter of the current block according to the division mode;
- an encoding unit configured to encode the current block according to the block division parameter.
- an embodiment of the present application provides an encoder, where the encoder includes a first memory and a first processor; wherein,
- a first memory for storing a computer program executable on the first processor
- the first processor is configured to execute the method according to the first aspect when running the computer program.
- an embodiment of the present application provides a decoder, where the decoder includes a parsing unit and a second determining unit; wherein,
- a parsing unit configured to parse the code stream, and determine the block division parameters of the current block
- the parsing unit is further configured to parse the code stream based on the block division parameter to determine the predicted value of the current block; and based on the block division parameter, parse the code stream to determine the residual value of the current block;
- the second determination unit is configured to determine the reconstruction value of the current block based on the predicted value and the residual value.
- an embodiment of the present application provides a decoder, the decoder includes a second memory and a second processor; wherein,
- a second memory for storing a computer program executable on the second processor
- the second processor is configured to execute the method according to the second aspect when running the computer program.
- an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program, and when the computer program is executed, the method described in the first aspect or the method described in the second aspect is implemented.
- the embodiments of the present application provide a block division method, an encoder, a decoder, and a computer storage medium.
- the maximum unit size information of the current block is determined;
- the block is preprocessed to determine the division mode of the current block;
- the block division parameter of the current block is determined according to the division mode; and
- the current block is encoded according to the block division parameter.
- the code stream is parsed to determine the block division parameters of the current block; based on the block division parameters, the code stream is parsed to determine the predicted value of the current block; based on the block division parameters, the code stream is parsed to determine the residual value of the current block; and, based on the predicted value and the residual value, the reconstructed value of the current block is determined.
- the maximum unit size information of the current block is determined according to the texture information of the video image; that is, the technical solution of the present application designs an adaptive maximum unit size mechanism based on image texture, which can directly skip the computation of the prediction and transform processes for large-size blocks.
- this leads to an exponential decrease in the total number of block-division recursions, which significantly reduces coding complexity and coding time while keeping the performance gain basically unchanged, thereby improving coding and decoding efficiency.
- FIG. 1 is a schematic structural diagram of a multi-type tree provided by the related art
- FIG. 2 is a schematic flowchart of a block division provided by the related art
- FIG. 3 is a schematic structural diagram of another block division provided by the related art.
- FIG. 4A is a schematic block diagram of the composition of an encoder according to an embodiment of the present application.
- FIG. 4B is a schematic block diagram of the composition of a decoder according to an embodiment of the present application.
- FIG. 5 is a schematic flowchart of a block division method provided by an embodiment of the present application.
- FIG. 6 is a schematic flowchart of determining maximum unit size information according to an embodiment of the present application.
- FIG. 7 is a detailed schematic flow chart of determining maximum unit size information according to an embodiment of the present application.
- FIG. 8 is a schematic flowchart of another block division method provided by an embodiment of the present application.
- FIG. 10 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the application.
- FIG. 11 is a schematic diagram of the composition and structure of a decoder provided by an embodiment of the application.
- FIG. 12 is a schematic diagram of a specific hardware structure of a decoder provided by an embodiment of the present application.
- a first image component, a second image component, and a third image component are generally used to represent a coding block (Coding Block, CB); the three image components are a luminance component, a blue chrominance component, and a red chrominance component. Specifically, the luminance component is usually represented by the symbol Y, the blue chrominance component by the symbol Cb or U, and the red chrominance component by the symbol Cr or V; in this way, the video image can be represented in the YCbCr format or in the YUV format.
- CB coding block
- MPEG Moving Picture Experts Group
- JVET Joint Video Experts Team
- VVC Versatile Video Coding
- VTM VVC Test Model, VVC's reference software test platform
- HPM High-Performance Model
- QT Quad Tree
- QTMT Quad Tree with nested Multi-type Tree
- QP Quantization Parameter
- CU Coding Unit
- FIG. 1 shows a schematic structural diagram of a multi-type tree provided by the related art.
- the CTU is firstly divided by the quadtree, and then the leaf nodes of the quadtree can be further divided by the MT.
- the flow of CU division is shown in FIG. 2 .
- for a CTU/quadtree node, it is first determined whether to perform quadtree division; if the value of the identification information (such as a flag) is 1, quadtree division is performed and quadtree nodes are obtained.
- otherwise, a quadtree leaf node/multi-type tree node is obtained, and it is then determined whether to perform multi-type tree division, until the division yields a multi-type tree node produced by vertical binary division, vertical ternary division, horizontal binary division, or horizontal ternary division.
- QT nodes can be divided according to the 5 ways shown in Figure 2, and MT nodes can be divided according to the 4 ways shown in Figure 2.
- the results of the five division methods of QT nodes include: a quad-tree node, a multi-type tree node divided by a vertical binary tree, a multi-type tree node divided by a vertical ternary tree, a multi-type tree node divided by a horizontal binary tree, and a multi-type tree node divided by a horizontal ternary tree;
- the results of the four division methods of MT nodes include: a multi-type tree node divided by a vertical binary tree, a multi-type tree node divided by a vertical ternary tree, a multi-type tree node divided by a horizontal binary tree, and a multi-type tree node divided by a horizontal ternary tree;
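The flag-driven decision flow described above can be sketched as follows. This is a minimal illustration only: the flag names (`qt_flag`, `mtt_flag`, `vertical_flag`, `binary_flag`) and the `Split` enum are hypothetical and are not taken from the application; the sketch simply assumes the flags have already been obtained.

```python
from enum import Enum


class Split(Enum):
    NONE = "no split"
    QT = "quadtree"
    BT_VER = "vertical binary"
    BT_HOR = "horizontal binary"
    TT_VER = "vertical ternary"
    TT_HOR = "horizontal ternary"


def decide_split(is_qt_node, qt_flag, mtt_flag, vertical_flag, binary_flag):
    """Map already-available flags (hypothetical names) to a split mode."""
    # Quadtree division is only considered for CTU/quadtree nodes.
    if is_qt_node and qt_flag == 1:
        return Split.QT
    # Otherwise the node is a quadtree leaf / multi-type tree node.
    if mtt_flag == 0:
        return Split.NONE
    if vertical_flag == 1:
        return Split.BT_VER if binary_flag == 1 else Split.TT_VER
    return Split.BT_HOR if binary_flag == 1 else Split.TT_HOR
```

A QT node can therefore end in five results (one quadtree result plus four multi-type tree results), while an MT node, for which `is_qt_node` is false, can only reach the four multi-type tree results.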
- FIG. 3 shows a schematic structural diagram of a QTMT block division provided by the related art, which can be regarded as a specific example of a final CTU division manner of VVC.
- the QTMT block division is located in the intra/inter prediction module. For each candidate block division, the corresponding reference blocks are found for prediction, and the division mode with the least rate-distortion cost is then selected to obtain the final prediction residual, after which subsequent steps such as transform and quantization can be performed.
- an implementation method of an encoder using QTMT is as follows. First, the input image is divided into multiple non-overlapping CTU blocks. Then, each CTU is processed in turn in raster scan order and divided into several CUs, which mainly includes the following four steps: (1) calculate the first rate-distortion cost result of predictive coding when the CTU is not divided (denoted RdCost0); (2) divide the CTU according to the QT mode, perform predictive coding, and calculate the second rate-distortion cost result (denoted RdCost1); (3) compare RdCost0 and RdCost1, and if RdCost1 is smaller, continue to process the 4 sub-CUs in sequence; (4) for each CU, perform partition prediction in QT or MT mode, calculate the rate-distortion cost result of each division method, select the current optimal RdCost by comparison, and repeat recursively until the block division mode with the smallest rate-distortion cost is selected. Finally, the residual block
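The first stage of the encoder flow above, tiling the picture into non-overlapping CTUs and visiting them in raster scan order, can be sketched as follows. The function name and the default CTU size of 128 are assumptions made for illustration; the text does not fix a CTU size.

```python
def ctu_raster_scan(width, height, ctu_size=128):
    """Yield (x, y, w, h) for each CTU, row by row and left to right.

    ctu_size=128 is an assumed default for illustration only.
    """
    for y in range(0, height, ctu_size):
        for x in range(0, width, ctu_size):
            # CTUs on the right/bottom border are cropped to the picture edge.
            yield (x, y, min(ctu_size, width - x), min(ctu_size, height - y))
```

For example, `list(ctu_raster_scan(256, 192))` gives two full 128x128 CTUs in the first row, followed by two 128x64 CTUs in the cropped second row.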
- an implementation method of a decoder using QTMT is as follows. First, entropy decoding, inverse quantization, and inverse transformation are performed on the input code stream to obtain a residual block. Then, an image is reconstructed according to the residual block; the reconstruction process mainly includes the following three steps: (1) determine the partition tree of the current CTU according to prediction information such as the block partition mode; (2) process each CU of the partition tree in turn in raster scan order, and use the motion vector and other information to find the prediction block; (3) superimpose the residual value and the predicted value of the current CU to obtain the reconstructed CU. Finally, the reconstructed image is sent to the Deblocking Filter (DBF)/Sample Adaptive Offset (SAO) filter/Adaptive Loop Filter (ALF), and the filtered image is sent to the buffer to wait for video playback.
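Step (3) of the decoder flow above, superimposing the residual value on the predicted value, can be sketched as follows. Clipping the sum to the valid sample range for a given bit depth is an assumption added here (it is common codec practice but is not stated in the text), and the function name and the default `bit_depth=10` are hypothetical.

```python
def reconstruct_block(pred, resid, bit_depth=10):
    """Add residual samples to predicted samples, row by row.

    Clipping to [0, 2**bit_depth - 1] is an assumption, not from the text.
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```

With 10-bit samples, a predicted value of 1000 and a residual of 50 would be clipped to 1023, and a predicted value of 5 with a residual of -10 would be clipped to 0.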
- DBF Deblocking Filter
- SAO Sample Adaptive Offset
- ALF Adaptive Loop Filter
- the embodiment of the present application provides a block division method.
- the maximum unit size information of the current block is determined; the current block is preprocessed according to the maximum unit size information to determine the division mode of the current block; the block division parameter of the current block is determined according to the division mode; and the current block is encoded according to the block division parameter.
- the code stream is parsed to determine the block division parameters of the current block; based on the block division parameters, the code stream is parsed to determine the predicted value of the current block; based on the block division parameters, the code stream is parsed to determine the residual value of the current block; and, based on the predicted value and the residual value, the reconstructed value of the current block is determined.
- the technical solution of the present application designs an adaptive maximum unit size mechanism based on image texture, which can directly skip the computation of the prediction and transform processes for large-size blocks.
- this leads to an exponential decrease in the total number of block-division recursions, which significantly reduces coding complexity and coding time while keeping the performance gain basically unchanged, thereby improving coding and decoding efficiency.
- the encoder 10 includes a transform and quantization unit 101, an intra-frame estimation unit 102, an intra-frame prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, and a filter control unit.
- a video coding block can be obtained by dividing the coding tree unit (Coding Tree Unit, CTU), and the residual pixel information obtained after intra-frame or inter-frame prediction is then processed by the transform and quantization unit 101.
- the video coding block is transformed, which includes transforming the residual information from the pixel domain to the transform domain and quantizing the resulting transform coefficients to further reduce the bit rate;
- the intra-frame estimation unit 102 and the intra-frame prediction unit 103 are used to perform intra prediction on the video coding block; specifically, they are used to determine the intra prediction mode to be used to encode the video coding block;
- the motion compensation unit 104 and the motion estimation unit 105 are used to perform inter-predictive encoding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information;
- the motion estimation performed by the motion estimation unit 105 is the process of generating a motion vector.
- the motion vector can estimate the motion of the video coding block, and the motion compensation unit 104 then performs motion compensation based on the motion vector determined by the motion estimation unit 105; after determining the intra prediction mode, the intra-frame prediction unit 103 also provides the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 also sends the calculated motion vector data to the encoding unit 109; in addition, the inverse transform and inverse quantization unit 106 is used to reconstruct the video coding block: the residual block is reconstructed in the pixel domain, blocking artifacts are removed from it by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image buffer unit 110 to generate a reconstructed video coding block; the encoding unit 109 is used to encode various coding parameters and quantized transform coefficients.
- the context content can be based on adjacent coding blocks and can be used to encode information indicating the determined intra-frame prediction mode and to output the code stream of the video signal; the decoded image buffer unit 110 is used to store the reconstructed video coding blocks for prediction reference. As the video image coding proceeds, new reconstructed video coding blocks are continuously generated, and these reconstructed video coding blocks are all stored in the decoded image buffer unit 110.
- the decoder 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image buffer unit 206, etc.; the decoding unit 201 can implement header decoding and CABAC decoding, and the filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering.
- the code stream of the video signal is output; the code stream is input into the video decoding system 20 and first passes through the decoding unit 201 to obtain the decoded transform coefficients; the transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate residual blocks in the pixel domain; the intra prediction unit 203 may be used to generate prediction data for the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture; the motion compensation unit 204 determines prediction information for the video decoding block by parsing the motion vector and other associated syntax elements, and uses the prediction information to generate a predictive block for the video decoding block being decoded; a decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block produced by the intra prediction unit 203 or the motion compensation unit 204; the decoded video signal passes through the filtering unit 205 to remove blocking artifacts, which may improve video quality; decoded video blocks are then stored in the decoded image buffer unit 206, which stores reference
- the block division method in the embodiment of the present application can be applied to a video codec chip, and the use of the QTMT mode can significantly improve the coding performance.
- it can be applied to the intra/inter prediction part shown in FIG. 4A (represented by a bold black box, specifically including the intra-frame estimation unit 102, the intra-frame prediction unit 103, the motion compensation unit 104, and the motion estimation unit 105); it can also be applied to the intra/inter prediction part shown in FIG. 4B (represented by a bold black box, specifically including the intra prediction unit 203 and the motion compensation unit 204).
- the block division method in the embodiments of the present application can be applied to a video encoding system (referred to as "encoder" for short), to a video decoding system (referred to as "decoder" for short), or even to both at the same time, which is not limited here.
- in the encoder, the "current block" specifically refers to the block currently to be encoded in the video image (it may also be referred to as an "encoding block" for short);
- in the decoder, the "current block" specifically refers to the block currently to be decoded in the video image (it may also be referred to as a "decoding block" for short).
- FIG. 5 shows a schematic flowchart of a block division method provided by an embodiment of the present application. As shown in Figure 5, the method may include:
- S501 Determine maximum unit size information of the current block based on the texture information of the video image.
- the block division method in the embodiment of the present application is applied to an encoder.
- the video image can be divided into a plurality of image blocks, and each image block to be encoded can be called an encoding block, and the current block here specifically refers to the encoding block currently to be encoded. It may be a CTU, or even a CU, etc., which is not limited in any embodiment of the present application.
- the embodiments of the present application mainly provide a fast block division technology for high bit depth video based on texture analysis, that is, applied to high bit depth video. Therefore, in some embodiments, the method may further include:
- when the video image is a high bit depth video, the step of determining the maximum unit size information of the current block based on the texture information of the video image is performed.
- the determination of whether a video image is a high bit depth video may be represented by the identification information of the video image.
- determining the identification information of the video image may include:
- if the video image is a high bit depth video, the value of the identification information of the video image is determined to be the first value; or,
- if the video image is not a high bit depth video, the value of the identification information of the video image is determined to be the second value.
- the first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers.
- the identification information of the video image is a parameter written in the profile (profile), but the identification information of the video image may also be a flag (flag), which is not limited here.
- in a specific example, the first value may be set to 1 and the second value to 0; in another specific example, the first value may be set to true and the second value to false; in yet another specific example, the first value may be set to 0 and the second value to 1; or the first value may be set to false and the second value to true.
- the first value and the second value in this embodiment of the present application are not limited in any way.
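The mapping between the high-bit-depth decision and the identification information can be sketched as follows. The function names are hypothetical, and the defaults (first value 1, second value 0) are only one of the example assignments mentioned above; how the value is actually written to or read from the code stream is not modeled here.

```python
def identification_value(high_bit_depth, first_value=1, second_value=0):
    """Encoder side: choose the value that signals high bit depth or not."""
    return first_value if high_bit_depth else second_value


def parse_identification(value, first_value=1):
    """Decoder side: recover the decision from the signaled value."""
    return value == first_value
```

Because the first and second values only need to differ, the same pair of functions covers the inverted assignment, for example `identification_value(True, first_value=0, second_value=1)`.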
- the embodiments of the present application provide an encoding method, and specifically a block division method. More specifically, the embodiments of the present application design an adaptive maximum unit size mechanism based on image texture for high-bit-depth video, so that the subsequent computation of the prediction and transform processes for large-size blocks can be skipped directly.
- the method may further include: encoding the identification information of the video image, and writing the encoded bits into the code stream.
- the decoder can directly determine whether the video image is a high-bit-depth video by parsing the code stream, so as to facilitate the decoder to perform subsequent operations.
- the maximum unit size information may be the BT maximum unit size (represented by maxBtSize), used to limit binary tree division, or the TT maximum unit size (represented by maxTtSize), used to limit ternary tree division.
- the maximum unit size information may be represented by maxBtSize or maxTtSize.
- its level may be at least one of the following: sequence level and image level.
- a video sequence may be input, and then the initial frame is used as a video image, and the maximum unit size information corresponding to the initial frame is determined accordingly.
- if the maximum unit size information is at the sequence level, the entire video sequence can use this maximum unit size information.
- if the maximum unit size information is at the picture level (also called the "frame level"), the maximum unit size information corresponding to each frame of the video sequence may be different; that is, each frame is used as a video image to determine its own maximum unit size information. It should also be noted that for different blocks in the same frame, the maximum unit size information is the same.
- the maximum unit size information can be further determined; that is, the adaptive maximum unit size mechanism based on the image texture is realized.
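The difference between sequence-level and picture-level maximum unit size information can be sketched as follows. The `estimate` argument stands in for the texture analysis, which is not detailed in this passage; it is a hypothetical callback, as are the function name and the `level` strings.

```python
def assign_max_unit_size(frames, estimate, level="sequence"):
    """Return one maximum unit size per frame.

    `estimate` is a hypothetical texture-analysis callback that maps a
    frame to a maximum unit size; its internals are not specified here.
    """
    if level == "sequence":
        # The initial frame decides the value used by the whole sequence.
        size = estimate(frames[0])
        return [size] * len(frames)
    if level == "picture":
        # Each frame is analysed on its own; all blocks in a frame share it.
        return [estimate(frame) for frame in frames]
    raise ValueError("level must be 'sequence' or 'picture'")
```

At the sequence level the analysis runs once, which is cheaper; at the picture level it runs per frame, which lets the limit follow texture changes across the sequence.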
- S502 Preprocess the current block according to the maximum unit size information to determine the division mode of the current block.
- the current block can be preprocessed according to the maximum unit size information, such as calculating the rate-distortion cost value under different division modes, and then selecting the optimal rate-distortion cost value (or called “minimum rate-distortion cost”) to determine the partition mode of the current block.
- the division mode may be determined by calculating the rate-distortion cost.
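One way the maximum unit size information could prune the candidate division modes before any rate-distortion cost is computed is sketched below. The rule used here, comparing the longer side of the block against maxBtSize and maxTtSize, is an assumption for illustration; the mode labels and the function name are hypothetical.

```python
BT_MODES = ("BT_VER", "BT_HOR")
TT_MODES = ("TT_VER", "TT_HOR")


def allowed_mtt_modes(width, height, max_bt_size, max_tt_size):
    """List the multi-type tree modes still worth evaluating for a block.

    Assumption: a mode family is allowed only when the block's longer
    side does not exceed the corresponding maximum unit size.
    """
    longest = max(width, height)
    modes = []
    if longest <= max_bt_size:
        modes.extend(BT_MODES)
    if longest <= max_tt_size:
        modes.extend(TT_MODES)
    return modes
```

Under this assumed rule, a 128x128 block with `max_bt_size=64` and `max_tt_size=32` has no multi-type tree candidates at all, so the prediction and cost computation for those modes at that size is skipped entirely.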
- the current block is preprocessed according to the maximum unit size information, and the division mode of the current block is determined, which may include:
- the division mode of the current block is determined according to the comparison result of the first rate-distortion cost value and the second rate-distortion cost value.
- determining the division mode of the current block according to the comparison result of the first rate-distortion cost value and the second rate-distortion cost value may include:
- if the second rate-distortion cost value is less than the first rate-distortion cost value, use a preset division mode to divide the second node sub-block to obtain at least one next-level second node sub-block, and calculate the third rate-distortion cost value;
- the division mode of the current block is determined according to the second rate-distortion cost value and the third rate-distortion cost value.
- determining the division mode of the current block according to the second rate-distortion cost value and the third rate-distortion cost value may include:
- if the third rate-distortion cost value is less than the second rate-distortion cost value,
- update the second rate-distortion cost value with the third rate-distortion cost value, and
- return to the step of dividing the second node sub-block by using the preset division mode to obtain at least one next-level second node sub-block and calculating the third rate-distortion cost value, until the minimum rate-distortion cost value is determined;
- the division mode of the current block is determined according to the division mode corresponding to the minimum rate-distortion cost value.
- the first node sub-block may be a node sub-block obtained by dividing the current block for the first time;
- the second node sub-block may be a node sub-block obtained by continuing the division based on a preset division mode; equivalently, the node sub-blocks obtained by step-by-step division from the second division onward can be collectively referred to as second node sub-blocks.
- for the i-th level, the second rate-distortion cost value represents the rate-distortion cost of not dividing the i-th level further; by continuing to divide the node sub-blocks of the current level with the preset division mode, the node sub-blocks of the (i+1)-th level can be obtained.
- the third rate-distortion cost value can then be calculated; it represents the rate-distortion cost of continuing to divide the i-th level. The second rate-distortion cost value and the third rate-distortion cost value are then compared.
- if the third rate-distortion cost value is less than the second rate-distortion cost value, then for the (i+1)-th level the third rate-distortion cost value is regarded as the second rate-distortion cost value, that is, the cost of not dividing the (i+1)-th level further; by continuing to divide the node sub-blocks of the current level with the preset division mode, the node sub-blocks of the (i+2)-th level can be obtained and a new third rate-distortion cost value calculated.
- this new third rate-distortion cost value represents the rate-distortion cost of continuing to divide the (i+1)-th level, and the comparison between the second rate-distortion cost value and the third rate-distortion cost value is performed again.
- in this way, the node sub-blocks of the current level are again divided into next-level node sub-blocks and the rate-distortion cost comparison continues recursively until the minimum rate-distortion cost value is determined; the division mode corresponding to the minimum rate-distortion cost value is then determined as the division mode of the current block.
- the preset division mode may include a quad-tree division mode and/or a multi-type tree division mode, wherein the multi-type tree division mode may include at least one of the following: a vertical binary tree division mode, a horizontal binary tree division mode, a vertical ternary tree division mode, and a horizontal ternary tree division mode.
- the vertical binary tree division mode and the horizontal binary tree division mode may be collectively referred to as a binary tree division mode
- the vertical ternary tree division mode and the horizontal ternary tree division mode may be collectively referred to as a ternary tree division mode.
- for "using the preset division mode to divide the first node sub-block to obtain at least one second node sub-block", specifically: if the first node sub-block is divided by using the quad-tree division mode, four second node sub-blocks can be obtained; if the first node sub-block is divided by using the binary tree division mode, two second node sub-blocks can be obtained; if the first node sub-block is divided by using the ternary tree division mode, three second node sub-blocks can be obtained.
- the sub-block of the first node is divided by a preset division mode, which may be a quad-tree division mode, a vertical binary tree division mode, a horizontal binary tree division mode, a vertical ternary tree division mode, a horizontal ternary tree division mode, or the like.
- the first node sub-block is divided by each division mode, and a rate-distortion cost value is calculated for each; the minimum of the calculated rate-distortion cost values is then selected as the third rate-distortion cost value. When the third rate-distortion cost value is less than the second rate-distortion cost value, the obtained second node sub-blocks continue to be divided, and the recursion continues until the minimum rate-distortion cost value is determined.
- the division mode corresponding to the minimum rate-distortion cost value is used to determine the division mode of the current block.
- the method may further include: when the second rate-distortion cost value is greater than or equal to the first rate-distortion cost value, directly determining the mode of dividing the current block according to the maximum unit size information as the division mode of the current block.
- if the second rate-distortion cost value is greater than or equal to the first rate-distortion cost value, the rate-distortion cost is smallest when the node sub-blocks are not divided into the next level; the mode of dividing the current block according to the maximum unit size information can then be directly determined as the division mode of the current block, and it is no longer necessary to continue dividing the obtained node sub-blocks.
- the current block can be preprocessed according to the maximum unit size information, and then the division mode of the current block can be determined, so as to realize the block division operation of the current block.
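The level-by-level comparison described above can be sketched as follows. This is a simplified illustration rather than the encoder's actual search: the function names, the `(x, y, width, height)` block representation, the `min_size` limit, and the caller-supplied `cost_fn` (standing in for a real rate-distortion cost) are all assumptions made for the example.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height), hypothetical representation
MODES = ("QT", "BT_V", "BT_H", "TT_V", "TT_H")


def split(block: Block, mode: str) -> List[Block]:
    """Return the node sub-blocks produced by one division mode."""
    x, y, w, h = block
    if mode == "QT":  # quad-tree: four quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_V":  # vertical binary tree: left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "BT_H":  # horizontal binary tree: top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "TT_V":  # vertical ternary tree: 1/4, 1/2, 1/4 of the width
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_H":  # horizontal ternary tree: 1/4, 1/2, 1/4 of the height
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown mode: {mode}")


def choose_leaves(block: Block, cost_fn: Callable[[Block], float],
                  min_size: int = 4) -> List[Block]:
    """Keep dividing while the best divided cost beats the undivided cost."""
    keep_cost = cost_fn(block)  # cost of not dividing this level further
    best_cost, best_children = None, None
    for mode in MODES:
        children = split(block, mode)
        if any(cw < min_size or ch < min_size for _, _, cw, ch in children):
            continue  # this mode would produce sub-blocks that are too small
        cost = sum(cost_fn(child) for child in children)
        if best_cost is None or cost < best_cost:
            best_cost, best_children = cost, children
    if best_children is None or best_cost >= keep_cost:
        return [block]  # dividing does not reduce the cost: stop here
    leaves: List[Block] = []
    for child in best_children:
        leaves.extend(choose_leaves(child, cost_fn, min_size))
    return leaves
```

With a toy cost that grows with the square of the block area, a 16×16 block is divided by quad-tree down to sixteen 4×4 leaves; with a constant cost per block, no division is performed at all.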
- S503 Determine block division parameters of the current block according to the division mode.
- S504 Encode the current block according to the block division parameter.
- the division mode is a specific block division manner, and the block division parameter here may be identification information indicating block division, such as split_cu_flag[x0][y0].
- the current block may be encoded according to the block division parameters.
- the encoding of the current block according to the block division parameter may include:
- the block division parameters are encoded, and the encoded bits are written into the code stream.
- the encoder needs to encode the block division parameters and write them into the code stream, which is then transmitted from the encoder to the decoder.
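As a rough illustration of writing block division parameters into a code stream and reading them back, the sketch below uses a toy scheme: one flag per node in pre-order, 1 for "divided by quad-tree" and 0 for "not divided". This is not the actual syntax behind split_cu_flag[x0][y0], which involves further syntax elements and entropy coding; the tree representation and function names are assumptions for the example.

```python
from typing import List, Optional, Tuple

# Hypothetical tree representation: None is an undivided node sub-block,
# a list of four sub-trees is a quad-tree division.
Tree = Optional[list]


def write_flags(tree: Tree, bits: List[int]) -> None:
    """Encoder side: append one split flag per node, in pre-order."""
    if tree is None:
        bits.append(0)
        return
    bits.append(1)
    for child in tree:
        write_flags(child, bits)


def read_flags(bits: List[int], pos: int = 0) -> Tuple[Tree, int]:
    """Decoder side: rebuild the division tree; returns (tree, next position)."""
    flag = bits[pos]
    pos += 1
    if flag == 0:
        return None, pos
    children = []
    for _ in range(4):
        child, pos = read_flags(bits, pos)
        children.append(child)
    return children, pos
```

For example, a block whose second quadrant is divided once more is written as `[1, 0, 1, 0, 0, 0, 0, 0, 0]`, and `read_flags` recovers the same tree from those nine flags.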
- the encoding of the current block according to the block division parameter may include:
- the prediction parameters of each node sub-block are sequentially determined
- the prediction parameters and residual values of the node sub-blocks are encoded, and the encoded bits are written into the code stream.
- the preset processing order of node sub-blocks may be the preset scanning order.
- the preset scanning sequence may be diagonal, Zigzag, horizontal, vertical, 4×4 sub-block scanning, or any other raster scanning sequence, which is not limited in this embodiment of the present application.
- the residual value can be determined.
- the residual value can be transformed, quantized and entropy encoded, and the prediction parameters of the node sub-blocks can be encoded, and then written into the code stream to be transmitted from the encoder to the decoder.
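The residual step can be illustrated with a minimal sketch. It assumes blocks are lists of rows of integer samples, skips the transform and entropy-coding stages entirely, and uses a plain uniform quantizer with a hypothetical step size `step`; none of these details are specified in the text above.

```python
from typing import List

Samples = List[List[int]]  # rows of sample values, hypothetical representation


def residual(original: Samples, predicted: Samples) -> Samples:
    """Residual value = original value minus predicted value, per sample."""
    return [[o - p for o, p in zip(row_o, row_p)]
            for row_o, row_p in zip(original, predicted)]


def quantize(res: Samples, step: int) -> Samples:
    """Toy uniform quantizer applied directly to the residual (no transform)."""
    return [[round(v / step) for v in row] for row in res]
```

A real encoder would transform the residual before quantizing it and would then entropy-encode the quantized coefficients together with the prediction parameters.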
- the embodiment of the present application may also provide a code stream, where the code stream is generated by bit encoding according to relevant parameters.
- the relevant parameters may include at least one of the following: a block division parameter, a prediction parameter of a node sub-block, a residual value, and identification information of a video image.
- This embodiment provides a block division method, which is applied to an encoder. Based on the texture information of the video image, the maximum unit size information of the current block is determined; the current block is preprocessed according to the maximum unit size information, and the division mode of the current block is determined; the block division parameters of the current block are determined according to the division mode; and the current block is encoded according to the block division parameters.
- the maximum unit size information of the current block is determined according to the texture information of the video image; that is, the technical solution of the present application designs an image-texture-adaptive mechanism for the maximum unit size, which can directly skip the calculation of the prediction and transformation processes for large-size blocks.
- this leads to an exponential decrease in the total number of recursions of block division, thus significantly reducing the coding complexity and the coding time while keeping the performance gain basically unchanged, thereby improving the coding and decoding efficiency.
- FIG. 6 shows a schematic flowchart of determining the maximum unit size information provided by an embodiment of the present application. As shown in Figure 6, the process may include:
- S601 Divide the video image into blocks to obtain N blocks of a preset size; wherein, N is an integer greater than zero, and the N blocks do not overlap each other.
- the preset size refers to a preset block size value.
- the preset size can be any one of 8, 16, 32, 64, etc., or any one of 8×8, 16×16, 32×32, 64×64, etc., which is not specifically limited in the embodiments of the present application.
- the preset size may be 64×64; in this case, the video image may be divided into N non-overlapping 64×64 blocks.
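Step S601 can be sketched as a simple tiling of the image. The text does not say how image borders that are not a multiple of the preset size are handled; the sketch below assumes that only complete blocks are kept, and the function name is hypothetical.

```python
from typing import List, Tuple


def tile_origins(width: int, height: int, size: int = 64) -> List[Tuple[int, int]]:
    """Top-left corners of the non-overlapping size x size blocks of an image.

    Assumption: partial blocks at the right and bottom borders are dropped.
    """
    return [(x, y)
            for y in range(0, height - size + 1, size)
            for x in range(0, width - size + 1, size)]
```

Under this assumption a 1920×1080 image gives N = 30 × 16 = 480 blocks of 64×64.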
- S602 Perform texture analysis on the N blocks to determine a first quantity; where the first quantity represents the quantity of blocks whose texture values are smaller than a first threshold in the N blocks.
- performing texture analysis on N blocks to determine the first number may include:
- the number of blocks whose texture value is less than the first threshold is counted to obtain the first number.
- the method may further include:
- the obtained statistical value is determined as the first number; wherein, i is an integer greater than or equal to zero.
- the first number represents the number of blocks with low texture complexity among the N blocks.
- the level of texture complexity can be measured by the first threshold. If the texture value is greater than or equal to the first threshold, it indicates that the texture of the block is relatively complex; if the texture value is less than the first threshold, it indicates that the texture of the block is relatively simple (i.e., low texture complexity).
- T the first threshold
- bitdepth-8 the value of the first threshold
- S603 Determine maximum unit size information of the current block according to the comparison result between the first number and the second threshold.
- a second threshold may also be set for determining the maximum unit size information of the current block.
- determining the maximum unit size information of the current block according to the comparison result between the first number and the second threshold may include:
- the maximum unit size information of the current block is determined to be the first size value.
- the method may further include:
- the maximum unit size information of the current block is determined to be the second size value.
- the method may further include: if the ratio is greater than or equal to a third threshold, determining the maximum unit size information of the current block as a default value.
- the first threshold is different from the second threshold and the third threshold, and the second threshold is smaller than the third threshold.
- the second threshold may be represented by c1
- the third threshold may be represented by c2.
- the value of c1 is equal to 0.15
- the value of c2 is equal to 0.3, but the embodiment of the present application does not specifically limit it.
- the first size value is different from the second size value.
- the first size value may be 8, and the second size value may be 16, but the embodiment of the present application does not specifically limit it.
- the ratio can be represented by j/N, the first size value is 8, and the second size value is 16. In this way, if j/N < c1, then the maximum unit size information is 8; if c1 ≤ j/N < c2, then the maximum unit size information is 16; if j/N ≥ c2, then the maximum unit size information is the default value.
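The three cases can be written as a single piecewise definition. The symbol S for the maximum unit size information and S_default for the default value are notation introduced here for illustration; the numeric thresholds are the example values c1 = 0.15 and c2 = 0.3 given above.

```latex
S =
\begin{cases}
8,                    & j/N < c_1, \\
16,                   & c_1 \le j/N < c_2, \\
S_{\mathrm{default}}, & j/N \ge c_2,
\end{cases}
\qquad c_1 = 0.15,\; c_2 = 0.3 .
```

As a worked example with N = 480 blocks: j = 60 gives j/N = 0.125 < c1, so S = 8; j = 100 gives j/N ≈ 0.208, which lies between c1 and c2, so S = 16; j = 200 gives j/N ≈ 0.417 ≥ c2, so S takes the default value.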
- the calculation of the texture value may be determined according to variance calculation, or may be determined according to other methods, such as a method of summing the absolute values of the horizontal gradient and the vertical gradient.
- the calculating texture values of N blocks may include:
- the calculating the texture values of the N blocks may include:
- the absolute value of the horizontal gradient and the absolute value of the vertical gradient are summed to obtain the texture value of the kth block.
- the value of k is an integer greater than or equal to zero and less than N.
- the calculated variance value can be determined as the texture value, or the sum of the absolute value of the horizontal gradient and the absolute value of the vertical gradient can be calculated and determined as the texture value; no limitation is imposed here.
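Both texture measures can be sketched in a few lines. The exact definitions are not given above, so the sketch makes two assumptions: the variance is the population variance of the samples in a block, and the gradients are simple differences between neighbouring samples. Function names are hypothetical.

```python
from typing import List

Samples = List[List[int]]  # rows of sample values for one block


def variance(block: Samples) -> float:
    """Population variance of all samples in the block (assumed definition)."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def gradient_sum(block: Samples) -> int:
    """Sum of absolute horizontal and vertical neighbour differences
    (assumed gradient operator)."""
    horizontal = sum(abs(row[x + 1] - row[x])
                     for row in block for x in range(len(row) - 1))
    vertical = sum(abs(block[y + 1][x] - block[y][x])
                   for y in range(len(block) - 1)
                   for x in range(len(block[0])))
    return horizontal + vertical
```

A completely flat block yields zero under both measures, so it always counts as low texture for any positive first threshold.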
- its level may be at least one of the following: sequence level and image level.
- a video sequence may be input, and then the initial frame is used as a video image, and the maximum unit size information corresponding to the initial frame is determined accordingly.
- if the maximum unit size information is at the sequence level, the entire video sequence can use this maximum unit size information.
- if the maximum unit size information is at the image level, then for the entire video sequence each frame is taken as the video image in turn, and the maximum unit size information corresponding to each frame is determined separately. It should also be noted that different blocks in the same frame share the same maximum unit size information.
- FIG. 7 shows a detailed schematic flow chart of determining maximum unit size information provided by an embodiment of the present application. As shown in Figure 7, the detailed process may include:
- S702 Divide the initial frame into N non-overlapping 64×64 blocks.
- T represents the first threshold described in the embodiment of the present application
- c1 represents the second threshold described in the embodiment of the present application
- c2 represents the third threshold described in the embodiment of the present application.
- T 4 ⁇ 2 ⁇ (bitdepth-8)
- c1 0.15
- c2 0.3.
- the maximum unit size information may be the BT maximum unit size (represented by maxBtSize) used to limit the binary tree division, or the TT maximum unit size (represented by maxTtSize) used to limit the ternary tree division.
- the maximum unit size information can be represented by maxBtSize or maxTtSize; but under the same conditions, maxBtSize and maxTtSize have the same values.
- maxBtSize and maxTtSize shown in FIG. 7 may be at the sequence level. That is, for the video sequence, maxBtSize and maxTtSize can be determined only according to the initial frame, and the determined maxBtSize and maxTtSize can be used for the entire video sequence.
- maxBtSize and maxTtSize can also be modified from sequence level to image level, that is, maxBtSize and maxTtSize can be calculated according to the method shown in FIG. 7 for each frame of video image as block size constraints of the current frame.
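The difference between the two levels can be sketched as follows. The function name, the `level` strings, and the caller-supplied `estimate` callable (standing in for the flow of FIG. 7 applied to one frame) are assumptions for illustration.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")


def sizes_per_frame(frames: Sequence[Frame], level: str,
                    estimate: Callable[[Frame], int]) -> List[int]:
    """Return the maximum unit size used for each frame of the sequence."""
    if level == "sequence":
        # Determined once from the initial frame and reused for every frame.
        size = estimate(frames[0])
        return [size] * len(frames)
    if level == "image":
        # Determined separately for every frame.
        return [estimate(frame) for frame in frames]
    raise ValueError(f"unknown level: {level}")
```

The sequence-level variant runs the texture analysis once, while the image-level variant repeats it for every frame and can follow changes in texture along the sequence.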
- i represents the execution order of variance calculation for each of the N blocks and whether the variance is less than T
- j represents the cumulative value of the number of blocks with variance less than T among the N blocks.
- for step S704, if the judgment result is yes, it means that the variance of the i-th block is less than T, and steps S705 and S706 are executed, that is, both j and i are incremented by 1; if the judgment result is no, it means that the variance of the i-th block is greater than or equal to T, and only step S706 is executed, that is, j is left unchanged and only i is incremented by 1.
- for step S707, if the judgment result is no, it means that not all of the N blocks have been processed, and the flow returns to step S703 to continue with the next block (for example, calculating the variance of the i-th block and then judging whether the variance of the i-th block is less than T); if the judgment result is yes, it means that all N blocks have been processed, and step S708 is executed, that is, after j is obtained, the ratio j/N of j to N is determined and j/N is compared with c1.
- for step S708, if the judgment result is yes, it means that j/N is less than c1, and step S710 is executed, that is, the maximum unit size information of the current block is determined to be 8; if the judgment result is no, it means that j/N is greater than or equal to c1, and step S709 is executed, in which j/N is further compared with c2.
- for step S709, if the judgment result is yes, it means that j/N is less than c2, and step S711 is executed, that is, the maximum unit size information of the current block is determined to be 16; if the judgment result is no, it means that j/N is greater than or equal to c2, and step S712 is executed, that is, the maximum unit size information of the current block is determined as the default value (including the default maxBtSize and the default maxTtSize).
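The flow of steps S703 to S712 can be sketched as one function. It is an illustration under assumptions: the per-block variances are taken as already computed, the first threshold T and the default size are passed in by the caller rather than derived here, and the function and parameter names are hypothetical.

```python
from typing import Sequence


def max_unit_size(variances: Sequence[float], T: float, default_size: int,
                  c1: float = 0.15, c2: float = 0.3) -> int:
    """Return the value used for both maxBtSize and maxTtSize."""
    n = len(variances)
    if n == 0:
        raise ValueError("at least one block is required")
    i = 0  # index of the block being examined
    j = 0  # running count of blocks whose variance is less than T
    while i < n:              # S707: repeat until all N blocks are processed
        if variances[i] < T:  # S703/S704: variance of the i-th block vs. T
            j += 1            # S705
        i += 1                # S706
    ratio = j / n
    if ratio < c1:            # S708 -> S710
        return 8
    if ratio < c2:            # S709 -> S711
        return 16
    return default_size       # S712
```

For example, with ten blocks and T = 10, one low-variance block gives j/N = 0.1 and a size of 8, two give 0.2 and a size of 16, and five give 0.5 and the default size.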
- the embodiments of the present application provide a high-bit-depth video fast division technology based on texture analysis.
- the technology first performs texture analysis on the high-bit-depth video, and then designs an adaptive maximum size mechanism for multi-type tree units based on this.
- when the texture of the video image is complex, the maximum size of the multi-type tree unit is smaller; and when there is a large flat area in the video image, the maximum size of the multi-type tree unit is larger. See Figure 7 for details.
- if j/N < c1, then maxBtSize is 8 and maxTtSize is 8; if c1 ≤ j/N < c2, then maxBtSize is 16 and maxTtSize is 16; if j/N ≥ c2, then maxBtSize is the default maxBtSize, and maxTtSize is the default maxTtSize.
- an implementation method of an HBD sequence encoder using QTMT fast block division is as follows: First, input the first frame of the video sequence, and determine the BT maximum unit size maxBtSize and the TT maximum unit size maxTtSize according to the flow shown in FIG. 7.
- the input image is divided into non-overlapping CTU blocks.
- each CTU is processed in turn according to the raster scan order, and the CTU is divided into several CUs, which mainly includes the following four steps: (1) Divide the CTU into several non-overlapping CUs of size maxBtSize×maxBtSize (or maxTtSize×maxTtSize), and calculate the rate-distortion cost value RdCost0 of predictive coding at this time. (2) Divide and predict the CU according to the QT mode or the MT mode, and calculate the rate-distortion cost value RdCost1. (3) Compare RdCost0 and RdCost1; if RdCost1 is smaller, continue to process each sub-CU in turn.
- each CU is divided and predicted according to the QT mode or the MT mode, the rate-distortion cost value of each division mode is calculated and compared with the current best RdCost, and the recursion continues until the minimum rate-distortion cost value is determined.
- this yields a block division manner, that is, the division mode of the current block.
- the residual block is calculated according to the division mode, and the residual block is transformed, quantized, and entropy encoded, and the relevant information such as block division parameters is encoded, and the output code stream is waiting for transmission.
- the block division process of the binary tree and ternary tree in MT is adaptively pruned: for video sequences with extremely complex textures, MT block division modes larger than 8×8 are pruned; for video sequences with more complex textures, block division modes larger than 16×16 are pruned.
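The pre-division of a CTU and the pruning of large MT divisions can be sketched as follows. The CTU size of 128 used in the usage note, the function names, and the `(x, y, width, height)` representation are assumptions; the check simply treats a binary or ternary division as allowed only when the block fits within the maximum unit size.

```python
from typing import List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)


def presplit_ctu(ctu_size: int, max_mt_size: int) -> List[Block]:
    """Divide a square CTU into non-overlapping CUs of max_mt_size x max_mt_size."""
    step = min(ctu_size, max_mt_size)
    return [(x, y, step, step)
            for y in range(0, ctu_size, step)
            for x in range(0, ctu_size, step)]


def mt_split_allowed(width: int, height: int, max_mt_size: int) -> bool:
    """MT (binary/ternary) division is skipped for blocks above the limit."""
    return width <= max_mt_size and height <= max_mt_size
```

With a hypothetical 128×128 CTU, a limit of 16 yields 64 starting CUs and a limit of 8 yields 256, so the rate-distortion search never evaluates MT divisions on the larger block sizes.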
- the test is carried out in the HBD test sequence required by JVET under the All Intra condition, and the average change of BD-rate on the Y, Cb, and Cr components are 0.20%, 0.29%, and 0.28%, respectively, and the encoding time is reduced by an average of 56%. This data shows that this technology can save more than half of the encoding time with almost negligible loss of performance gain.
- This embodiment provides a block division method, which is applied to an encoder.
- the technical solution of the present application can significantly reduce the coding complexity while maintaining the coding performance substantially equivalent to that of the prior art.
- FIG. 8 shows a schematic flowchart of another block division method provided by the embodiment of the present application. As shown in Figure 8, the method may include:
- S801 Parse the code stream, and determine the block division parameters of the current block.
- the block division method in this embodiment of the present application is applied to a decoder.
- the video image can be divided into multiple image blocks, wherein each image block to be decoded can be called a decoding block, and the current block here specifically refers to the decoding block currently to be decoded; after decoding is complete, the video can be played.
- the embodiments of the present application mainly provide a fast block division technology for high bit depth video based on texture analysis, that is, applied to high bit depth video.
- whether the video image is a high bit depth video can be determined by using the identification information of the video image.
- the method may further include:
- if the value of the identification information of the video image is the first value, it is determined that the identification information of the video image indicates that the video image is a high bit depth video; or,
- if the value of the identification information of the video image is the second value, it is determined that the identification information of the video image indicates that the video image is a non-high bit depth video.
- the first value and the second value are different, and the first value and the second value may be in the form of parameters or in the form of numbers.
- the identification information of the video image is a parameter written in the profile (profile), but the identification information of the video image may also be a flag (flag), which is not limited here.
- the embodiments of the present application provide a decoding method, and specifically provide a block division method. More specifically, the embodiments of the present application design an image texture-based adaptive maximum unit size mechanism for high-bit-depth video. In this way, when the encoder determines that the video image is a high-bit-depth video, the identification information of the video image can be written into the code stream, so that the decoder can directly determine whether the video image is a high-bit-depth video by parsing the code stream.
- the first value can be set to 1, and the second value can be set to 0;
- the first value can also be set to true, and the second value can also be set to false; in another specific example, the first value can also be set to 0, and the second value can also be set to 1; alternatively, the first value can also be set to false, and the second value can also be set to true.
- the first value and the second value in this embodiment of the present application are not limited in any way.
- when the decoder parses the code stream, if the value of the identification information of the video image is 1, it can be determined that the video image is a high-bit-depth video, that is, the encoder uses the block division method described in the embodiments of the present application, which can save coding time and significantly reduce coding complexity. Otherwise, if the value of the identification information of the video image is 0, it can be determined that the video image is a non-high-bit-depth video, that is, the encoder does not use the block division method described in the embodiments of the present application and, for example, performs block division according to the related art.
- the method may further include:
- a division tree of the current block is determined, wherein the division tree includes one or more node sub-blocks obtained by dividing the current block.
- the division mode of the current block can be determined, and then the division tree of the current block can be determined, so as to sequentially process each node sub-block of the division tree according to the preset processing order of node sub-blocks.
- the division mode of the current block is determined according to the block division parameter, and the division mode is associated with the texture information of the video image. That is, in the encoder, the division mode is determined by determining the maximum unit size information of the current block according to the texture information of the video image, and then preprocessing the current block according to the maximum unit size information;
- the mechanism of adaptive image texture is designed, which can directly skip the calculation of the prediction and transformation process of large-size blocks, thereby reducing the encoding time and significantly reducing the encoding complexity while keeping the performance gain basically unchanged.
- S802 Based on the block division parameter, parse the code stream to determine the predicted value of the current block.
- the step of parsing the code stream based on the block division parameter to determine the predicted value of the current block may include:
- the code stream of each node sub-block of the partition tree is sequentially parsed, and the prediction mode of each node sub-block is determined;
- the prediction value of each node sub-block is determined according to the prediction mode.
- the preset processing order of node sub-blocks may be the preset scanning order.
- the preset scanning sequence may be diagonal, Zigzag, horizontal, vertical, 4×4 sub-block scanning, or any other raster scanning sequence, which is not limited in this embodiment of the present application.
- the code stream of each node sub-block of the partition tree can be sequentially parsed according to the preset scanning order to obtain the prediction mode of each node sub-block, and the prediction value of each node sub-block is then determined.
- S803 Based on the block division parameters, parse the code stream to determine the residual value of the current block.
- the step of parsing the code stream based on the block division parameter to determine the residual value of the current block may include:
- the code stream of each node sub-block of the partition tree is sequentially parsed according to the preset node sub-block processing order, and the residual value of each node sub-block is determined.
- the preset processing order of node sub-blocks may be the preset scanning order. That is to say, the embodiment of the present application may sequentially parse the code stream of each node sub-block of the partition tree according to the preset scanning order, and then determine the residual value of each node sub-block.
- S804 Determine the reconstruction value of the current block based on the predicted value and the residual value.
- the determining the reconstructed value of the current block based on the predicted value and the residual value may include: adding the predicted value and the residual value to determine the reconstructed value of the current block.
- the predicted value of the current block can be obtained by parsing the code stream, and the residual value of the current block can also be obtained by parsing the code stream; in this way, the reconstruction value of the current block is determined by adding the predicted value and the residual value.
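The addition of predicted and residual values, applied CU by CU, can be sketched as below. The tuple layout `(x, y, predicted, residual)` for each decoded CU and the function name are assumptions for illustration; any clipping of the result to the valid sample range is omitted because it is not described above.

```python
from typing import List, Sequence, Tuple

Samples = List[List[int]]
DecodedCu = Tuple[int, int, Samples, Samples]  # (x, y, predicted, residual)


def reconstruct_picture(width: int, height: int,
                        cus: Sequence[DecodedCu]) -> Samples:
    """Write predicted + residual into the picture for each CU in turn."""
    picture = [[0] * width for _ in range(height)]
    for x, y, predicted, residual in cus:
        for dy, (row_p, row_r) in enumerate(zip(predicted, residual)):
            for dx, (p, r) in enumerate(zip(row_p, row_r)):
                picture[y + dy][x + dx] = p + r  # reconstruction value
    return picture
```

The CUs are expected in the preset processing order of the partition tree, although for non-overlapping CUs the result does not depend on that order.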
- an implementation method of an HBD sequence decoder using QTMT fast block division is as follows: First, entropy decoding, inverse quantization, and inverse transformation are performed on the input code stream to obtain the residual block. Next, the image is reconstructed according to the residual block; the reconstruction process mainly includes the following three steps: (1) Determine the partition tree of the current CTU according to relevant information such as the block division parameters. (2) Process each CU of the partition tree in turn according to the raster scan order, and use information such as motion vectors to find the prediction block. (3) Superimpose the residual value and the predicted value of the current CU to obtain the reconstructed CU. Finally, the reconstructed image is sent to the DBF/SAO/ALF filter, and the filtered image is sent to the buffer area, waiting for the video to play.
- This embodiment provides a block division method, which is applied to a decoder.
- the block division parameters of the current block are determined; based on the block division parameters, the code stream is parsed to determine the predicted value of the current block; based on the block division parameters, the code stream is parsed to determine the residual value of the current block; and the reconstructed value of the current block is determined based on the predicted value and the residual value.
- the maximum unit size information of the current block is determined according to the texture information of the video image; that is, the technical solution of the present application designs an image-texture-adaptive mechanism for the maximum unit size, which can directly skip the calculation of the prediction and transformation processes for large-size blocks.
- this leads to an exponential decrease in the total number of recursions of block division, thus significantly reducing the coding complexity and the coding time while keeping the performance gain basically unchanged, thereby improving the coding and decoding efficiency.
- FIG. 9 shows a schematic structural diagram of an encoder 90 provided by an embodiment of the present application.
- the encoder 90 may include: a first determining unit 901, a block dividing unit 902 and an encoding unit 903; wherein,
- the first determining unit 901 is configured to determine the maximum unit size information of the current block based on the texture information of the video image;
- the block division unit 902 is configured to preprocess the current block according to the maximum unit size information to determine the division mode of the current block; and according to the division mode, determine the block division parameter of the current block;
- the encoding unit 903 is configured to encode the current block according to the block division parameter.
- the encoding unit 903 is further configured to encode the block division parameter, and write the encoded bits into the code stream.
- the block division unit 902 is further configured to divide the current block into one or more node sub-blocks according to the block division parameter;
- the first determining unit 901 is further configured to sequentially determine the prediction parameter of each node sub-block according to the preset node sub-block processing order; determine the predicted value of the node sub-block according to the prediction parameter; and determine the residual value of the node sub-block according to the original value and the predicted value of the node sub-block;
- the encoding unit 903 is further configured to encode the prediction parameter and the residual value of the node sub-block, and write the encoded bits into the code stream.
- the block dividing unit 902 is further configured to use the maximum unit size information to divide the current block, obtain at least one first node sub-block, and calculate the first rate-distortion cost value; and use a preset division mode to The first node sub-block is divided to obtain at least one second node sub-block, and the second rate-distortion cost value is calculated;
- the first determining unit 901 is further configured to determine the division mode of the current block according to the comparison result of the first rate-distortion cost value and the second rate-distortion cost value.
- the block dividing unit 902 is further configured to compare the first rate-distortion cost value with the second rate-distortion cost value; and in the case that the second rate-distortion cost value is less than the first rate-distortion cost value, Use the preset division mode to divide the second node sub-block to obtain at least one next-level second node sub-block, and calculate the third rate-distortion cost value;
- the first determining unit 901 is further configured to determine the division mode of the current block according to the second rate-distortion cost value and the third rate-distortion cost value.
- the block dividing unit 902 is further configured to, when the third rate-distortion cost value is smaller than the second rate-distortion cost value, update the second rate-distortion cost value with the third rate-distortion cost value, and return to the step of dividing the second node sub-block by using the preset division mode to obtain at least one next-level second node sub-block and calculating the third rate-distortion cost value, until the minimum rate-distortion cost value is determined;
- the first determining unit 901 is further configured to determine the division mode of the current block according to the division mode corresponding to the minimum rate-distortion cost value.
- the first determining unit 901 is further configured to, when the second rate-distortion cost value is greater than or equal to the first rate-distortion cost value, directly determine the mode of dividing the current block according to the maximum unit size information as the division mode of the current block.
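The cost comparison described above can be sketched as a greedy loop. This is a minimal sketch, not the encoder's actual search: it assumes every node at a level is split with the same preset mode, and the names `quad_split`, `choose_division`, and the toy cost function are hypothetical.

```python
from typing import Callable, List, Tuple

Block = Tuple[int, int, int, int]  # (x, y, width, height)


def quad_split(block: Block, min_size: int = 8) -> List[Block]:
    """Hypothetical preset division mode: quadtree split down to min_size."""
    x, y, w, h = block
    if w <= min_size or h <= min_size:
        return [block]  # too small to split further
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def choose_division(
    first_nodes: List[Block],
    split: Callable[[Block], List[Block]],
    rd_cost: Callable[[List[Block]], float],
) -> Tuple[List[Block], float, int]:
    """Keep dividing while the rate-distortion cost strictly decreases.

    first_nodes are the sub-blocks obtained from the maximum unit size;
    their cost plays the role of the first rate-distortion cost value.
    """
    best_nodes = list(first_nodes)
    best_cost = rd_cost(best_nodes)
    levels = 0
    while True:
        candidate = [sub for node in best_nodes for sub in split(node)]
        if candidate == best_nodes:   # no node could be divided further
            break
        cost = rd_cost(candidate)     # second / third cost value
        if cost >= best_cost:         # deeper division does not help: stop
            break
        best_nodes, best_cost = candidate, cost
        levels += 1
    return best_nodes, best_cost, levels


# Toy cost (an assumption for illustration): cheapest at exactly four nodes.
toy_cost = lambda nodes: abs(len(nodes) - 4) + 1.0
nodes, cost, levels = choose_division([(0, 0, 64, 64)], quad_split, toy_cost)
```

When the first deeper division is already no cheaper, the loop returns the division given by the maximum unit size unchanged, which corresponds to the "greater than or equal to" case.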
- the preset division mode includes a quadtree division mode and/or a multi-type tree division mode;
- the multi-type tree division mode includes at least one of the following: a vertical binary tree division mode, a horizontal binary tree division mode, a vertical ternary tree division mode, and a horizontal ternary tree division mode.
- the block division unit 902 is further configured to perform block division on the video image to obtain N blocks of preset size; wherein, N is an integer greater than zero, and the N blocks do not overlap each other;
- the first determining unit 901 is further configured to perform texture analysis on the N blocks to determine a first number; wherein the first number represents the number of blocks, among the N blocks, whose texture values are less than the first threshold; and determine the maximum unit size information of the current block according to the result of comparing the first number with the second threshold.
- the encoder 90 may further include a calculation unit 904 configured to calculate the texture values of the N blocks;
- the first determining unit 901 is further configured to compare the texture values of the N blocks with the first threshold in sequence; and to count the number of blocks whose texture values are less than the first threshold according to the comparison result to obtain the first number.
- the calculation unit 904 is specifically configured to perform variance value calculation on the kth block to obtain the texture value of the kth block; wherein k is an integer greater than or equal to zero and less than N.
- the calculation unit 904 is specifically configured to determine the absolute value of the horizontal gradient and the absolute value of the vertical gradient of the kth block; and sum the absolute value of the horizontal gradient and the absolute value of the vertical gradient to obtain the texture value of the kth block; where k is an integer greater than or equal to zero and less than N.
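The two texture measures mentioned above can be sketched as follows. This is an illustration under assumptions: a block is a list of rows of luma samples, and the gradient is taken as simple forward differences between neighbouring samples, which the text does not specify. The function names are hypothetical.

```python
from typing import List

Block = List[List[int]]  # rows of luma samples


def variance_texture(block: Block) -> float:
    """Texture value as the population variance of the block's samples."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def gradient_texture(block: Block) -> int:
    """Texture value as sum of absolute horizontal and vertical gradients.

    Assumption: gradients are forward differences between adjacent samples.
    """
    horizontal = sum(abs(row[x + 1] - row[x])
                     for row in block
                     for x in range(len(row) - 1))
    vertical = sum(abs(block[y + 1][x] - block[y][x])
                   for y in range(len(block) - 1)
                   for x in range(len(block[0])))
    return horizontal + vertical


flat = [[5, 5, 5, 5] for _ in range(4)]      # smooth block
edge = [[0, 0, 10, 10] for _ in range(4)]    # block with a vertical edge
```

A flat block yields zero under both measures, while the block with an edge yields a larger value, which is the property the threshold comparison relies on.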
- the first determining unit 901 is further configured to determine a ratio of the first number to N and compare the ratio with a second threshold; and when the ratio is less than the second threshold, determine that the maximum unit size information of the current block is a first size value.
- the first determining unit 901 is further configured to compare the ratio with a third threshold when the ratio is greater than or equal to the second threshold; and if the ratio is less than the third threshold, determine that the maximum unit size information of the current block is a second size value; wherein the second size value is different from the first size value; the first threshold is different from the second threshold and the third threshold, and the second threshold is smaller than the third threshold.
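The threshold cascade above can be sketched as a small selection function. This is a sketch under assumptions: the text here does not say which size applies when the ratio reaches the third threshold, so `fallback_size` is a hypothetical parameter, and the concrete sizes and thresholds in the usage lines are placeholders rather than values taken from the application.

```python
from typing import Sequence


def select_max_unit_size(
    texture_values: Sequence[float],
    first_threshold: float,
    second_threshold: float,
    third_threshold: float,
    first_size: int,
    second_size: int,
    fallback_size: int,  # hypothetical: size used when ratio >= third_threshold
) -> int:
    """Pick the maximum unit size from the share of low-texture blocks."""
    n = len(texture_values)
    if n == 0:
        raise ValueError("N must be greater than zero")
    if not second_threshold < third_threshold:
        raise ValueError("second threshold must be smaller than third threshold")
    # First number: blocks whose texture value is below the first threshold.
    first_number = sum(1 for t in texture_values if t < first_threshold)
    ratio = first_number / n
    if ratio < second_threshold:
        return first_size
    if ratio < third_threshold:
        return second_size
    return fallback_size


# Placeholder values for illustration only.
size = select_max_unit_size([1, 2, 50, 60], first_threshold=10,
                            second_threshold=0.3, third_threshold=0.7,
                            first_size=32, second_size=64, fallback_size=128)
```

Here two of the four blocks fall below the first threshold, so the ratio is 0.5 and the second size value is selected.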
- the first determining unit 901 is further configured to determine identification information of the video image; and when the identification information of the video image indicates that the video image is a high bit depth video, perform the step of determining the maximum unit size information of the current block based on the texture information of the video image.
- the first determining unit 901 is further configured to determine that the value of the identification information of the video image is a first value if the identification information of the video image indicates that the video image is a high bit depth video; and determine that the value of the identification information of the video image is a second value if the identification information indicates that the video image is a non-high bit depth video.
- the encoding unit 903 is further configured to encode the identification information of the video image, and write the encoded bits into the code stream.
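Signalling the identification information can be sketched as a one-bit flag. This is only an illustration: the text does not give the actual first and second values, the bit-depth boundary for "high bit depth", or the syntax element used, so `FIRST_VALUE`, `SECOND_VALUE`, `hbd_min_bit_depth`, and the list-of-bits code stream are all assumptions.

```python
from typing import List, Tuple

FIRST_VALUE = 1   # assumed value meaning "high bit depth video"
SECOND_VALUE = 0  # assumed value meaning "non-high bit depth video"


def identification_value(bit_depth: int, hbd_min_bit_depth: int) -> int:
    """Map the video bit depth to the flag value.

    hbd_min_bit_depth is a hypothetical boundary supplied by the caller.
    """
    return FIRST_VALUE if bit_depth >= hbd_min_bit_depth else SECOND_VALUE


def write_identification(bits: List[int], value: int) -> None:
    """Encoder side: append the flag to a toy code stream (a list of bits)."""
    bits.append(value & 1)


def read_identification(bits: List[int], pos: int) -> Tuple[bool, int]:
    """Decoder side: read the flag; return (is_high_bit_depth, next position)."""
    return bits[pos] == FIRST_VALUE, pos + 1


stream: List[int] = []
write_identification(stream, identification_value(12, hbd_min_bit_depth=12))
is_hbd, next_pos = read_identification(stream, 0)
```

Only when the flag indicates a high bit depth video would the encoder go on to the texture-based determination of the maximum unit size.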
- the level of the maximum unit size information is at least one of the following: sequence level, picture level.
- a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, and so on; it may be a module, or it may be non-modular.
- each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
- the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
- if the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium.
- the technical solution of this embodiment, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method described in this embodiment.
- the aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (Read Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, and other media that can store program code.
- an embodiment of the present application provides a computer storage medium, which is applied to the encoder 90, where the computer storage medium stores a computer program, and when the computer program is executed by the first processor, the method described in any one of the foregoing embodiments is implemented.
- FIG. 10 shows a schematic diagram of a specific hardware structure of the encoder 90 provided by the embodiment of the present application.
- it may include: a first communication interface 1001, a first memory 1002, and a first processor 1003; the components are coupled together through a first bus system 1004.
- the first bus system 1004 is used to realize the connection and communication between these components.
- the first bus system 1004 also includes a power bus, a control bus and a status signal bus.
- the various buses are designated as the first bus system 1004 in FIG. 10, wherein:
- the first communication interface 1001 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
- a first memory 1002 for storing a computer program that can run on the first processor 1003;
- the first processor 1003 is configured to, when running the computer program, execute:
- the current block is encoded according to the block partition parameter.
- the first memory 1002 in this embodiment of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories.
- the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
- Volatile memory may be Random Access Memory (RAM), which acts as an external cache.
- Static RAM (SRAM)
- Dynamic RAM (DRAM)
- Synchronous DRAM (SDRAM)
- Double Data Rate SDRAM (DDRSDRAM)
- Enhanced SDRAM (ESDRAM)
- Synchlink DRAM (SLDRAM)
- Direct Rambus RAM (DRRAM)
- the first processor 1003 may be an integrated circuit chip with signal processing capability. In the implementation process, each step of the above-mentioned method may be completed by an integrated logic circuit of hardware in the first processor 1003 or an instruction in the form of software.
- the above-mentioned first processor 1003 can be a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
- the methods, steps, and logic block diagrams disclosed in the embodiments of this application can be implemented or executed.
- a general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
- the steps of the method disclosed in conjunction with the embodiments of the present application may be directly embodied as executed by a hardware decoding processor, or executed by a combination of hardware and software modules in the decoding processor.
- the software modules may be located in random access memory, flash memory, read-only memory, programmable read-only memory or electrically erasable programmable memory, registers and other storage media mature in the art.
- the storage medium is located in the first memory 1002, and the first processor 1003 reads the information in the first memory 1002, and completes the steps of the above method in combination with its hardware.
- the embodiments described herein may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof.
- the processing unit can be implemented in one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSP Device, DSPD), Programmable Logic Devices (PLD), Field-Programmable Gate Arrays (FPGA), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described herein, or a combination thereof.
- the techniques described herein may be implemented through modules (e.g., procedures, functions, etc.) that perform the functions described herein.
- Software codes may be stored in memory and executed by a processor.
- the memory can be implemented in the processor or external to the processor.
- the first processor 1003 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
- This embodiment provides an encoder, and the encoder may include a first determination unit, a block division unit, and a coding unit.
- FIG. 11 shows a schematic structural diagram of a decoder 110 provided by an embodiment of the present application.
- the decoder 110 may include: a parsing unit 1101 and a second determining unit 1102; wherein,
- the parsing unit 1101 is configured to parse the code stream and determine the block division parameter of the current block
- the parsing unit 1101 is further configured to parse the code stream based on the block division parameter to determine the predicted value of the current block; and based on the block division parameter, parse the code stream to determine the residual value of the current block;
- the second determining unit 1102 is configured to determine the reconstruction value of the current block based on the predicted value and the residual value.
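The reconstruction step can be sketched as a sample-wise sum of the predicted value and the residual value. This is a minimal sketch: clipping the result to the valid sample range for the bit depth is an assumption based on common codec practice, not something stated here, and `reconstruct_block` is a hypothetical name.

```python
from typing import List

Samples = List[List[int]]  # rows of samples


def reconstruct_block(pred: Samples, resid: Samples, bit_depth: int) -> Samples:
    """Reconstructed value = predicted value + residual value, per sample.

    Assumption: results are clipped to [0, 2**bit_depth - 1].
    """
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]


recon = reconstruct_block([[100, 250], [0, 5]], [[-5, 10], [-3, 2]], bit_depth=8)
```

In this example the sample 250 + 10 saturates at 255 and 0 - 3 saturates at 0, so the result stays within the 8-bit range.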
- the parsing unit 1101 is further configured to parse the code stream to obtain identification information of the video image
- the second determining unit 1102 is further configured to, if the value of the identification information of the video image is the first value, determine that the identification information of the video image indicates that the video image is a high bit depth video; or, if the value of the identification information of the video image is the second value, determine that the identification information of the video image indicates that the video image is a non-high bit depth video.
- the second determining unit 1102 is further configured to determine a division mode of the current block based on the block division parameter; and determine a division tree of the current block according to the division mode, wherein the division tree comprises one or more node sub-blocks obtained by dividing the current block.
- the parsing unit 1101 is further configured to sequentially parse the code stream of each node sub-block of the partition tree according to a preset node sub-block processing order, and determine the prediction mode of each node sub-block;
- the second determining unit 1102 is further configured to determine the prediction value of each node sub-block according to the prediction mode.
- the parsing unit 1101 is further configured to sequentially parse the code stream of each node sub-block of the partition tree according to the preset node sub-block processing order, and determine the residual value of each node sub-block.
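Visiting the node sub-blocks of the division tree in a fixed order can be sketched with a small tree structure. This is an illustration under assumptions: the preset processing order is taken to be depth-first with children visited in stored order, and `Node`, `leaves_in_order`, `decode_tree`, and the `parse_leaf` callback are hypothetical names standing in for the real per-node parsing.

```python
from dataclasses import dataclass, field
from typing import Callable, Iterator, List, TypeVar

T = TypeVar("T")


@dataclass
class Node:
    """One node of the division tree; a node without children is a leaf."""
    x: int
    y: int
    width: int
    height: int
    children: List["Node"] = field(default_factory=list)


def leaves_in_order(node: Node) -> Iterator[Node]:
    """Yield leaf sub-blocks depth-first, children in stored order."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from leaves_in_order(child)


def decode_tree(root: Node, parse_leaf: Callable[[Node], T]) -> List[T]:
    """Apply the per-node parsing callback to every leaf, in processing order."""
    return [parse_leaf(leaf) for leaf in leaves_in_order(root)]


root = Node(0, 0, 16, 16, [
    Node(0, 0, 8, 8),
    Node(8, 0, 8, 8, [Node(8, 0, 4, 8), Node(12, 0, 4, 8)]),
    Node(0, 8, 8, 8),
    Node(8, 8, 8, 8),
])
order = decode_tree(root, lambda n: (n.x, n.y, n.width, n.height))
```

The same traversal could be used twice, once with a callback that parses the prediction mode and once with a callback that parses the residual value of each node sub-block.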
- the division mode has an associated relationship with texture information of the video image.
- a "unit" may be a part of a circuit, a part of a processor, a part of a program or software, and so on; it may be a module, or it may be non-modular.
- each component in this embodiment may be integrated into one processing unit, or each unit may exist physically alone, or two or more units may be integrated into one unit.
- the above-mentioned integrated units can be implemented in the form of hardware, or can be implemented in the form of software function modules.
- if the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium.
- this embodiment provides a computer storage medium, which is applied to the decoder 110, where the computer storage medium stores a computer program, and when the computer program is executed by the second processor, the method described in any one of the foregoing embodiments is implemented.
- FIG. 12 shows a schematic diagram of a specific hardware structure of the decoder 110 provided by the embodiment of the present application.
- it may include: a second communication interface 1201, a second memory 1202, and a second processor 1203; the components are coupled together through a second bus system 1204.
- the second bus system 1204 is used to implement connection communication between these components.
- the second bus system 1204 also includes a power bus, a control bus, and a status signal bus.
- the various buses are labeled as the second bus system 1204 in FIG. 12, wherein:
- the second communication interface 1201 is used for receiving and sending signals in the process of sending and receiving information with other external network elements;
- a second memory 1202 for storing computer programs that can run on the second processor 1203;
- the second processor 1203 is configured to, when running the computer program, execute:
- the code stream is parsed to determine the predicted value of the current block
- the code stream is parsed to determine the residual value of the current block
- the reconstructed value of the current block is determined.
- the second processor 1203 is further configured to execute the method described in any one of the foregoing embodiments when running the computer program.
- the hardware function of the second memory 1202 is similar to that of the first memory 1002, and the hardware function of the second processor 1203 is similar to that of the first processor 1003; details are not described here.
- This embodiment provides a decoder, and the decoder may include a parsing unit and a second determining unit.
- the maximum unit size information of the current block is determined based on the texture information of the video image; the current block is preprocessed according to the maximum unit size information to determine the division mode of the current block; the block division parameter of the current block is determined according to the division mode; and the current block is encoded according to the block division parameter.
- the code stream is parsed to determine the block division parameter of the current block; based on the block division parameter, the code stream is parsed to determine the predicted value of the current block; based on the block division parameter, the code stream is parsed to determine the residual value of the current block; and, based on the predicted value and the residual value, the reconstructed value of the current block is determined.
- the technical solution of the present application designs a mechanism that adapts the maximum unit size to the image texture, so that the prediction and transform calculations for large-size blocks can be skipped directly. As a result, the total number of recursions in block division decreases exponentially, which significantly reduces coding complexity and coding time while keeping the performance gain basically unchanged, thereby improving coding and decoding efficiency.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Embodiments of the present application disclose block division methods, encoders, decoders, and a computer storage medium. A block division method comprises: determining, on the basis of texture information of a video image, maximum unit size information of a current block; pre-processing the current block according to the maximum unit size information, and determining a division mode of the current block; determining a block division parameter of the current block according to the division mode; and encoding the current block according to the block division parameter. As such, since the maximum unit size information of the current block is determined according to the texture information of the video image, that is, an adaptive image texture mechanism is designed for a maximum unit size, the prediction of large-size blocks and calculation of a transformation process can be directly skipped, leading to the total number of recursions for block division being exponentially reduced, thereby significantly reducing encoding complexity while maintaining performance gain to be substantially unchanged, and reducing encoding time, and thus improving the efficiency of encoding and decoding.
Description
The embodiments of the present application relate to the technical field of video coding and decoding, and in particular, to a block division method, an encoder, a decoder, and a computer storage medium.
With rising expectations for video display quality, new video applications such as high-definition and ultra-high-definition video have emerged. H.265/High Efficiency Video Coding (HEVC) can no longer meet the needs of rapidly developing video applications, so the Joint Video Experts Team (JVET) proposed the next-generation video coding standard H.266/Versatile Video Coding (VVC), whose corresponding test model is the VVC reference software test platform (VVC Test Model, VTM).
In current VVC block division technology, the Quad Tree with nested Multi-type Tree (QTMT) mode makes the coding complexity of VVC far exceed that of HEVC; moreover, when coding High Bit Depth (HBD) video, the prediction and rate-distortion cost calculation for large-size coding blocks may generate a large amount of unnecessary overhead, wasting computing resources and increasing coding time.
SUMMARY OF THE INVENTION
Embodiments of the present application provide a block division method, an encoder, a decoder, and a computer storage medium, which can reduce coding complexity and thereby improve coding and decoding efficiency.
The technical solutions of the embodiments of the present application can be implemented as follows:
In a first aspect, an embodiment of the present application provides a block division method, which is applied to an encoder, and the method includes:
determining the maximum unit size information of the current block based on the texture information of the video image;
preprocessing the current block according to the maximum unit size information to determine the division mode of the current block;
determining the block division parameter of the current block according to the division mode;
encoding the current block according to the block division parameter.
In a second aspect, an embodiment of the present application provides a block division method, which is applied to a decoder, and the method includes:
parsing the code stream to determine the block division parameter of the current block;
parsing the code stream based on the block division parameter to determine the predicted value of the current block;
parsing the code stream based on the block division parameter to determine the residual value of the current block;
determining the reconstructed value of the current block based on the predicted value and the residual value.
In a third aspect, an embodiment of the present application provides an encoder, the encoder including a first determining unit, a block dividing unit, and an encoding unit; wherein,
the first determining unit is configured to determine the maximum unit size information of the current block based on the texture information of the video image;
the block dividing unit is configured to preprocess the current block according to the maximum unit size information to determine a division mode of the current block; and to determine a block division parameter of the current block according to the division mode;
the encoding unit is configured to encode the current block according to the block division parameter.
In a fourth aspect, an embodiment of the present application provides an encoder, the encoder including a first memory and a first processor; wherein,
the first memory is used for storing a computer program executable on the first processor;
the first processor is configured to execute the method according to the first aspect when running the computer program.
In a fifth aspect, an embodiment of the present application provides a decoder, the decoder including a parsing unit and a second determining unit; wherein,
the parsing unit is configured to parse the code stream and determine the block division parameter of the current block;
the parsing unit is further configured to parse the code stream based on the block division parameter to determine the predicted value of the current block; and to parse the code stream based on the block division parameter to determine the residual value of the current block;
the second determining unit is configured to determine the reconstructed value of the current block based on the predicted value and the residual value.
In a sixth aspect, an embodiment of the present application provides a decoder, the decoder including a second memory and a second processor; wherein,
the second memory is used for storing a computer program executable on the second processor;
the second processor is configured to execute the method according to the second aspect when running the computer program.
In a seventh aspect, an embodiment of the present application provides a computer storage medium, where the computer storage medium stores a computer program, and when the computer program is executed, the method described in the first aspect or the method described in the second aspect is implemented.
The embodiments of the present application provide a block division method, an encoder, a decoder, and a computer storage medium. On the encoder side, the maximum unit size information of the current block is determined based on the texture information of the video image; the current block is preprocessed according to the maximum unit size information to determine the division mode of the current block; the block division parameter of the current block is determined according to the division mode; and the current block is encoded according to the block division parameter. On the decoder side, the code stream is parsed to determine the block division parameter of the current block; based on the block division parameter, the code stream is parsed to determine the predicted value of the current block; based on the block division parameter, the code stream is parsed to determine the residual value of the current block; and, based on the predicted value and the residual value, the reconstructed value of the current block is determined. In this way, since the maximum unit size information of the current block is determined according to the texture information of the video image, that is, the technical solution of the present application designs a mechanism that adapts the maximum unit size to the image texture, the prediction and transform calculations for large-size blocks can be skipped directly. As a result, the total number of recursions in block division decreases exponentially, which significantly reduces coding complexity and coding time while keeping the performance gain basically unchanged, thereby improving coding and decoding efficiency.
FIG. 1 is a schematic structural diagram of a multi-type tree provided by the related art;
FIG. 2 is a schematic flowchart of a block division provided by the related art;
FIG. 3 is a schematic structural diagram of another block division provided by the related art;
FIG. 4A is a schematic block diagram of the composition of an encoder provided by an embodiment of the present application;
FIG. 4B is a schematic block diagram of the composition of a decoder provided by an embodiment of the present application;
FIG. 5 is a schematic flowchart of a block division method provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of determining maximum unit size information provided by an embodiment of the present application;
FIG. 7 is a detailed schematic flowchart of determining maximum unit size information provided by an embodiment of the present application;
FIG. 8 is a schematic flowchart of another block division method provided by an embodiment of the present application;
FIG. 9 is a schematic diagram of the composition and structure of an encoder provided by an embodiment of the present application;
FIG. 10 is a schematic diagram of a specific hardware structure of an encoder provided by an embodiment of the present application;
FIG. 11 is a schematic diagram of the composition and structure of a decoder provided by an embodiment of the present application;
FIG. 12 is a schematic diagram of a specific hardware structure of a decoder provided by an embodiment of the present application.
In order to understand the features and technical content of the embodiments of the present application in more detail, the implementation of the embodiments of the present application is described in detail below with reference to the accompanying drawings. The accompanying drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the technical field to which this application belongs. The terms used herein are only for the purpose of describing the embodiments of the present application, and are not intended to limit the present application.
In the following description, reference is made to "some embodiments", which describe a subset of all possible embodiments; it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other where there is no conflict. It should also be noted that the terms "first\second\third" in the embodiments of the present application are used only to distinguish similar objects and do not represent a specific ordering of the objects; it is understood that, where permitted, the specific order or sequence of "first\second\third" may be interchanged, so that the embodiments described herein can be implemented in an order other than that illustrated or described herein.
In a video image, a first image component, a second image component, and a third image component are generally used to represent a coding block (Coding Block, CB); these three image components are a luma component, a blue chroma component, and a red chroma component, respectively. Specifically, the luma component is usually denoted by the symbol Y, the blue chroma component by Cb or U, and the red chroma component by Cr or V; thus, a video image can be represented in YCbCr format or in YUV format.
Before the embodiments of the present application are described in further detail, the nouns and terms involved in the embodiments of the present application are explained; they are to be interpreted as follows:
Moving Picture Experts Group (MPEG)
International Organization for Standardization (ISO)
International Electrotechnical Commission (IEC)
Joint Video Experts Team (JVET)
Alliance for Open Media (AOM)
H.265/High Efficiency Video Coding (HEVC)
H.266/Versatile Video Coding (VVC)
VVC Test Model (VTM), the reference software test platform of VVC
Audio Video Standard (AVS)
High-Performance Model (HPM), the high-performance test model of AVS
Binary Tree (BT)
Ternary Tree (TT)
Quad Tree (QT)
Multi-type Tree (MT)
Quad Tree with nested Multi-type Tree (QTMT)
High Bit Depth (HBD)
Quantization Parameter (QP)
Coding Unit (CU)
Coding Tree Unit (CTU)
At present, the block division technology of VVC adopts a more complex coding unit division structure than HEVC, such as the QTMT mode. Specifically, on top of the quadtree (QT) division of HEVC, two binary tree divisions (vertical binary tree and horizontal binary tree) and two ternary tree divisions (vertical ternary tree and horizontal ternary tree) are added. The binary tree (BT) and the ternary tree (TT) may be collectively referred to as the multi-type tree (MT). FIG. 1 shows a schematic structural diagram of a multi-type tree provided by the related art.
It can be understood that a CTU is first divided using a quadtree, and the leaf nodes of the quadtree can then be further divided using the MT. The CU division flow is shown in FIG. 2. In FIG. 2, for a CTU/quadtree node, it is first determined whether to perform quadtree division. If the identification information (such as a flag) has the value 1, quadtree division is performed and quadtree nodes are obtained. If the identification information has the value 0, quadtree division is not performed and a quadtree leaf node/multi-type tree node is obtained; it is then determined whether to perform multi-type tree division, until the division yields a multi-type tree node with vertical binary division, vertical ternary division, horizontal binary division, or horizontal ternary division.
The following points should be noted regarding FIG. 2:
(a) The default CTU size of VVC is 128×128, and the minimum CU size is 4×4;
(b) By default, the CTU is first divided into 4 sub-CUs in QT mode;
(c) If a CU is divided using MT, QT division can no longer be performed subsequently;
(d) In theory, a QT node can be divided in the 5 ways shown in FIG. 2, and an MT node in the 4 ways shown in FIG. 2. The 5 division results for a QT node are: a quadtree node, a multi-type tree node with vertical binary division, a multi-type tree node with vertical ternary division, a multi-type tree node with horizontal binary division, and a multi-type tree node with horizontal ternary division. The 4 division results for an MT node are: a multi-type tree node with vertical binary division, a multi-type tree node with vertical ternary division, a multi-type tree node with horizontal binary division, and a multi-type tree node with horizontal ternary division;
(e) FIG. 3 shows a schematic structural diagram of a QTMT block division provided by the related art, which can be regarded as a specific example of the final CTU division of VVC.
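As an illustration of notes (c) and (d), the following minimal Python sketch lists the candidate divisions for a QT node and for an MT node and computes the sub-block sizes each division would produce. The names `Split`, `candidate_splits`, and `sub_blocks` are hypothetical, and the 1:2:1 ratio used for the ternary divisions is the ratio used by VVC rather than something stated in the text above.

```python
from enum import Enum


class Split(Enum):
    QT = "quadtree"
    BT_V = "vertical binary"
    BT_H = "horizontal binary"
    TT_V = "vertical ternary"
    TT_H = "horizontal ternary"


def candidate_splits(is_qt_node):
    """A QT node may use all 5 divisions; an MT node only the 4 MT divisions."""
    mt_splits = [Split.BT_V, Split.BT_H, Split.TT_V, Split.TT_H]
    return [Split.QT] + mt_splits if is_qt_node else mt_splits


def sub_blocks(width, height, split):
    """Return the (width, height) of every child produced by one division."""
    if split is Split.QT:
        return [(width // 2, height // 2)] * 4
    if split is Split.BT_V:
        return [(width // 2, height)] * 2
    if split is Split.BT_H:
        return [(width, height // 2)] * 2
    if split is Split.TT_V:
        # Assumed 1:2:1 ratio across the width.
        return [(width // 4, height), (width // 2, height), (width // 4, height)]
    # Split.TT_H: assumed 1:2:1 ratio across the height.
    return [(width, height // 4), (width, height // 2), (width, height // 4)]
```

For example, `sub_blocks(128, 128, Split.QT)` yields four 64×64 children, which corresponds to the default first division of a CTU in note (b).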
In the encoder or decoder, QTMT block division is located in the intra/inter prediction module. For each candidate block division, the corresponding reference block is searched for prediction, and the division mode with the smallest rate-distortion cost is then selected. This yields the final prediction residual, on which subsequent processes such as transform and quantization can be performed.
In a specific example, an encoder using QTMT is implemented as follows. First, the input image is divided into multiple non-overlapping CTU blocks. Then, each CTU is processed in raster scan order and divided into several CUs, which mainly involves the following 4 steps: ① calculate the first rate-distortion cost result of predictive coding without division (denoted RdCost0); ② divide the CTU in QT mode, perform predictive coding, and calculate the second rate-distortion cost result (denoted RdCost1); ③ compare RdCost0 and RdCost1, and if RdCost1 is smaller, continue to process the 4 sub-CUs in turn; ④ divide and predict each CU in QT or MT mode, calculate the rate-distortion cost result of each division way, select the current optimal RdCost by comparison, and recurse until the block division mode with the smallest rate-distortion cost is selected. Finally, the residual block is calculated according to the optimal block division mode; the residual block is then transformed, quantized, and entropy coded, prediction information such as the block division mode is encoded, and the code stream is output for transmission.
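The comparison of RdCost0 and RdCost1 in steps ① to ④ can be sketched as a recursive search. The following Python sketch is restricted to quadtree division and uses a toy cost (squared error around the block mean plus a constant per-block rate term) in place of real predictive coding. `toy_rd_cost`, `best_partition`, and the `lam` weight are hypothetical names introduced for illustration; the sketch evaluates all four children before comparing, which is only one possible way to realize the recursion.

```python
def toy_rd_cost(img, x, y, size, lam=10.0):
    """Stand-in RD cost: SSE around the block mean plus a fixed rate term."""
    vals = [img[r][c] for r in range(y, y + size) for c in range(x, x + size)]
    mean = sum(vals) / len(vals)
    distortion = sum((v - mean) ** 2 for v in vals)
    return distortion + lam


def best_partition(img, x, y, size, min_size=4):
    """Return (cost, tree); a leaf is (x, y, size), a split is a list of 4 trees."""
    cost_unsplit = toy_rd_cost(img, x, y, size)  # plays the role of RdCost0
    if size <= min_size:
        return cost_unsplit, (x, y, size)
    half = size // 2
    children = [
        best_partition(img, x + dx, y + dy, half, min_size)
        for dy in (0, half)
        for dx in (0, half)
    ]
    cost_split = sum(cost for cost, _ in children)  # plays the role of RdCost1
    if cost_split < cost_unsplit:
        return cost_split, [tree for _, tree in children]
    return cost_unsplit, (x, y, size)
```

With this toy cost, a flat block is kept whole because dividing it only adds rate, while a block whose four quadrants differ strongly is divided.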
In another specific example, a decoder using QTMT is implemented as follows. First, entropy decoding, inverse quantization, and inverse transform are performed on the input code stream to obtain the residual block. Next, the image is reconstructed from the residual block; reconstruction mainly involves the following 3 steps: ① determine the partition tree of the current CTU according to prediction information such as the block division mode; ② process each CU of the partition tree in raster scan order and locate the prediction block using information such as motion vectors; ③ add the residual value and the predicted value of the current CU to obtain the reconstructed CU. Finally, the reconstructed image is sent to the deblocking filter (DBF)/sample adaptive offset (SAO) filter/adaptive loop filter (ALF), and the filtered image is placed in the buffer to await video playback.
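Step ③ of the reconstruction can be illustrated with a short Python sketch that adds the residual to the prediction sample by sample. The function name `reconstruct_block` is hypothetical, and clipping the result to the valid sample range for the bit depth is an assumption based on common codec practice rather than something the text above states.

```python
def reconstruct_block(pred, resid, bit_depth=12):
    """Add residual to prediction and clip to [0, 2**bit_depth - 1] (assumed)."""
    max_val = (1 << bit_depth) - 1
    return [
        [min(max(p + r, 0), max_val) for p, r in zip(pred_row, resid_row)]
        for pred_row, resid_row in zip(pred, resid)
    ]
```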
However, in the existing VVC block division technology, even if only quadtree division is considered, there are $4^5+4^4+4^3+4^2+4^1+4^0=1365$ division modes, far exceeding the 341 modes of HEVC. With binary tree, ternary tree, and other division ways added, the total number of division modes is theoretically several thousand. As a result, the current QTMT mode makes the coding complexity of VVC far higher than that of HEVC; for example, encoding a single high-definition (1080p) video sequence can take several days. In addition, the target quality of high bit depth (HBD) video coding is ultra-high definition. As can be seen from the recommended VTM test configuration, the coding QPs of HBD sequences in the 12-bit configuration are -13, -8, -3, 2, 7, 12, so the final size of most coding blocks is small. That is, when the existing VVC block division technology encodes HBD video, the prediction and rate-distortion cost (RDcost) calculation for large coding blocks generates a large amount of unnecessary overhead, wastes computing resources, and increases encoding time.
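The count of 1365 is the sum of the powers of 4 from 4^0 to 4^5, which can be checked directly. In the sketch below, `quadtree_mode_count` is a hypothetical helper; reading the HEVC figure of 341 as the same series truncated after 4^4 is an interpretation that matches the number, not something the text states.

```python
def quadtree_mode_count(max_exponent):
    """Sum 4**k for k = 0 .. max_exponent."""
    return sum(4 ** k for k in range(max_exponent + 1))


vvc_count = quadtree_mode_count(5)   # 1 + 4 + 16 + 64 + 256 + 1024
hevc_count = quadtree_mode_count(4)  # 1 + 4 + 16 + 64 + 256
```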
The embodiments of the present application provide a block division method. On the encoder side: the maximum unit size information of the current block is determined based on the texture information of a video image; the current block is preprocessed according to the maximum unit size information to determine the division mode of the current block; the block division parameters of the current block are determined according to the division mode; and the current block is encoded according to the block division parameters. On the decoder side: the code stream is parsed to determine the block division parameters of the current block; based on the block division parameters, the code stream is parsed to determine the predicted value of the current block; based on the block division parameters, the code stream is parsed to determine the residual value of the current block; and the reconstructed value of the current block is determined based on the predicted value and the residual value.
In this way, the maximum unit size information of the current block is determined according to the texture information of the video image; that is, the technical solution of the present application provides a mechanism that adapts the maximum unit size to the image texture. The prediction and transform calculations for large blocks can then be skipped directly, so that the total number of recursions of block division decreases exponentially. Coding complexity and encoding time are thereby significantly reduced while the performance gain remains essentially unchanged, which improves encoding and decoding efficiency.
The embodiments of the present application are described in detail below with reference to the accompanying drawings.
Referring to FIG. 4A, a schematic block diagram of an encoder provided by an embodiment of the present application is shown. As shown in FIG. 4A, the encoder 10 includes a transform and quantization unit 101, an intra estimation unit 102, an intra prediction unit 103, a motion compensation unit 104, a motion estimation unit 105, an inverse transform and inverse quantization unit 106, a filter control analysis unit 107, a filtering unit 108, an encoding unit 109, a decoded image buffer unit 110, and so on. The filtering unit 108 can implement DBF filtering/SAO filtering/ALF filtering, and the encoding unit 109 can implement header information encoding and context-based adaptive binary arithmetic coding (CABAC). For the input original video signal, a video coding block is obtained by coding tree unit (CTU) division; the residual pixel information obtained after intra or inter prediction is then transformed by the transform and quantization unit 101, which includes transforming the residual information from the pixel domain to the transform domain and quantizing the resulting transform coefficients to further reduce the bit rate. The intra estimation unit 102 and the intra prediction unit 103 perform intra prediction on the video coding block; specifically, they determine the intra prediction mode to be used to encode the video coding block. The motion compensation unit 104 and the motion estimation unit 105 perform inter-predictive coding of the received video coding block relative to one or more blocks in one or more reference frames to provide temporal prediction information. Motion estimation, performed by the motion estimation unit 105, is the process of generating motion vectors that estimate the motion of the video coding block; the motion compensation unit 104 then performs motion compensation based on the motion vectors determined by the motion estimation unit 105. After the intra prediction mode is determined, the intra prediction unit 103 also provides the selected intra prediction data to the encoding unit 109, and the motion estimation unit 105 likewise sends the calculated motion vector data to the encoding unit 109. In addition, the inverse transform and inverse quantization unit 106 is used for reconstruction of the video coding block: a residual block is reconstructed in the pixel domain, blocking artifacts are removed from the reconstructed residual block by the filter control analysis unit 107 and the filtering unit 108, and the reconstructed residual block is then added to a predictive block in a frame of the decoded image buffer unit 110 to produce a reconstructed video coding block. The encoding unit 109 encodes the various coding parameters and the quantized transform coefficients; in a CABAC-based coding algorithm, the context may be based on adjacent coding blocks and may be used to encode information indicating the determined intra prediction mode, and the code stream of the video signal is output. The decoded image buffer unit 110 stores the reconstructed video coding blocks for prediction reference. As video image coding proceeds, new reconstructed video coding blocks are continuously generated, and all of them are stored in the decoded image buffer unit 110.
Referring to FIG. 4B, a schematic block diagram of a decoder provided by an embodiment of the present application is shown. As shown in FIG. 4B, the decoder 20 includes a decoding unit 201, an inverse transform and inverse quantization unit 202, an intra prediction unit 203, a motion compensation unit 204, a filtering unit 205, a decoded image buffer unit 206, and so on. The decoding unit 201 can implement header information decoding and CABAC decoding, and the filtering unit 205 can implement DBF filtering/SAO filtering/ALF filtering. After the input video signal undergoes the encoding process of FIG. 4A, the code stream of the video signal is output. The code stream is input into the decoder 20 and first passes through the decoding unit 201 to obtain the decoded transform coefficients. The transform coefficients are processed by the inverse transform and inverse quantization unit 202 to generate a residual block in the pixel domain. The intra prediction unit 203 may generate prediction data for the current video decoding block based on the determined intra prediction mode and data from previously decoded blocks of the current frame or picture. The motion compensation unit 204 determines prediction information for the video decoding block by parsing motion vectors and other associated syntax elements, and uses the prediction information to generate the predictive block of the video decoding block being decoded. A decoded video block is formed by summing the residual block from the inverse transform and inverse quantization unit 202 and the corresponding predictive block generated by the intra prediction unit 203 or the motion compensation unit 204. The decoded video signal passes through the filtering unit 205 to remove blocking artifacts, which can improve video quality. The decoded video blocks are then stored in the decoded image buffer unit 206, which stores reference images for subsequent intra prediction or motion compensation and is also used for output of the video signal, that is, the restored original video signal is obtained.
It should be noted that the block division method in the embodiments of the present application can be applied in a video codec chip, where using the QTMT mode can significantly improve coding performance. It can be applied to the intra/inter prediction part shown in FIG. 4A (indicated by the bold black box, specifically including the intra estimation unit 102, the intra prediction unit 103, the motion compensation unit 104, and the motion estimation unit 105), and also to the intra/inter prediction part shown in FIG. 4B (indicated by the bold black box, specifically including the intra prediction unit 203 and the motion compensation unit 204). That is, the block division method in the embodiments of the present application can be applied to a video encoding system (referred to as the "encoder"), to a video decoding system (referred to as the "decoder"), or to both at the same time; no limitation is imposed here.
It should also be noted that when the embodiments of the present application are applied to the encoder 10, the "current block" refers to the block currently to be encoded in the video image (also referred to as the "encoding block"); when the embodiments are applied to the decoder 20, the "current block" refers to the block currently to be decoded in the video image (also referred to as the "decoding block").
In an embodiment of the present application, referring to FIG. 5, a schematic flowchart of a block division method provided by an embodiment of the present application is shown. As shown in FIG. 5, the method may include:
S501: Determine the maximum unit size information of the current block based on the texture information of the video image.
It should be noted that the block division method of this embodiment is applied to an encoder. A video image can be divided into multiple image blocks, and each image block to be encoded can be called an encoding block. The current block here refers to the encoding block currently to be encoded, which may be a CTU or a CU, among others; the embodiments of the present application impose no limitation in this respect.
It should also be noted that the embodiments of the present application mainly provide a fast block division technology for high bit depth video based on texture analysis, that is, a technology applied to high bit depth video. Therefore, in some embodiments, the method may further include:
determining the identification information of the video image;
when the identification information of the video image indicates that the video image is a high bit depth video, performing the step of determining the maximum unit size information of the current block based on the texture information of the video image.
In the embodiments of the present application, whether the video image is a high bit depth video may be indicated by the identification information of the video image. Specifically, in some embodiments, determining the identification information of the video image may include:
if the identification information of the video image indicates that the video image is a high bit depth video, determining that the value of the identification information of the video image is a first value; or,
if the identification information of the video image indicates that the video image is not a high bit depth video, determining that the value of the identification information of the video image is a second value.
It should be noted that the first value and the second value are different, and each may be in parameter form or in numeric form. The identification information of the video image is usually a parameter written in the profile, but it may also be a flag; no limitation is imposed here.
It should also be noted that if the identification information of the video image is a flag, then in one specific example the first value may be set to 1 and the second value to 0; in another specific example the first value may be set to true and the second value to false; in yet another specific example the first value may be set to 0 and the second value to 1; or the first value may be set to false and the second value to true. The embodiments of the present application impose no limitation on the first value and the second value.
That is, the embodiments of the present application provide an encoding method, specifically a block division method. More specifically, the embodiments of the present application provide, for high bit depth video, an adaptive maximum unit size mechanism based on image texture, so that the prediction and transform calculations for large blocks can subsequently be skipped directly.
Further, in some embodiments, the method may further include: encoding the identification information of the video image and writing the encoded bits into the code stream. After the encoder writes the identification information of the video image into the code stream, the decoder can determine directly by parsing the code stream whether the video image is a high bit depth video, which facilitates subsequent operations at the decoder.
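The signalling described above can be sketched as follows, assuming the identification information is a one-bit flag with first value 1 and second value 0 (one of the options listed above). The list-of-bits "code stream" and the names `write_hbd_flag` and `read_hbd_flag` are hypothetical simplifications; a real code stream would carry the flag in a syntax structure with entropy coding.

```python
FIRST_VALUE = 1   # assumed: video image is a high bit depth video
SECOND_VALUE = 0  # assumed: video image is not a high bit depth video


def write_hbd_flag(bits, is_high_bit_depth):
    """Encoder side: append the flag to a toy code stream (a list of bits)."""
    bits.append(FIRST_VALUE if is_high_bit_depth else SECOND_VALUE)


def read_hbd_flag(bits, pos):
    """Decoder side: parse the flag at `pos`; return (is_hbd, next position)."""
    return bits[pos] == FIRST_VALUE, pos + 1


stream = []
write_hbd_flag(stream, True)
is_hbd, next_pos = read_hbd_flag(stream, 0)
# The texture-based step S501 would be performed only when is_hbd is True.
```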
It should also be noted that the maximum unit size information may be the BT maximum unit size used to limit binary tree division (denoted maxBtSize), or the TT maximum unit size used to limit ternary tree division (denoted maxTtSize). In other words, the maximum unit size information may be represented by maxBtSize or maxTtSize.
Further, in some embodiments, the level of the maximum unit size information may be at least one of the following: sequence level, picture level.
Exemplarily, when determining the maximum unit size information, a video sequence may be input and its initial frame used as the video image, from which the maximum unit size information corresponding to the initial frame is determined. In one specific example, if the maximum unit size information is at the sequence level, the entire video sequence can use this maximum unit size information. In another specific example, if the maximum unit size information is at the picture level (also called the "frame level"), the maximum unit size information may differ from frame to frame across the video sequence: each frame is used as the video image to determine its own maximum unit size information. It should also be noted that different blocks within the same frame share the same maximum unit size information.
In this way, in the embodiments of the present application, texture analysis is first performed on the high bit depth video. Once the texture information of the video image has been determined, the maximum unit size information can be further determined, which realizes the adaptive maximum unit size mechanism based on image texture.
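To make the sequence-level and picture-level options concrete, the following Python sketch derives a maxBtSize value from a texture measure. The text above does not specify how texture is measured or mapped to a size, so everything in the sketch is an assumption for illustration: the mean absolute gradient as the texture measure, the threshold values, and the names `mean_abs_gradient`, `choose_max_bt_size`, and `max_bt_sizes`.

```python
def mean_abs_gradient(frame):
    """Assumed texture measure: mean absolute difference between neighbours."""
    height, width = len(frame), len(frame[0])
    total, count = 0, 0
    for y in range(height):
        for x in range(width):
            if x + 1 < width:
                total += abs(frame[y][x + 1] - frame[y][x])
                count += 1
            if y + 1 < height:
                total += abs(frame[y + 1][x] - frame[y][x])
                count += 1
    return total / count if count else 0.0


def choose_max_bt_size(frame, thresholds=((8.0, 128), (32.0, 64)), fallback=32):
    """Map texture to maxBtSize; thresholds and sizes are illustrative only."""
    texture = mean_abs_gradient(frame)
    for limit, size in thresholds:
        if texture < limit:
            return size
    return fallback


def max_bt_sizes(frames, level="sequence"):
    """Sequence level: the initial frame decides for all frames.
    Picture level: each frame decides for itself."""
    if level == "sequence":
        return [choose_max_bt_size(frames[0])] * len(frames)
    return [choose_max_bt_size(frame) for frame in frames]
```

In this sketch a smooth frame keeps a large maxBtSize, while a highly textured frame gets a smaller one, so that rate-distortion evaluation of large blocks is skipped where small blocks are likely to win.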
S502:根据最大单元尺寸信息对当前块进行预处理,确定当前块的划分模式。S502: Preprocess the current block according to the maximum unit size information to determine the division mode of the current block.
需要说明的是,在确定出最大单元尺寸信息之后,可以根据最大单元尺寸信息对当前块进行预处理,比如计算不同划分模式下的率失真代价值,然后从中选取最优率失真代价值(或称为“最小率失真代价值”)来确定当前块的划分模式。It should be noted that after the maximum unit size information is determined, the current block can be preprocessed according to the maximum unit size information, such as calculating the rate-distortion cost value under different division modes, and then selecting the optimal rate-distortion cost value (or called "minimum rate-distortion cost") to determine the partition mode of the current block.
也就是说,本申请实施例可以是利用计算率失真代价的方式来确定划分模式。在一些实施例中,根据最大单元尺寸信息对当前块进行预处理,确定当前块的划分模式,可以包括:That is, in this embodiment of the present application, the division mode may be determined by calculating the rate-distortion cost. In some embodiments, the current block is preprocessed according to the maximum unit size information, and the division mode of the current block is determined, which may include:
利用最大单元尺寸信息对当前块进行划分,得到至少一个第一节点子块,并计算第一率失真代价值;Use the maximum unit size information to divide the current block to obtain at least one first node sub-block, and calculate the first rate-distortion cost value;
利用预设划分模式对第一节点子块进行划分,得到至少一个第二节点子块,并计算第二率失真代价值;Divide the first node sub-block by using the preset division mode, obtain at least one second node sub-block, and calculate the second rate-distortion cost value;
根据第一率失真代价值和第二率失真代价值的比较结果,确定当前块的划分模式。The division mode of the current block is determined according to the comparison result of the first rate-distortion cost value and the second rate-distortion cost value.
进一步地,在一些实施例中,根据第一率失真代价值和第二率失真代价值的比较结果,确定当前块的划分模式,可以包括:Further, in some embodiments, determining the division mode of the current block according to the comparison result of the first rate-distortion cost value and the second rate-distortion cost value may include:
将第一率失真代价值和第二率失真代价值进行比较;comparing the first rate-distortion cost value and the second rate-distortion cost value;
在第二率失真代价值小于第一率失真代价值的情况下,利用预设划分模式对第二节点子块进行划分,得到至少一个下一级第二节点子块,并计算第三率失真代价值;In the case where the second rate-distortion cost value is less than the first rate-distortion cost value, use a preset division mode to divide the second node sub-block to obtain at least one next-level second node sub-block, and calculate the third rate-distortion cost value;
根据第二率失真代价值和第三率失真代价值,确定当前块的划分模式。The division mode of the current block is determined according to the second rate-distortion cost value and the third rate-distortion cost value.
It should be noted that when the second rate-distortion cost value is less than the first rate-distortion cost value, the node sub-blocks need to be divided further by using the preset division mode; this is a recursive process that continues until the minimum rate-distortion cost value is determined. In some embodiments, determining the division mode of the current block according to the second rate-distortion cost value and the third rate-distortion cost value may include:
When the third rate-distortion cost value is less than the second rate-distortion cost value, updating the second rate-distortion cost value with the third rate-distortion cost value, and returning to the step of dividing the second node sub-block by using the preset division mode to obtain at least one next-level second node sub-block and calculating the third rate-distortion cost value, until the minimum rate-distortion cost value is determined;
Determining the division mode of the current block according to the division mode corresponding to the minimum rate-distortion cost value.
It should be noted that the first node sub-block may be a node sub-block obtained by the first division of the current block, and the second node sub-block may be a node sub-block obtained by further division under the preset division mode. In other words, the node sub-blocks obtained at every level of division from the second division onward may be collectively referred to as second node sub-blocks.
Exemplarily, for the i-th level, the second rate-distortion cost value represents the rate-distortion cost of not dividing the i-th level any further. Dividing the node sub-blocks of the current level by using the preset division mode yields the node sub-blocks of the (i+1)-th level, at which point the third rate-distortion cost value can be calculated; here the third rate-distortion cost value represents the rate-distortion cost of dividing the i-th level further. The second rate-distortion cost value is then compared with the third rate-distortion cost value. If the third rate-distortion cost value is less than the second rate-distortion cost value, then for the (i+1)-th level the third rate-distortion cost value is treated as the second rate-distortion cost value, that is, the cost of not dividing the (i+1)-th level any further. Dividing the node sub-blocks of the current level again by using the preset division mode yields the node sub-blocks of the (i+2)-th level and a new third rate-distortion cost value, which now represents the cost of dividing the (i+1)-th level further, and the comparison between the second and third rate-distortion cost values is performed again. In this way, whenever the third rate-distortion cost value is less than the second rate-distortion cost value, the node sub-blocks of the current level are divided into next-level node sub-blocks and the comparison continues recursively until the minimum rate-distortion cost value is determined; the division mode corresponding to the minimum rate-distortion cost value is then determined as the division mode of the current block.
It should also be noted that, in the embodiments of the present application, the preset division mode may include a quad-tree division mode and/or a multi-type tree division mode, where the multi-type tree division mode may include at least one of the following: a vertical binary tree division mode, a horizontal binary tree division mode, a vertical ternary tree division mode, and a horizontal ternary tree division mode.
Here, the vertical and horizontal binary tree division modes may be collectively referred to as the binary tree division mode, and the vertical and horizontal ternary tree division modes may be collectively referred to as the ternary tree division mode. Accordingly, for "dividing the first node sub-block by using the preset division mode to obtain at least one second node sub-block": if the first node sub-block is divided by using the quad-tree division mode, four second node sub-blocks are obtained; if it is divided by using the binary tree division mode, two second node sub-blocks are obtained; and if it is divided by using the ternary tree division mode, three second node sub-blocks are obtained.
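The sub-block counts above can be illustrated with a minimal sketch. The function name `split_block`, the mode labels, and the `(x, y, width, height)` tuple layout are hypothetical; the 1/4, 1/2, 1/4 proportions for the ternary split follow the usual VVC convention and are an assumption here, since the text only states that three sub-blocks result.

```python
def split_block(x, y, w, h, mode):
    """Return the (x, y, width, height) sub-blocks produced by one split mode."""
    if mode == "QT":  # quad-tree: four equal quadrants
        hw, hh = w // 2, h // 2
        return [(x, y, hw, hh), (x + hw, y, hw, hh),
                (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]
    if mode == "BT_VER":  # vertical binary tree: left and right halves
        return [(x, y, w // 2, h), (x + w // 2, y, w // 2, h)]
    if mode == "BT_HOR":  # horizontal binary tree: top and bottom halves
        return [(x, y, w, h // 2), (x, y + h // 2, w, h // 2)]
    if mode == "TT_VER":  # vertical ternary tree: assumed 1/4, 1/2, 1/4 widths
        q = w // 4
        return [(x, y, q, h), (x + q, y, 2 * q, h), (x + 3 * q, y, q, h)]
    if mode == "TT_HOR":  # horizontal ternary tree: assumed 1/4, 1/2, 1/4 heights
        q = h // 4
        return [(x, y, w, q), (x, y + q, w, 2 * q), (x, y + 3 * q, w, q)]
    raise ValueError(f"unknown split mode: {mode}")


for mode in ("QT", "BT_VER", "BT_HOR", "TT_VER", "TT_HOR"):
    print(mode, len(split_block(0, 0, 16, 16, mode)))
```

Whatever the mode, the sub-blocks tile the parent block exactly, so their areas sum to the parent area.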
Here, dividing the first node sub-block by using the preset division mode may mean dividing it with each of the quad-tree, vertical binary tree, horizontal binary tree, vertical ternary tree, and horizontal ternary tree division modes in turn and calculating one rate-distortion cost value for each, and then selecting the smallest of the calculated values as the third rate-distortion cost value. When the third rate-distortion cost value is less than the second rate-distortion cost value, the resulting second node sub-blocks are divided further, and the recursion continues until the minimum rate-distortion cost value is determined. Finally, the division mode of the current block is determined according to the division mode corresponding to the minimum rate-distortion cost value.
In addition, in some embodiments, the method may further include: when the second rate-distortion cost value is greater than or equal to the first rate-distortion cost value, directly determining the mode in which the current block is divided according to the maximum unit size information as the division mode of the current block.
That is to say, if the second rate-distortion cost value is greater than or equal to the first rate-distortion cost value, the rate-distortion cost is lowest when the node sub-blocks are not divided to the next level. The mode in which the current block is divided according to the maximum unit size information can then be used directly as the division mode of the current block, and the resulting node sub-blocks do not need to be divided any further.
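The recursive comparison between "do not divide" and "divide further" can be sketched as follows. This is a simplified illustration, not the encoder's actual search: `best_partition`, `quad_split`, and `toy_cost` are hypothetical names, only the quad-tree mode is wired in, and the cost function is a toy stand-in for a real rate-distortion cost. The sketch also evaluates every child recursively before comparing, which is one common way to realize the recursion described above.

```python
def quad_split(block):
    """Split (x, y, w, h) into four equal quadrants (quad-tree mode only)."""
    x, y, w, h = block
    hw, hh = w // 2, h // 2
    return [(x, y, hw, hh), (x + hw, y, hw, hh),
            (x, y + hh, hw, hh), (x + hw, y + hh, hw, hh)]


def best_partition(block, rd_cost, splitters, min_size):
    """Return (cost, tree); tree is a leaf block or (mode, [subtrees])."""
    keep_cost = rd_cost(block)  # cost of not dividing this level any further
    best = (keep_cost, block)
    if min(block[2], block[3]) <= min_size:
        return best  # smallest allowed size reached: stop recursing
    for mode, split in splitters.items():
        children = [best_partition(c, rd_cost, splitters, min_size)
                    for c in split(block)]
        split_cost = sum(cost for cost, _ in children)
        if split_cost < best[0]:  # dividing further is cheaper: keep it
            best = (split_cost, (mode, [tree for _, tree in children]))
    return best


def toy_cost(block):
    """Hypothetical cost: expensive only if the block covers pixel (0, 0)."""
    x, y, w, h = block
    return w * h if (x == 0 and y == 0) else 1


cost, tree = best_partition((0, 0, 16, 16), toy_cost, {"QT": quad_split}, 4)
print(cost)  # 22
```

With the toy cost, only the corner containing pixel (0, 0) keeps being divided, while the other quadrants stop at the first level because dividing them would raise the cost.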
In this way, after the maximum unit size information has been determined from the texture information of the video image, the current block can be preprocessed according to the maximum unit size information and the division mode of the current block can be determined, so that the block division operation can be performed on the current block.
S503: Determine the block division parameter of the current block according to the division mode.
S504: Encode the current block according to the block division parameter.
It should be noted that the division mode is the concrete way in which the block is divided, whereas the block division parameter may be identification information that indicates the block division, such as split_cu_flag[x0][y0]. After the block division parameter has been determined, the current block can be encoded according to it.
In one possible implementation, encoding the current block according to the block division parameter may include:
Encoding the block division parameter and writing the encoded bits into the bitstream.
It should be noted that, after the block division parameter has been determined, the encoder needs to encode it and write it into the bitstream, which is then transmitted from the encoder to the decoder so that the decoder can obtain the block division parameter.
In another possible implementation, encoding the current block according to the block division parameter may include:
Dividing the current block into one or more node sub-blocks according to the block division parameter;
Determining the prediction parameters of each node sub-block in turn, following a preset node sub-block processing order;
Determining the predicted value of the node sub-block according to the prediction parameters;
Determining the residual value of the node sub-block according to the original value and the predicted value of the node sub-block;
Encoding the prediction parameters and the residual value of the node sub-block, and writing the encoded bits into the bitstream.
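As a minimal sketch of the residual step in the list above, the residual is the sample-wise difference between the original and predicted values. The helper names `residual_block` and `reconstruct` are hypothetical, the prediction is supplied by the caller, and the transform, quantization, and entropy coding stages are omitted.

```python
def residual_block(original, predicted):
    """Sample-wise residual: original value minus predicted value."""
    return [[o - p for o, p in zip(o_row, p_row)]
            for o_row, p_row in zip(original, predicted)]


def reconstruct(predicted, residual):
    """Inverse of residual_block when no quantization loss is applied."""
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(predicted, residual)]


original = [[10, 12], [11, 13]]
predicted = [[11, 11], [11, 11]]  # e.g. a flat prediction, chosen for illustration
residual = residual_block(original, predicted)
print(residual)  # [[-1, 1], [0, 2]]
```

In a lossless setting, adding the residual back to the prediction recovers the original samples; a real encoder quantizes the transformed residual, so its reconstruction is only approximate.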
It should be noted that the preset node sub-block processing order may be a preset scanning order. Here, the preset scanning order may be diagonal, zigzag, horizontal, vertical, 4×4 sub-block scanning, or any other raster scanning order, which is not limited in the embodiments of the present application.
It should also be noted that the residual value can be determined after the block division operation has been performed on the current block by using the division mode. The residual value can then be transformed, quantized, and entropy encoded, and the prediction parameters of the node sub-block can be encoded; the results are written into the bitstream for transmission from the encoder to the decoder.
In addition, the embodiments of the present application may also provide a bitstream that is generated by bit-encoding relevant parameters, where the relevant parameters may include at least one of the following: the block division parameter, the prediction parameters of the node sub-block, the residual value, and identification information of the video image.
This embodiment provides a block division method applied to an encoder. The maximum unit size information of the current block is determined based on the texture information of the video image; the current block is preprocessed according to the maximum unit size information to determine its division mode; the block division parameter of the current block is determined according to the division mode; and the current block is encoded according to the block division parameter. Because the maximum unit size information of the current block is determined from the texture information of the video image, that is, because the technical solution of the present application makes the maximum unit size adaptive to image texture, the prediction and transform computations for large blocks can be skipped directly. The total number of recursions in block division therefore drops exponentially, which significantly reduces encoding complexity and encoding time while keeping the performance gain essentially unchanged, and thus improves coding efficiency.
In another embodiment of the present application, for the determination of the maximum unit size information of the current block, refer to FIG. 6, which shows a schematic flowchart of determining the maximum unit size information provided by an embodiment of the present application. As shown in FIG. 6, the process may include:
S601: Divide the video image into blocks to obtain N blocks of a preset size, where N is an integer greater than zero and the N blocks do not overlap one another.
It should be noted that the preset size is a block size value set in advance. Here, the preset size may be any one of 8, 16, 32, 64, and so on, or any one of 8×8, 16×16, 32×32, 64×64, and so on, which is not specifically limited in the embodiments of the present application. In one specific example, the preset size may be 64×64, in which case the video image is divided into N non-overlapping 64×64 blocks.
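Step S601 can be sketched as below. The function name `tile_blocks` is hypothetical, and the handling of image borders is an assumption: the text does not say what happens when the image dimensions are not multiples of the preset size, so this sketch simply drops partial blocks at the right and bottom edges.

```python
def tile_blocks(width, height, size=64):
    """Top-left corners of the non-overlapping size x size blocks in an image.

    Assumption: partial blocks at the right and bottom edges are ignored.
    """
    return [(x, y)
            for y in range(0, height - size + 1, size)
            for x in range(0, width - size + 1, size)]


corners = tile_blocks(1920, 1080)
print(len(corners))  # 480, i.e. 30 columns by 16 full rows
```

For a 1920×1080 image, N is 480 under this border assumption; an implementation that pads the image instead would obtain 30 × 17 = 510 blocks.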
S602: Perform texture analysis on the N blocks to determine a first number, where the first number represents the number of blocks among the N blocks whose texture value is less than a first threshold.
Here, regarding the determination of the number of blocks with low texture complexity, that is, the first number, in some embodiments performing texture analysis on the N blocks to determine the first number may include:
Calculating the texture values of the N blocks;
Comparing the texture values of the N blocks with the first threshold in turn;
Counting, according to the comparison results, the blocks whose texture value is less than the first threshold to obtain the first number.
In one specific implementation, the method may further include:
Setting the initial value of a statistical value to zero;
Comparing the texture value of the i-th block with the first threshold;
If the texture value of the i-th block is less than the first threshold, incrementing the statistical value by 1, and setting i=i+1;
When i is less than N, continuing to perform the step of comparing the texture value of the i-th block with the first threshold;
When i is equal to N, determining the resulting statistical value as the first number, where i is an integer greater than or equal to zero.
It should be noted that the first number represents the number of blocks with low texture complexity among the N blocks. Texture complexity is judged against the first threshold: if the texture value is greater than or equal to the first threshold, the texture of the block is relatively complex; if the texture value is less than the first threshold, the texture of the block is relatively simple (that is, its texture complexity is low).
It should also be noted that the first threshold may be denoted T. Typically, T is equal to 4×2^(bitdepth-8), although this is not specifically limited.
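The threshold formula and the counting step can be sketched together. The function names `first_threshold` and `count_low_texture` are hypothetical; the formula T = 4×2^(bitdepth-8) is the typical value given in the text.

```python
def first_threshold(bit_depth):
    """Typical first threshold T = 4 * 2^(bitdepth - 8)."""
    return 4 * 2 ** (bit_depth - 8)


def count_low_texture(texture_values, threshold):
    """Count the blocks whose texture value is strictly below the threshold."""
    count = 0  # the statistical value starts at zero
    for value in texture_values:
        if value < threshold:
            count += 1
    return count


for depth in (8, 10, 12, 16):
    print(depth, first_threshold(depth))  # 4, 16, 64, 1024
```

The threshold scales with the sample range, so a 10-bit sequence uses T = 16 where an 8-bit sequence would use T = 4.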
S603: Determine the maximum unit size information of the current block according to the result of comparing the first number with a second threshold.
It should be noted that, after the first number has been determined, a second threshold may also be set for determining the maximum unit size information of the current block. In some embodiments, determining the maximum unit size information of the current block according to the result of comparing the first number with the second threshold may include:
Determining the ratio of the first number to N, and comparing the ratio with the second threshold;
When the ratio is less than the second threshold, determining the maximum unit size information of the current block to be a first size value.
Further, when the ratio is greater than or equal to the second threshold, the method may further include:
Comparing the ratio with a third threshold;
If the ratio is less than the third threshold, determining the maximum unit size information of the current block to be a second size value.
Further, the method may also include: if the ratio is greater than or equal to the third threshold, determining the maximum unit size information of the current block to be a default value.
It should be noted that the first threshold is different from the second threshold and the third threshold, and the second threshold is less than the third threshold. Here, the second threshold may be denoted c1 and the third threshold may be denoted c2. In one specific example, c1 is equal to 0.15 and c2 is equal to 0.3, although this is not specifically limited in the embodiments of the present application.
In addition, the first size value is different from the second size value. In one specific example, the first size value may be 8 and the second size value may be 16, although this is likewise not specifically limited in the embodiments of the present application.
It should also be noted that, if the first number is denoted j, the ratio can be written as j/N; take the first size value to be 8 and the second size value to be 16. Then, if j/N<c1, the maximum unit size information is 8; if c1≤j/N<c2, the maximum unit size information is 16; and if j/N≥c2, the maximum unit size information is the default value.
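The three-way decision on j/N can be written as a small function. The name `max_unit_size` and the parameter `default_size` are hypothetical; the text does not state the default value, so the caller must supply it. The values c1 = 0.15, c2 = 0.3, 8, and 16 are the example values from the text.

```python
def max_unit_size(j, n, default_size, c1=0.15, c2=0.3):
    """Map the share of low-texture blocks j/n to a maximum unit size."""
    if n <= 0:
        raise ValueError("n must be a positive block count")
    ratio = j / n
    if ratio < c1:   # few flat blocks: mostly complex texture
        return 8
    if ratio < c2:   # moderate share of flat blocks
        return 16
    return default_size  # large flat areas: keep the default size


print(max_unit_size(10, 100, default_size=32))  # 8
print(max_unit_size(20, 100, default_size=32))  # 16
print(max_unit_size(50, 100, default_size=32))  # 32
```

The comparisons are strict, so a ratio exactly equal to c1 falls into the 16 branch and a ratio exactly equal to c2 falls into the default branch, matching the intervals c1≤j/N<c2 and j/N≥c2.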
In the embodiments of the present application, the texture value may be calculated from the variance, or in other ways, such as by summing the absolute values of the horizontal and vertical gradients.
In one possible implementation, calculating the texture values of the N blocks may include:
Calculating the variance of the k-th block to obtain the texture value of the k-th block.
In another possible implementation, calculating the texture values of the N blocks may include:
Determining the absolute value of the horizontal gradient and the absolute value of the vertical gradient of the k-th block;
Summing the absolute value of the horizontal gradient and the absolute value of the vertical gradient to obtain the texture value of the k-th block.
Here, since the embodiments of the present application involve N blocks in total, k is an integer greater than or equal to zero and less than N. For the texture analysis, therefore, either the calculated variance may be used as the texture value, or the sum of the absolute values of the horizontal and vertical gradients may be used as the texture value; no limitation is imposed in this respect.
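Both texture measures can be sketched in a few lines. The function names are hypothetical, and several details are assumptions because the text does not fix them: the variance is the population variance, the gradients are plain differences between adjacent samples, and the gradient sum is not normalized by block size.

```python
def variance_texture(block):
    """Population variance of all samples in a 2-D block (list of rows)."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def gradient_texture(block):
    """Sum of absolute horizontal and vertical differences between neighbours."""
    horizontal = sum(abs(row[c + 1] - row[c])
                     for row in block for c in range(len(row) - 1))
    vertical = sum(abs(block[r + 1][c] - block[r][c])
                   for r in range(len(block) - 1) for c in range(len(block[0])))
    return horizontal + vertical


flat = [[5, 5], [5, 5]]
striped = [[0, 2], [0, 2]]
print(variance_texture(flat), gradient_texture(flat))        # 0.0 0
print(variance_texture(striped), gradient_texture(striped))  # 1.0 4
```

A flat block scores zero under both measures, so it is always counted as low texture; the two measures use different scales, so a threshold tuned for one would need to be re-tuned for the other.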
In addition, the maximum unit size information may be at least one of the following levels: sequence level and picture level.
Exemplarily, to determine the maximum unit size information, a video sequence may be input and its initial frame used as the video image, from which the maximum unit size information corresponding to the initial frame is determined. In one specific example, if the maximum unit size information is at the sequence level, the entire video sequence uses this maximum unit size information. In another specific example, if the maximum unit size information is at the picture level, then each frame of the video sequence is itself used as the video image, and the maximum unit size information is determined separately for each frame. Note also that all blocks within the same frame share the same maximum unit size information.
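The difference between the two levels can be sketched as follows. The function `sizes_for_sequence` and its `decide` callback are hypothetical; `decide` stands for whatever per-image procedure returns the maximum unit size for one frame.

```python
def sizes_for_sequence(frames, decide, picture_level=False):
    """Return one maximum unit size per frame.

    Sequence level: decide once on the initial frame and reuse the result.
    Picture level: decide separately for every frame.
    """
    if not frames:
        return []
    if picture_level:
        return [decide(frame) for frame in frames]
    first = decide(frames[0])
    return [first] * len(frames)


# Toy stand-in: each "frame" is already labelled with the size it would get.
frames = [8, 16, 16, 8]
print(sizes_for_sequence(frames, decide=lambda f: f))                      # [8, 8, 8, 8]
print(sizes_for_sequence(frames, decide=lambda f: f, picture_level=True))  # [8, 16, 16, 8]
```

The sequence-level variant runs the texture analysis only once, while the picture-level variant can follow changes in content at the cost of analysing every frame.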
In one specific example, taking the input of a video sequence as an example, refer to FIG. 7, which shows a detailed schematic flowchart of determining the maximum unit size information provided by an embodiment of the present application. As shown in FIG. 7, the detailed process may include:
S701: Input a video sequence, and set j=0 and i=0.
S702: Divide the initial frame into N non-overlapping 64×64 blocks.
S703: Calculate the variance value var_i of the i-th block.
S704: Determine whether var_i<T.
S705: j=j+1.
S706: i=i+1.
S707: Determine whether i=N.
S708: Determine whether j/N<c1.
S709: Determine whether j/N<c2.
S710: Set maxBtSize=8 and maxTtSize=8.
S711: Set maxBtSize=16 and maxTtSize=16.
S712: Use the default maxBtSize and the default maxTtSize.
Here, T denotes the first threshold, c1 denotes the second threshold, and c2 denotes the third threshold described in the embodiments of the present application. Exemplarily, T=4×2^(bitdepth-8), c1=0.15, and c2=0.3.
In the embodiments of the present application, the maximum unit size information may be the BT maximum unit size that limits binary tree division (denoted maxBtSize), or the TT maximum unit size that limits ternary tree division (denoted maxTtSize). In other words, the maximum unit size information may be expressed as maxBtSize or as maxTtSize; under the same conditions, maxBtSize and maxTtSize take the same value.
It should be noted that the maxBtSize and maxTtSize shown in FIG. 7 may be at the sequence level. That is, for the video sequence, maxBtSize and maxTtSize may be determined from the initial frame alone and then applied to the entire video sequence. Alternatively, maxBtSize and maxTtSize may be changed from the sequence level to the picture level, in which case maxBtSize and maxTtSize are calculated for each frame of the video by the method shown in FIG. 7 and serve as the block size constraint for that frame.
It should also be noted that, in FIG. 7, i is the index that steps through the N blocks as each block's variance is calculated and compared with T, and j accumulates the number of blocks among the N blocks whose variance is less than T. For step S704, if the result is yes, the variance of the i-th block is less than T, so steps S705 and S706 are executed, that is, both j and i are incremented by 1; if the result is no, the variance of the i-th block is greater than or equal to T, so only step S706 is executed, that is, j is left unchanged and only i is incremented by 1. For step S707, if the result is no, not all N blocks have been processed, so the process returns to step S703 and continues with the next block (calculating the variance of the i-th block, then checking whether it is less than T, and so on); if the result is yes, all N blocks have been processed, so step S708 is executed, that is, with j now known, the ratio j/N is determined and compared with c1. Specifically, for step S708, if the result is yes, j/N is less than c1, so step S710 is executed and the maximum unit size information of the current block is set to 8; if the result is no, j/N is greater than or equal to c1, so step S709 is executed and j/N is compared with c2. For step S709, if the result is yes, j/N is less than c2, so step S711 is executed and the maximum unit size information of the current block is set to 16; if the result is no, j/N is greater than or equal to c2, so step S712 is executed and the maximum unit size information of the current block is set to the default value (including the default maxBtSize and the default maxTtSize).
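The flow of FIG. 7 can be condensed into one self-contained sketch, with the step labels S701 to S712 marked in comments. The function name `decide_max_mtt_size` and the parameter `default_size` are hypothetical (the default value is not stated in the text), the frame is assumed to be a 2-D list of luma samples, partial blocks at the image border are assumed to be skipped, and the block size is a parameter so that the example can run on a tiny frame.

```python
def decide_max_mtt_size(frame, bit_depth, default_size, block=64, c1=0.15, c2=0.3):
    """Return (maxBtSize, maxTtSize) for one frame, following FIG. 7."""
    height, width = len(frame), len(frame[0])
    t = 4 * 2 ** (bit_depth - 8)              # first threshold T
    j = 0                                     # S701: low-variance block counter
    n = 0                                     # number of blocks visited
    for y in range(0, height - block + 1, block):        # S702: tile the frame
        for x in range(0, width - block + 1, block):
            samples = [frame[y + dy][x + dx]
                       for dy in range(block) for dx in range(block)]
            mean = sum(samples) / len(samples)
            var = sum((s - mean) ** 2 for s in samples) / len(samples)  # S703
            if var < t:                       # S704
                j += 1                        # S705
            n += 1                            # S706, loop ends when all blocks are done (S707)
    if n == 0:
        raise ValueError("frame is smaller than one block")
    ratio = j / n
    if ratio < c1:                            # S708 -> S710
        size = 8
    elif ratio < c2:                          # S709 -> S711
        size = 16
    else:                                     # S712
        size = default_size
    return size, size                         # maxBtSize and maxTtSize are equal


flat_frame = [[128] * 8 for _ in range(8)]
busy_frame = [[255 * ((x + y) % 2) for x in range(8)] for y in range(8)]
print(decide_max_mtt_size(flat_frame, 8, default_size=32, block=4))  # (32, 32)
print(decide_max_mtt_size(busy_frame, 8, default_size=32, block=4))  # (8, 8)
```

The flat frame keeps the default size because every block has zero variance, whereas the checkerboard frame has no low-variance block at all and is therefore limited to 8.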
In short, the embodiments of the present application provide a fast division technique for high-bit-depth video based on texture analysis. The technique first performs texture analysis on the high-bit-depth video and, on that basis, provides an adaptive maximum-size mechanism for multi-type tree units. When the texture in most regions of the video image is complex, the maximum size of the multi-type tree units is smaller; when the video image contains large flat regions, the maximum size of the multi-type tree units is larger. As shown in FIG. 7, if j/N<c1, then maxBtSize is 8 and maxTtSize is 8; if c1≤j/N<c2, then maxBtSize is 16 and maxTtSize is 16; and if j/N≥c2, then maxBtSize and maxTtSize take their default values.
Thus, in another specific example, an HBD (high-bit-depth) sequence encoder using fast QTMT block division may be implemented as follows. First, the first frame of the video sequence is input and, following the flow shown in FIG. 7, the BT maximum unit size maxBtSize and the TT maximum unit size maxTtSize are determined. Next, the input image is divided into multiple non-overlapping CTU blocks. Then each CTU is processed in raster scan order and divided into several CUs, which mainly involves the following four steps: ① divide the CTU into several non-overlapping CUs of size maxBtSize×maxBtSize (or maxTtSize×maxTtSize), and calculate the rate-distortion cost value RdCost0 of predictive coding at this point; ② divide and predict the CUs according to the QT mode or the MT mode, and calculate the rate-distortion cost value RdCost1; ③ compare RdCost0 with RdCost1 and, if RdCost1 is smaller, continue to process each sub-CU in turn; ④ for each CU, divide and predict according to the QT mode or the MT mode, calculate the rate-distortion cost value of each division mode, compare it with the current best RdCost, and continue recursively until the block division with the smallest rate-distortion cost value is determined, which is the division mode of the current block. Finally, the residual block is calculated according to this division mode; the residual block is transformed, quantized, and entropy encoded; the block division parameters and other related information are encoded; and the bitstream is output for transmission.
It can be understood that, in a VVC implementation, a supplementary description of maxBtSize can first be added to the input variable declarations:
For the BT maximum unit size maxBtSize, when encoding an HBD sequence, if j/N<c1, then maxBtSize=8; otherwise, if j/N<c2, then maxBtSize=16.
Second, a supplementary description of maxTtSize can also be added to the input variable declarations:
For the TT maximum unit size maxTtSize, when encoding an HBD sequence, if j/N<c1, then maxTtSize=8; otherwise, if j/N<c2, then maxTtSize=16.
That is to say, in the QTMT technology, for high-bit-depth video coding, the block division process of the binary tree and ternary tree in MT is adaptively pruned: for video sequences with extremely complex textures, MT block division modes larger than 8×8 are pruned, and for video sequences with relatively complex textures, block division modes larger than 16×16 are pruned. After the block division technology provided by the embodiments of the present application was implemented on the VVC reference software VTM11.0, it was tested under the All Intra condition on the HBD test sequences required by JVET. The average BD-rate changes on the Y, Cb, and Cr components were 0.20%, 0.29%, and 0.28%, respectively, while the encoding time was reduced by 56% on average. These data show that the technology can save more than half of the encoding time at an almost negligible loss in coding performance.
This embodiment provides a block division method applied to an encoder. The detailed description of the foregoing embodiments shows that the technical solution of the present application can significantly reduce encoding complexity while leaving the performance gain almost unchanged. Compared with the existing QTMT technology in the related art, the technical solution of the present application directly skips the prediction and transform computations for large blocks. Because a texture-adaptive mechanism is designed for the maximum block size, the loss in coding performance is only 0.20%, which is almost negligible; and because block sizes such as 128, 64, and 32 are skipped directly, the total number of recursions in block division decreases exponentially, which ultimately reduces the encoding time by more than 50%. In other words, the technical solution of the present application can significantly reduce encoding complexity while maintaining coding performance substantially equivalent to that of the prior art.
In yet another embodiment of the present application, refer to FIG. 8, which shows a schematic flowchart of another block division method provided by an embodiment of the present application. As shown in FIG. 8, the method may include:
S801: Parse the code stream to determine the block division parameters of the current block.
It should be noted that the block division method of this embodiment of the present application is applied to a decoder. Here, a video image can be divided into multiple image blocks, each image block to be decoded can be called a decoding block, and the current block specifically refers to the decoding block currently to be decoded. After decoding is completed, the video awaits playback.
It should also be noted that the embodiments of the present application mainly provide a fast block division technology for high-bit-depth video based on texture analysis, that is, one applied to high-bit-depth video. Here, whether a video image is a high-bit-depth video can be indicated by identification information of the video image. Specifically, in some embodiments, the method may further include:
Parsing the code stream to obtain the identification information of the video image;
If the value of the identification information of the video image is a first value, determining that the identification information indicates that the video image is a high-bit-depth video; or,
If the value of the identification information of the video image is a second value, determining that the identification information indicates that the video image is a non-high-bit-depth video.
It should be noted that the first value and the second value are different, and each may be in the form of a parameter or a number. Generally, the identification information of the video image is a parameter written in the profile, but it may also be a flag, which is not limited here.
It should also be noted that the embodiments of the present application provide a decoding method, specifically a block division method; more specifically, the embodiments of the present application design an image-texture-based adaptive maximum unit size mechanism for high-bit-depth video. Thus, when the encoder determines that the video image is a high-bit-depth video, it can write the identification information of the video image into the code stream, so that the decoder can directly determine whether the video image is a high-bit-depth video by parsing the code stream.
In addition, taking the case where the identification information of the video image is a flag as an example, in one specific example the first value may be set to 1 and the second value to 0; in another specific example, the first value may be set to true and the second value to false; in yet another specific example, the first value may even be set to 0 and the second value to 1; or the first value may be set to false and the second value to true. The embodiments of the present application place no limitation on the first value and the second value.
Thus, assuming that the first value is 1 and the second value is 0, after the decoder parses the code stream, if the value of the identification information of the video image is 1, the video image can be determined to be a high-bit-depth video; that is, the encoder used the block division method described in the embodiments of the present application, which saves encoding time and significantly reduces encoding complexity. Otherwise, if the value is 0, the video image can be determined to be a non-high-bit-depth video; that is, the encoder did not use the block division method described in the embodiments of the present application and instead, for example, performed block division according to a method in the related art.
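The flag interpretation described above can be sketched in a few lines of Python. The function name `is_high_bit_depth` and its parameters are hypothetical, and the sketch assumes the flag has already been parsed from the code stream; it only shows that the mapping from flag value to meaning is configurable, as long as the first and second values differ.

```python
def is_high_bit_depth(flag_value, first_value=1, second_value=0) -> bool:
    """Interpret already-parsed identification information.

    first_value marks high-bit-depth video and second_value marks
    non-high-bit-depth video; the defaults (1 and 0) follow the example
    in the text, and any other distinct pair may be substituted.
    """
    if first_value == second_value:
        raise ValueError("first and second values must differ")
    if flag_value == first_value:
        return True
    if flag_value == second_value:
        return False
    raise ValueError(f"unexpected identification value: {flag_value!r}")
```

Rejecting values outside the agreed pair is a design choice of this sketch rather than something the text prescribes.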
Further, in some embodiments, after the decoder obtains the block division parameters by decoding, the method may further include:
Determining the division mode of the current block based on the block division parameters;
Determining the division tree of the current block according to the division mode, where the division tree contains one or more node sub-blocks obtained by dividing the current block.
That is to say, once the block division parameters are determined, the division mode of the current block can be determined, and then the division tree of the current block, so that each node sub-block of the division tree is processed in turn according to the preset node sub-block processing order.
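As an illustration of a division tree and of visiting its node sub-blocks in a fixed order, the following Python sketch uses a hypothetical `Node` class and a depth-first traversal in stored child order. The actual tree representation and the preset processing order are not specified by the text, so both are assumptions here.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """One node sub-block of a division tree (hypothetical layout)."""
    x: int
    y: int
    width: int
    height: int
    children: List["Node"] = field(default_factory=list)


def leaves_in_order(node: Node) -> Iterator[Node]:
    """Yield undivided sub-blocks depth-first, in stored child order."""
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from leaves_in_order(child)


# Example tree: a 16x16 block split into four 8x8 quadrants, with the
# first quadrant split again into two 4x8 halves by a vertical binary split.
root = Node(0, 0, 16, 16, [
    Node(0, 0, 8, 8, [Node(0, 0, 4, 8), Node(4, 0, 4, 8)]),
    Node(8, 0, 8, 8),
    Node(0, 8, 8, 8),
    Node(8, 8, 8, 8),
])
order = [(n.x, n.y, n.width, n.height) for n in leaves_in_order(root)]
```

Here `order` lists the two 4×8 halves first, followed by the three remaining 8×8 quadrants.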
It should also be noted that if the identification information of the video image indicates that the video image is a high-bit-depth video, the division mode of the current block determined from the block division parameters is associated with the texture information of the video image. That is, in the encoder, the division mode is determined by determining the maximum unit size information of the current block according to the texture information of the video image and then preprocessing the current block according to the maximum unit size information. In other words, the technical solution of the present application designs a texture-adaptive mechanism for the maximum unit size, which allows the prediction and transform computations for large blocks to be skipped directly, thereby reducing encoding time and significantly reducing encoding complexity while keeping the performance gain basically unchanged.
S802: Parse the code stream based on the block division parameters to determine the predicted value of the current block.
It should be noted that, after the block division parameters are obtained by decoding, in some embodiments, parsing the code stream based on the block division parameters to determine the predicted value of the current block may include:
Parsing the code stream of each node sub-block of the division tree in turn according to the preset node sub-block processing order, and determining the prediction mode of each node sub-block;
Determining the predicted value of each node sub-block according to the prediction mode.
Here, the preset node sub-block processing order may be a preset scan order. The preset scan order may be diagonal, zigzag, horizontal, vertical, 4×4 sub-block scan, or any other raster scan order, which is not limited in the embodiments of the present application.
It should also be noted that the embodiments of the present application can parse the code stream of each node sub-block of the division tree in turn according to the preset scan order, obtain the prediction mode of each node sub-block, and then determine the predicted value of each node sub-block.
S803: Parse the code stream based on the block division parameters to determine the residual value of the current block.
It should be noted that, after the block division parameters are obtained by decoding, in some embodiments, parsing the code stream based on the block division parameters to determine the residual value of the current block may include:
Parsing the code stream of each node sub-block of the division tree in turn according to the preset node sub-block processing order, and determining the residual value of each node sub-block.
Here, the preset node sub-block processing order may be a preset scan order. That is, the embodiments of the present application can parse the code stream of each node sub-block of the division tree in turn according to the preset scan order and then determine the residual value of each node sub-block.
S804: Determine the reconstructed value of the current block based on the predicted value and the residual value.
In a specific example, determining the reconstructed value of the current block based on the predicted value and the residual value may include: adding the predicted value and the residual value to determine the reconstructed value of the current block.
It should be noted that, after the block division parameters are obtained by decoding, the predicted value of the current block and the residual value of the current block can each be obtained by parsing the code stream; adding the predicted value and the residual value then yields the reconstructed value of the current block.
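The addition in S804 can be shown with a small worked example in Python. The function name `reconstruct` and the nested-list block layout are assumptions made for illustration. Practical decoders typically also clip the result to the valid sample range; that step is omitted because the text describes only the addition.

```python
from typing import List

Samples = List[List[int]]  # rows of sample values; hypothetical layout


def reconstruct(pred: Samples, resid: Samples) -> Samples:
    """Reconstructed value = predicted value + residual value, per sample."""
    if len(pred) != len(resid) or any(
            len(p_row) != len(r_row) for p_row, r_row in zip(pred, resid)):
        raise ValueError("prediction and residual must have the same shape")
    return [[p + r for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(pred, resid)]


predicted = [[100, 102], [98, 101]]
residual = [[-2, 1], [0, 3]]
reconstructed = reconstruct(predicted, residual)  # [[98, 103], [98, 104]]
```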
It should also be noted that, in a specific example, an HBD sequence decoder using QTMT fast block division is implemented as follows. First, the input code stream undergoes entropy decoding, inverse quantization, and inverse transform to obtain the residual values. Next, the image is reconstructed from the residual blocks; the reconstruction process mainly involves the following three steps: ① The division tree of the current CTU is determined according to related information such as the block division parameters. ② Each CU of the division tree is processed in raster scan order, and the prediction block is located using information such as motion vectors. ③ The residual value and the predicted value of the current CU are added to obtain the reconstructed CU. Finally, the reconstructed image is sent to the DBF/SAO/ALF filters, and the filtered image is sent to the buffer to await video playback.
This embodiment provides a block division method applied to a decoder. The code stream is parsed to determine the block division parameters of the current block; based on the block division parameters, the code stream is parsed to determine the predicted value of the current block; based on the block division parameters, the code stream is parsed to determine the residual value of the current block; and the reconstructed value of the current block is determined based on the predicted value and the residual value. Because the maximum unit size information of the current block is determined according to the texture information of the video image, that is, because the technical solution of the present application designs a texture-adaptive mechanism for the maximum unit size, the prediction and transform computations for large blocks can be skipped directly and the total number of recursions in block division decreases exponentially. This significantly reduces encoding complexity and encoding time while keeping the performance gain basically unchanged, and thereby improves encoding and decoding efficiency.
In still another embodiment of the present application, based on the same inventive concept as the foregoing embodiments, refer to FIG. 9, which shows a schematic structural diagram of an encoder 90 provided by an embodiment of the present application. As shown in FIG. 9, the encoder 90 may include a first determining unit 901, a block division unit 902, and an encoding unit 903, where:
The first determining unit 901 is configured to determine the maximum unit size information of the current block based on the texture information of the video image;
The block division unit 902 is configured to preprocess the current block according to the maximum unit size information to determine the division mode of the current block, and to determine the block division parameters of the current block according to the division mode;
The encoding unit 903 is configured to encode the current block according to the block division parameters.
In some embodiments, the encoding unit 903 is further configured to encode the block division parameters and write the encoded bits into the code stream.
In some embodiments, the block division unit 902 is further configured to divide the current block into one or more node sub-blocks according to the block division parameters;
The first determining unit 901 is further configured to determine the prediction parameters of each node sub-block in turn according to the preset node sub-block processing order; determine the predicted value of the node sub-block according to the prediction parameters; and determine the residual value of the node sub-block according to the original value and the predicted value of the node sub-block;
The encoding unit 903 is further configured to encode the prediction parameters and the residual value of the node sub-block and write the encoded bits into the code stream.
In some embodiments, the block division unit 902 is further configured to divide the current block using the maximum unit size information to obtain at least one first node sub-block and calculate a first rate-distortion cost value; and to divide the first node sub-block using a preset division mode to obtain at least one second node sub-block and calculate a second rate-distortion cost value;
The first determining unit 901 is further configured to determine the division mode of the current block according to the result of comparing the first rate-distortion cost value with the second rate-distortion cost value.
In some embodiments, the block division unit 902 is further configured to compare the first rate-distortion cost value with the second rate-distortion cost value; and, when the second rate-distortion cost value is less than the first rate-distortion cost value, to divide the second node sub-block using the preset division mode to obtain at least one next-level second node sub-block and calculate a third rate-distortion cost value;
The first determining unit 901 is further configured to determine the division mode of the current block according to the second rate-distortion cost value and the third rate-distortion cost value.
In some embodiments, the block division unit 902 is further configured to, when the third rate-distortion cost value is less than the second rate-distortion cost value, update the second rate-distortion cost value with the third rate-distortion cost value and return to the step of dividing the second node sub-block using the preset division mode to obtain at least one next-level second node sub-block and calculating the third rate-distortion cost value, until the minimum rate-distortion cost value is determined;
The first determining unit 901 is further configured to determine the division mode of the current block according to the division mode corresponding to the minimum rate-distortion cost value.
In some embodiments, the first determining unit 901 is further configured to, when the second rate-distortion cost value is greater than or equal to the first rate-distortion cost value, directly determine the mode in which the current block is divided according to the maximum unit size information as the division mode of the current block.
In some embodiments, the preset division mode includes a quadtree division mode and/or a multi-type tree division mode, and the multi-type tree division mode includes at least one of the following: a vertical binary tree division mode, a horizontal binary tree division mode, a vertical ternary tree division mode, and a horizontal ternary tree division mode.
In some embodiments, the block division unit 902 is further configured to divide the video image into N blocks of a preset size, where N is an integer greater than zero and the N blocks do not overlap one another;
The first determining unit 901 is further configured to perform texture analysis on the N blocks to determine a first quantity, where the first quantity represents the number of blocks among the N blocks whose texture value is less than a first threshold; and to determine the maximum unit size information of the current block according to the result of comparing the first quantity with a second threshold.
In some embodiments, referring to FIG. 9, the encoder 90 may further include a calculation unit 904 configured to calculate the texture values of the N blocks;
The first determining unit 901 is further configured to compare the texture values of the N blocks with the first threshold in turn, and to count, according to the comparison results, the blocks whose texture value is less than the first threshold to obtain the first quantity.
In some embodiments, the calculation unit 904 is specifically configured to calculate the variance of the k-th block to obtain the texture value of the k-th block, where k is an integer greater than or equal to zero and less than N.
In some embodiments, the calculation unit 904 is specifically configured to determine the horizontal gradient absolute value and the vertical gradient absolute value of the k-th block, and to sum the horizontal gradient absolute value and the vertical gradient absolute value to obtain the texture value of the k-th block, where k is an integer greater than or equal to zero and less than N.
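The two texture measures and the counting step described above can be sketched in Python as follows. The function names are hypothetical, blocks are assumed to be nested lists of sample values, and the gradient is assumed to be the difference between adjacent samples, since the text does not fix a particular gradient operator.

```python
from typing import Callable, List, Sequence

Samples = Sequence[Sequence[int]]  # rows of sample values; hypothetical layout


def variance_texture(block: Samples) -> float:
    """Texture value as the population variance of the block's samples."""
    samples = [s for row in block for s in row]
    mean = sum(samples) / len(samples)
    return sum((s - mean) ** 2 for s in samples) / len(samples)


def gradient_texture(block: Samples) -> float:
    """Texture value as the sum of absolute horizontal and vertical gradients.

    Gradients are taken between adjacent samples (an assumption).
    """
    horizontal = sum(abs(row[c + 1] - row[c])
                     for row in block for c in range(len(row) - 1))
    vertical = sum(abs(block[r + 1][c] - block[r][c])
                   for r in range(len(block) - 1)
                   for c in range(len(block[0])))
    return float(horizontal + vertical)


def count_low_texture(blocks: List[Samples], first_threshold: float,
                      texture_fn: Callable[[Samples], float] = variance_texture
                      ) -> int:
    """First quantity: number of blocks whose texture value is below the threshold."""
    return sum(1 for b in blocks if texture_fn(b) < first_threshold)


flat = [[5, 5], [5, 5]]      # uniform block
edge = [[0, 10], [0, 10]]    # block containing a vertical edge
```

For the uniform block both measures give 0; for the edge block the variance is 25 and the gradient sum is 20, so with a first threshold of 1 only the uniform block is counted.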
In some embodiments, the first determining unit 901 is further configured to determine the ratio of the first quantity to N and compare the ratio with the second threshold; and, when the ratio is less than the second threshold, to determine that the maximum unit size information of the current block is a first size value.
In some embodiments, the first determining unit 901 is further configured to compare the ratio with a third threshold when the ratio is greater than or equal to the second threshold; and, if the ratio is less than the third threshold, to determine that the maximum unit size information of the current block is a second size value, where the second size value is different from the first size value, the first threshold is different from the second threshold and the third threshold, and the second threshold is less than the third threshold.
In some embodiments, the first determining unit 901 is further configured to determine the identification information of the video image; and, when the identification information of the video image indicates that the video image is a high-bit-depth video, to perform the step of determining the maximum unit size information of the current block based on the texture information of the video image.
In some embodiments, the first determining unit 901 is further configured to determine that the value of the identification information of the video image is the first value if the identification information indicates that the video image is a high-bit-depth video; or to determine that the value of the identification information of the video image is the second value if the identification information indicates that the video image is a non-high-bit-depth video.
In some embodiments, the encoding unit 903 is further configured to encode the identification information of the video image and write the encoded bits into the code stream.
In some embodiments, the level of the maximum unit size information is at least one of the following: sequence level, picture level.
It can be understood that, in the embodiments of the present application, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may of course be a module, or it may be non-modular. Moreover, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this embodiment in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the method described in this embodiment. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
Therefore, an embodiment of the present application provides a computer storage medium applied to the encoder 90. The computer storage medium stores a computer program which, when executed by a first processor, implements the method described in any one of the foregoing embodiments.
Based on the composition of the encoder 90 and the computer storage medium described above, refer to FIG. 10, which shows a schematic diagram of a specific hardware structure of the encoder 90 provided by an embodiment of the present application. As shown in FIG. 10, the encoder 90 may include a first communication interface 1001, a first memory 1002, and a first processor 1003, with the components coupled together through a first bus system 1004. It can be understood that the first bus system 1004 is used to implement connection and communication between these components. In addition to a data bus, the first bus system 1004 also includes a power bus, a control bus, and a status signal bus. For clarity of description, however, the various buses are all labeled as the first bus system 1004 in FIG. 10. Here:
The first communication interface 1001 is used for receiving and sending signals in the process of exchanging information with other external network elements;
The first memory 1002 is used for storing a computer program that can run on the first processor 1003;
The first processor 1003 is used for executing, when running the computer program:
Determining the maximum unit size information of the current block based on the texture information of the video image;
Preprocessing the current block according to the maximum unit size information to determine the division mode of the current block;
Determining the block division parameters of the current block according to the division mode;
Encoding the current block according to the block division parameters.
It can be understood that the first memory 1002 in the embodiments of the present application may be a volatile memory or a non-volatile memory, or may include both volatile and non-volatile memories. The non-volatile memory may be a read-only memory (ROM), a programmable read-only memory (Programmable ROM, PROM), an erasable programmable read-only memory (Erasable PROM, EPROM), an electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), which is used as an external cache. By way of example and not limitation, many forms of RAM are available, such as static random access memory (Static RAM, SRAM), dynamic random access memory (Dynamic RAM, DRAM), synchronous dynamic random access memory (Synchronous DRAM, SDRAM), double data rate synchronous dynamic random access memory (Double Data Rate SDRAM, DDR SDRAM), enhanced synchronous dynamic random access memory (Enhanced SDRAM, ESDRAM), synchronous link dynamic random access memory (Synchlink DRAM, SLDRAM), and direct Rambus random access memory (Direct Rambus RAM, DRRAM). The first memory 1002 of the systems and methods described in the present application is intended to include, but is not limited to, these and any other suitable types of memory.
The first processor 1003 may be an integrated circuit chip with signal processing capability. During implementation, each step of the above method may be completed by an integrated logic circuit of hardware in the first processor 1003 or by instructions in the form of software. The first processor 1003 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the methods disclosed in the embodiments of the present application may be executed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software modules may be located in a storage medium that is mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the first memory 1002; the first processor 1003 reads the information in the first memory 1002 and completes the steps of the above method in combination with its hardware.
It can be understood that the embodiments described in the present application may be implemented in hardware, software, firmware, middleware, microcode, or a combination thereof. For a hardware implementation, the processing unit may be implemented in one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), general-purpose processors, controllers, microcontrollers, microprocessors, other electronic units for performing the functions described in the present application, or a combination thereof. For a software implementation, the techniques described in the present application may be implemented through modules (for example, procedures and functions) that perform the functions described in the present application. Software code may be stored in a memory and executed by a processor. The memory may be implemented inside or outside the processor.
Optionally, as another embodiment, the first processor 1003 is further configured to execute, when running the computer program, the method described in any one of the foregoing embodiments.
This embodiment provides an encoder that may include a first determining unit, a block division unit, and an encoding unit. The maximum unit size information of the current block is determined according to the texture information of the video image; that is, the technical solution of the present application makes the maximum unit size adaptive to image texture. The prediction and transform computations for large blocks can therefore be skipped directly, so that the total number of block-division recursions drops exponentially. This significantly reduces encoding complexity and encoding time while keeping the performance gain essentially unchanged, and thus improves encoding and decoding efficiency.
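The texture-adaptive choice of the maximum unit size can be illustrated with a minimal Python sketch. It follows the steps stated in the claims (split the image into N non-overlapping blocks, use each block's variance as its texture value, count the blocks below a first threshold, and compare the ratio of that count to N with a second and a third threshold). The block size, the three threshold values, the candidate sizes 32/64/128, the direction of the mapping (fewer flat blocks giving a smaller maximum size), and the fallback size when the ratio reaches the third threshold are all assumptions made for illustration; the application does not specify them.

```python
from statistics import pvariance


def block_variance(frame, x0, y0, size):
    """Texture value of one block: population variance of its samples."""
    samples = [frame[y][x]
               for y in range(y0, y0 + size)
               for x in range(x0, x0 + size)]
    return pvariance(samples)


def max_unit_size(frame, block_size=4, t1=1.0, t2=0.25, t3=0.75,
                  sizes=(32, 64, 128)):
    """Pick a maximum unit size from the share of low-texture blocks.

    All default values are hypothetical; only the comparison structure
    (first, second and third threshold, with t2 < t3) comes from the text.
    """
    height, width = len(frame), len(frame[0])
    # Non-overlapping blocks; partial blocks at the border are ignored
    # (an assumption of this sketch).
    origins = [(x, y)
               for y in range(0, height - block_size + 1, block_size)
               for x in range(0, width - block_size + 1, block_size)]
    n = len(origins)
    # "First number": blocks whose texture value is below the first threshold.
    flat = sum(1 for x, y in origins
               if block_variance(frame, x, y, block_size) < t1)
    ratio = flat / n
    if ratio < t2:
        return sizes[0]   # first size value
    if ratio < t3:
        return sizes[1]   # second size value
    return sizes[2]       # fallback size (assumed, not stated in the text)
```

Calling `max_unit_size(frame)` on a list of rows of luma samples returns one of the candidate sizes; the sum of absolute horizontal and vertical gradients named in the claims could replace `block_variance` without changing the rest of the sketch.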
In yet another embodiment of the present application, based on the same inventive concept as the foregoing embodiments, refer to FIG. 11, which shows a schematic structural diagram of a decoder 110 provided by an embodiment of the present application. As shown in FIG. 11, the decoder 110 may include a parsing unit 1101 and a second determining unit 1102, wherein:
The parsing unit 1101 is configured to parse the bitstream and determine the block division parameter of the current block;
The parsing unit 1101 is further configured to parse the bitstream based on the block division parameter to determine the predicted value of the current block, and to parse the bitstream based on the block division parameter to determine the residual value of the current block;
The second determining unit 1102 is configured to determine the reconstructed value of the current block based on the predicted value and the residual value.
In some embodiments, the parsing unit 1101 is further configured to parse the bitstream to obtain identification information of the video image;
The second determining unit 1102 is further configured to determine, if the value of the identification information of the video image is a first value, that the identification information indicates that the video image is a high bit depth video; or, if the value of the identification information of the video image is a second value, to determine that the identification information indicates that the video image is a non-high bit depth video.
In some embodiments, the second determining unit 1102 is further configured to determine the division mode of the current block based on the block division parameter, and to determine a division tree of the current block according to the division mode, wherein the division tree contains one or more node sub-blocks obtained by dividing the current block.
In some embodiments, the parsing unit 1101 is further configured to parse the bitstream of each node sub-block of the division tree in turn, according to a preset node sub-block processing order, and to determine the prediction mode of each node sub-block;
The second determining unit 1102 is further configured to determine the predicted value of each node sub-block according to the prediction mode.
In some embodiments, the parsing unit 1101 is further configured to parse the bitstream of each node sub-block of the division tree in turn, according to the preset node sub-block processing order, and to determine the residual value of each node sub-block.
In some embodiments, the division mode is associated with the texture information of the video image.
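The decoder-side flow described above (visit the node sub-blocks of the division tree in the preset order, then combine each sub-block's predicted value and residual value) can be sketched as follows. The text only says that the reconstructed value is determined "based on" the predicted value and the residual value; adding the two and clipping to the sample range is the conventional approach and is an assumption here, as are the function name, the flat-list data layout, and the 10-bit default.

```python
def reconstruct_block(nodes, bit_depth=10):
    """Reconstruct every node sub-block of one division tree.

    nodes: list of (pred, resid) pairs, one per node sub-block, already in
           the preset processing order; pred and resid are equally long
           lists of integer samples (hypothetical layout).
    bit_depth: sample bit depth used for clipping (assumed value).
    """
    max_val = (1 << bit_depth) - 1
    recon = []
    for pred, resid in nodes:
        # Reconstructed sample = predicted sample + residual sample,
        # clipped to the valid range [0, 2^bit_depth - 1].
        recon.append([min(max(p + r, 0), max_val)
                      for p, r in zip(pred, resid)])
    return recon
```

In a real decoder the predicted and residual values would come from parsing the bitstream of each sub-block; here they are passed in directly so that the sketch stays self-contained.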
It can be understood that, in this embodiment, a "unit" may be part of a circuit, part of a processor, part of a program or software, and so on; it may also be a module, or it may be non-modular. In addition, the components in this embodiment may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional module.
If the integrated unit is implemented in the form of a software functional module and is not sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, this embodiment provides a computer storage medium applied to the decoder 110. The computer storage medium stores a computer program, and when the computer program is executed by the second processor, the method described in any one of the foregoing embodiments is implemented.
Based on the composition of the decoder 110 and the computer storage medium described above, refer to FIG. 12, which shows a schematic diagram of a specific hardware structure of the decoder 110 provided by an embodiment of the present application. As shown in FIG. 12, the decoder 110 may include a second communication interface 1201, a second memory 1202, and a second processor 1203, with the components coupled together through a second bus system 1204. It can be understood that the second bus system 1204 is used to implement connection and communication between these components. In addition to a data bus, the second bus system 1204 includes a power bus, a control bus, and a status signal bus. For clarity, however, the various buses are all labeled as the second bus system 1204 in FIG. 12. Specifically:
The second communication interface 1201 is used for receiving and sending signals in the process of exchanging information with other external network elements;
The second memory 1202 is used for storing a computer program that can run on the second processor 1203;
The second processor 1203 is configured to execute, when running the computer program:
Parse the bitstream and determine the block division parameter of the current block;
Parse the bitstream based on the block division parameter and determine the predicted value of the current block;
Parse the bitstream based on the block division parameter and determine the residual value of the current block;
Determine the reconstructed value of the current block based on the predicted value and the residual value.
Optionally, as another embodiment, the second processor 1203 is further configured to execute, when running the computer program, the method described in any one of the foregoing embodiments.
It can be understood that the hardware functions of the second memory 1202 are similar to those of the first memory 1002, and the hardware functions of the second processor 1203 are similar to those of the first processor 1003; details are not repeated here.
This embodiment provides a decoder that may include a parsing unit and a second determining unit. The maximum unit size information of the current block is determined according to the texture information of the video image; that is, the technical solution of the present application makes the maximum unit size adaptive to image texture. The prediction and transform computations for large blocks can therefore be skipped directly, so that the total number of block-division recursions drops exponentially. This significantly reduces encoding complexity and encoding time while keeping the performance gain essentially unchanged, and thus improves encoding and decoding efficiency.
It should be noted that, in the present application, the terms "include", "comprise", and any variants thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or apparatus that includes a series of elements includes not only those elements but also other elements that are not explicitly listed, or elements inherent to such a process, method, article, or apparatus. Without further limitation, an element qualified by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that includes the element.
The serial numbers of the above embodiments of the present application are for description only and do not indicate that one embodiment is better or worse than another.
The methods disclosed in the several method embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments.
The features disclosed in the several product embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new product embodiments.
The features disclosed in the several method or device embodiments provided in the present application may be combined arbitrarily, provided there is no conflict, to obtain new method embodiments or device embodiments.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
In the embodiments of the present application, on the encoder side, the maximum unit size information of the current block is determined based on the texture information of the video image; the current block is preprocessed according to the maximum unit size information to determine the division mode of the current block; the block division parameter of the current block is determined according to the division mode; and the current block is encoded according to the block division parameter. On the decoder side, the bitstream is parsed to determine the block division parameter of the current block; the bitstream is parsed based on the block division parameter to determine the predicted value of the current block; the bitstream is parsed based on the block division parameter to determine the residual value of the current block; and the reconstructed value of the current block is determined based on the predicted value and the residual value. The maximum unit size information of the current block is thus determined according to the texture information of the video image; that is, the technical solution of the present application makes the maximum unit size adaptive to image texture. The prediction and transform computations for large blocks can therefore be skipped directly, so that the total number of block-division recursions drops exponentially. This significantly reduces encoding complexity and encoding time while keeping the performance gain essentially unchanged, and thus improves encoding and decoding efficiency.
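The encoder-side preprocessing that turns the maximum unit size into a division mode is spelled out in the claims as a comparison of successive rate-distortion cost values: the cost at the maximum unit size is compared with the cost after one further split, and splitting continues only while the cost keeps falling. The sketch below is a deliberately simplified reading of that loop. It assumes one cost per split depth, supplied by a hypothetical callback `rd_cost(depth)`, whereas a real encoder would evaluate costs per node sub-block and per candidate mode (quadtree, binary tree, ternary tree).

```python
def choose_split_depth(rd_cost, max_depth):
    """Greedy search for the split depth with the lowest RD cost.

    rd_cost: hypothetical callable; rd_cost(0) is the cost of dividing the
             current block at the maximum unit size (the "first" cost),
             rd_cost(d) the cost after d further splits.
    max_depth: largest number of further splits to try (assumed limit).
    Returns (best_depth, best_cost).
    """
    best_depth, best_cost = 0, rd_cost(0)
    for depth in range(1, max_depth + 1):
        cost = rd_cost(depth)
        if cost >= best_cost:
            # A further split did not lower the cost: stop and keep the
            # current best division.
            break
        # The lower cost replaces the stored one and the search continues.
        best_depth, best_cost = depth, cost
    return best_depth, best_cost
```

Because the search starts at the maximum unit size rather than at the largest size the codec allows, every level above it is never evaluated, which is where the complexity saving described above comes from. Note that this greedy loop stops at the first non-improving depth and does not look past it.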
Claims (29)
- A block division method, applied to an encoder, the method comprising: determining maximum unit size information of a current block based on texture information of a video image; preprocessing the current block according to the maximum unit size information to determine a division mode of the current block; determining a block division parameter of the current block according to the division mode; and encoding the current block according to the block division parameter.
- The method according to claim 1, wherein the encoding the current block according to the block division parameter comprises: encoding the block division parameter, and writing the encoded bits into a bitstream.
- The method according to claim 1 or 2, wherein the encoding the current block according to the block division parameter comprises: dividing the current block into one or more node sub-blocks according to the block division parameter; determining a prediction parameter of each node sub-block in turn, according to a preset node sub-block processing order; determining a predicted value of the node sub-block according to the prediction parameter; determining a residual value of the node sub-block according to an original value of the node sub-block and the predicted value; and encoding the prediction parameter and the residual value of the node sub-block, and writing the encoded bits into the bitstream.
- The method according to claim 1, wherein the preprocessing the current block according to the maximum unit size information to determine the division mode of the current block comprises: dividing the current block by using the maximum unit size information to obtain at least one first node sub-block, and calculating a first rate-distortion cost value; dividing the first node sub-block by using a preset division mode to obtain at least one second node sub-block, and calculating a second rate-distortion cost value; and determining the division mode of the current block according to a comparison result of the first rate-distortion cost value and the second rate-distortion cost value.
- The method according to claim 4, wherein the determining the division mode of the current block according to the comparison result of the first rate-distortion cost value and the second rate-distortion cost value comprises: comparing the first rate-distortion cost value with the second rate-distortion cost value; in a case where the second rate-distortion cost value is less than the first rate-distortion cost value, dividing the second node sub-block by using the preset division mode to obtain at least one next-level second node sub-block, and calculating a third rate-distortion cost value; and determining the division mode of the current block according to the second rate-distortion cost value and the third rate-distortion cost value.
- The method according to claim 5, wherein the determining the division mode of the current block according to the second rate-distortion cost value and the third rate-distortion cost value comprises: in a case where the third rate-distortion cost value is less than the second rate-distortion cost value, updating the second rate-distortion cost value with the third rate-distortion cost value, and returning to the step of dividing the second node sub-block by using the preset division mode to obtain at least one next-level second node sub-block and calculating the third rate-distortion cost value, until a minimum rate-distortion cost value is determined; and determining the division mode of the current block according to the division mode corresponding to the minimum rate-distortion cost value.
- The method according to claim 5, wherein the method further comprises: in a case where the second rate-distortion cost value is greater than or equal to the first rate-distortion value, directly determining the mode of dividing the current block according to the maximum unit size information as the division mode of the current block.
- The method according to claim 4, wherein the preset division mode comprises a quadtree division mode and/or a multi-type tree division mode; and the multi-type tree division mode comprises at least one of the following: a vertical binary tree division mode, a horizontal binary tree division mode, a vertical ternary tree division mode, and a horizontal ternary tree division mode.
- The method according to claim 1, wherein the determining the maximum unit size information of the current block based on the texture information of the video image comprises: performing block division on the video image to obtain N blocks of a preset size, wherein N is an integer greater than zero and the N blocks do not overlap one another; performing texture analysis on the N blocks to determine a first number, wherein the first number represents the number of blocks, among the N blocks, whose texture value is less than a first threshold; and determining the maximum unit size information of the current block according to a comparison result of the first number and a second threshold.
- The method according to claim 9, wherein the performing texture analysis on the N blocks to determine the first number comprises: calculating texture values of the N blocks; comparing the texture values of the N blocks with the first threshold in turn; and counting, according to the comparison results, the blocks whose texture value is less than the first threshold to obtain the first number.
- The method according to claim 10, wherein the calculating the texture values of the N blocks comprises: calculating a variance value of the k-th block to obtain the texture value of the k-th block, wherein k is an integer greater than or equal to zero and less than N.
- The method according to claim 10, wherein the calculating the texture values of the N blocks comprises: determining an absolute value of a horizontal gradient and an absolute value of a vertical gradient of the k-th block; and summing the absolute value of the horizontal gradient and the absolute value of the vertical gradient to obtain the texture value of the k-th block, wherein k is an integer greater than or equal to zero and less than N.
- The method according to claim 9, wherein the determining the maximum unit size information of the current block according to the comparison result of the first number and the second threshold comprises: determining a ratio of the first number to N, and comparing the ratio with the second threshold; and in a case where the ratio is less than the second threshold, determining that the maximum unit size information of the current block is a first size value.
- The method according to claim 13, wherein the method further comprises: in a case where the ratio is greater than or equal to the second threshold, comparing the ratio with a third threshold; and if the ratio is less than the third threshold, determining that the maximum unit size information of the current block is a second size value; wherein the second size value is different from the first size value, the first threshold is different from the second threshold and the third threshold, and the second threshold is less than the third threshold.
- The method according to any one of claims 1 to 14, wherein the method further comprises: determining identification information of the video image; and when the identification information of the video image indicates that the video image is a high bit depth video, performing the step of determining the maximum unit size information of the current block based on the texture information of the video image.
- The method according to claim 15, wherein the determining the identification information of the video image comprises: if the identification information of the video image indicates that the video image is a high bit depth video, determining that the value of the identification information of the video image is a first value; or, if the identification information of the video image indicates that the video image is a non-high bit depth video, determining that the value of the identification information of the video image is a second value.
- The method according to claim 16, wherein the method further comprises: encoding the identification information of the video image, and writing the encoded bits into the bitstream.
- The method according to claim 1, wherein the level of the maximum unit size information is at least one of the following: sequence level and picture level.
- A block division method, applied to a decoder, the method comprising: parsing a bitstream to determine a block division parameter of a current block; parsing the bitstream based on the block division parameter to determine a predicted value of the current block; parsing the bitstream based on the block division parameter to determine a residual value of the current block; and determining a reconstructed value of the current block based on the predicted value and the residual value.
- The method according to claim 19, wherein the method further comprises: parsing the bitstream to obtain identification information of a video image; and if the value of the identification information of the video image is a first value, determining that the identification information of the video image indicates that the video image is a high bit depth video; or, if the value of the identification information of the video image is a second value, determining that the identification information of the video image indicates that the video image is a non-high bit depth video.
- The method according to claim 19 or 20, wherein the method further comprises: determining a division mode of the current block based on the block division parameter; and determining a division tree of the current block according to the division mode, wherein the division tree contains one or more node sub-blocks obtained by dividing the current block.
- The method according to claim 21, wherein the parsing the bitstream based on the block division parameter to determine the predicted value of the current block comprises: parsing the bitstream of each node sub-block of the division tree in turn, according to a preset node sub-block processing order, to determine a prediction mode of each node sub-block; and determining the predicted value of each node sub-block according to the prediction mode.
- The method according to claim 21, wherein the parsing the bitstream based on the block division parameter to determine the residual value of the current block comprises: parsing the bitstream of each node sub-block of the division tree in turn, according to the preset node sub-block processing order, to determine the residual value of each node sub-block.
- The method according to claim 21, wherein the division mode is associated with texture information of the video image.
- 一种编码器,所述编码器包括第一确定单元、块划分单元和编码单元;其中,An encoder comprising a first determination unit, a block division unit and a coding unit; wherein,所述第一确定单元,配置为基于视频图像的纹理信息,确定当前块的最大单元尺寸信息;The first determining unit is configured to determine the maximum unit size information of the current block based on the texture information of the video image;所述块划分单元,配置为根据所述最大单元尺寸信息对所述当前块进行预处理,确定所述当前块的划分模式;以及根据所述划分模式,确定所述当前块的块划分参数;the block division unit, configured to preprocess the current block according to the maximum unit size information to determine a division mode of the current block; and determine a block division parameter of the current block according to the division mode;所述编码单元,配置为根据所述块划分参数,对所述当前块进行编码。The encoding unit is configured to encode the current block according to the block division parameter.
- 一种编码器,所述编码器包括第一存储器和第一处理器;其中,An encoder comprising a first memory and a first processor; wherein,所述第一存储器,用于存储能够在所述第一处理器上运行的计算机程序;the first memory for storing a computer program executable on the first processor;所述第一处理器,用于在运行所述计算机程序时,执行如权利要求1至18任一项所述的方法。The first processor is configured to execute the method according to any one of claims 1 to 18 when running the computer program.
- A decoder, comprising a parsing unit and a second determination unit, wherein: the parsing unit is configured to parse a code stream to determine a block division parameter of a current block; the parsing unit is further configured to parse the code stream based on the block division parameter to determine a prediction value of the current block, and to parse the code stream based on the block division parameter to determine a residual value of the current block; and the second determination unit is configured to determine a reconstruction value of the current block based on the prediction value and the residual value.
- A decoder, comprising a second memory and a second processor, wherein: the second memory is configured to store a computer program executable on the second processor; and the second processor is configured to perform, when running the computer program, the method according to any one of claims 19 to 24.
- A computer storage medium storing a computer program which, when executed, implements the method according to any one of claims 1 to 18 or the method according to any one of claims 19 to 24.
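
The decoder-side claims above describe visiting the node sub-blocks of the partition tree in a preset order and determining a reconstruction value from each sub-block's prediction value and residual value. The following Python sketch illustrates one way this could look. It is not the claimed implementation: the `Node` layout, the depth-first visiting order, the function names, and the "prediction plus residual, clipped to the sample range" rule are all assumptions made for illustration, and the prediction and residual samples are assumed to have been parsed from the code stream already.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Node:
    """One node sub-block of a partition tree (hypothetical layout)."""
    x: int            # left edge, in samples, in picture coordinates
    y: int            # top edge, in samples, in picture coordinates
    width: int
    height: int
    prediction: List[List[int]] = field(default_factory=list)  # leaf only
    residual: List[List[int]] = field(default_factory=list)    # leaf only
    children: List["Node"] = field(default_factory=list)


def leaves_in_order(node: Node) -> Iterator[Node]:
    """Yield leaf sub-blocks depth-first, children in their stored order.

    The stored order stands in for the 'preset processing order'.
    """
    if not node.children:
        yield node
        return
    for child in node.children:
        yield from leaves_in_order(child)


def reconstruct(root: Node, bit_depth: int = 8) -> List[List[int]]:
    """Reconstruct the block covered by `root`, one leaf at a time.

    Assumed rule: reconstruction = prediction + residual, clipped to
    the range [0, 2**bit_depth - 1].
    """
    max_val = (1 << bit_depth) - 1
    out = [[0] * root.width for _ in range(root.height)]
    for leaf in leaves_in_order(root):
        for r in range(leaf.height):
            for c in range(leaf.width):
                value = leaf.prediction[r][c] + leaf.residual[r][c]
                clipped = min(max(value, 0), max_val)
                out[leaf.y - root.y + r][leaf.x - root.x + c] = clipped
    return out


# Usage: a 4x2 block split vertically into two 2x2 leaf sub-blocks.
left = Node(0, 0, 2, 2,
            prediction=[[100, 100], [100, 100]],
            residual=[[5, -5], [0, 200]])
right = Node(2, 0, 2, 2,
             prediction=[[10, 10], [10, 10]],
             residual=[[-20, 1], [2, 3]])
root = Node(0, 0, 4, 2, children=[left, right])
print(reconstruct(root))
```

In this example the sums 300 and -10 fall outside the 8-bit range and are clipped to 255 and 0, so the printed result is `[[105, 95, 0, 11], [100, 255, 12, 13]]`.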
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/091736 WO2022227082A1 (en) | 2021-04-30 | 2021-04-30 | Block division methods, encoders, decoders, and computer storage medium |
CN202180096081.8A CN117063467A (en) | 2021-04-30 | 2021-04-30 | Block dividing method, encoder, decoder, and computer storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2021/091736 WO2022227082A1 (en) | 2021-04-30 | 2021-04-30 | Block division methods, encoders, decoders, and computer storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022227082A1 (en) | 2022-11-03 |
Family
ID=83847566
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2021/091736 WO2022227082A1 (en) | 2021-04-30 | 2021-04-30 | Block division methods, encoders, decoders, and computer storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN117063467A (en) |
WO (1) | WO2022227082A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102934436A (en) * | 2010-04-13 | 2013-02-13 | 三星电子株式会社 | Video-encoding method and video-encoding apparatus using prediction units based on encoding units determined in accordance with a tree structure, and video-decoding method and video-decoding apparatus using prediction units based on encoding units de |
CN110234008A (en) * | 2019-03-11 | 2019-09-13 | 杭州海康威视数字技术股份有限公司 | Coding method, coding/decoding method and device |
CN112135147A (en) * | 2019-06-24 | 2020-12-25 | 杭州海康威视数字技术股份有限公司 | Encoding method, decoding method and device |
2021
- 2021-04-30 CN CN202180096081.8A, published as CN117063467A (en), status: active, Pending
- 2021-04-30 WO PCT/CN2021/091736, published as WO2022227082A1 (en), status: active, Application Filing
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116260973A (en) * | 2023-03-31 | 2023-06-13 | 北京百度网讯科技有限公司 | Time domain filtering method and device, electronic equipment and storage medium |
CN116260973B (en) * | 2023-03-31 | 2024-03-19 | 北京百度网讯科技有限公司 | Time domain filtering method and device, electronic equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN117063467A (en) | 2023-11-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20220116664A1 (en) | Loop filtering method and device | |
US11785208B2 (en) | Prediction method for decoding and apparatus, and computer storage medium | |
WO2022178686A1 (en) | Encoding/decoding method, encoding/decoding device, encoding/decoding system, and computer readable storage medium | |
WO2022266971A1 (en) | Encoding method, decoding method, encoder, decoder and computer storage medium | |
CN114598873B (en) | Decoding method and device for quantization parameter | |
WO2021203924A1 (en) | Encoding method, decoding method, encoder, decoder, and storage medium | |
US20240236354A1 (en) | Coding/decoding method, code stream, coder, decoder, and storage medium | |
WO2022227082A1 (en) | Block division methods, encoders, decoders, and computer storage medium | |
CN113709491B (en) | Image decoding method, decoder, and storage medium | |
WO2022217442A1 (en) | Coefficient encoding/decoding method, encoder, decoder, and computer storage medium | |
WO2022193394A1 (en) | Coefficient coding/decoding method, encoder, decoder, and computer storage medium | |
CN116982262A (en) | State transition for dependent quantization in video coding | |
WO2020215216A1 (en) | Image decoding method, decoder and storage medium | |
CN112970257A (en) | Decoding prediction method, device and computer storage medium | |
WO2023193260A1 (en) | Encoding/decoding method, code stream, encoder, decoder, and storage medium | |
WO2022188239A1 (en) | Coefficient coding/decoding method, encoder, decoder, and computer storage medium | |
WO2024152352A1 (en) | Encoding method, decoding method, code stream, encoder, decoder, and storage medium | |
WO2023272517A1 (en) | Encoding and decoding method, bitstream, encoder, decoder, and computer storage medium | |
US20220046231A1 (en) | Video encoding/decoding method and device | |
US11375183B2 (en) | Methods and systems for combined lossless and lossy coding | |
WO2024007120A1 (en) | Encoding and decoding method, encoder, decoder and storage medium | |
EP4087254A1 (en) | Inter-frame prediction method, encoder, decoder and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 21938555; Country of ref document: EP; Kind code of ref document: A1 |
WWE | WIPO information: entry into national phase | Ref document number: 202180096081.8; Country of ref document: CN |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | EP: PCT application non-entry in European phase | Ref document number: 21938555; Country of ref document: EP; Kind code of ref document: A1 |